CN108170372B - Data processing method and device based on cloud hard disk - Google Patents

Publication number
CN108170372B
Authority
CN
China
Prior art keywords
file
file system
deleted
data
rbd
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711298448.6A
Other languages
Chinese (zh)
Other versions
CN108170372A (en
Inventor
严晓杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Jiwei Technology Co ltd
Original Assignee
Xiamen Jiwei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Jiwei Technology Co ltd filed Critical Xiamen Jiwei Technology Co ltd
Priority to CN201711298448.6A priority Critical patent/CN108170372B/en
Publication of CN108170372A publication Critical patent/CN108170372A/en
Application granted granted Critical
Publication of CN108170372B publication Critical patent/CN108170372B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method and device based on a cloud hard disk. The method includes: determining the type of the file system in the cloud hard disk and determining a scanning strategy corresponding to that type, wherein the scanning strategy is used to determine the storage location of the data to be scanned; scanning the file system according to the corresponding scanning strategy to determine the data blocks of deleted files in the file system; and processing the data blocks of the deleted files according to a received operation instruction. The invention solves the technical problem that the existing related art cannot process deleted data in a cloud hard disk according to the type of the file system.

Description

Data processing method and device based on cloud hard disk
Technical Field
The invention relates to the technical field of computers, in particular to a data processing method and device based on a cloud hard disk.
Background
With the continuous development of computer technology, Linux keeps moving into scalable computing, and particularly scalable storage. The distributed file system CEPH has joined the list of file systems available on Linux; as a distributed file system, it maintains POSIX compatibility while adding replication and fault tolerance.
The existing CEPH can provide a cloud hard disk service through RBD (RADOS Block Device), but it cannot provide secure deletion or data recovery for cloud hard disk data, which can cause serious problems. For example, suppose a user creates a 100 GB cloud hard disk, mounts it on a virtual machine running Windows, writes 100 GB of data to it, and then deletes the data (without deleting the RBD). If the configured number of data copies is 3, the underlying storage still actually occupies 100 GB × 3 = 300 GB. From the perspective of secure deletion, this 300 GB should no longer be occupied, and there is a risk that the data will be recovered by others.
Furthermore, from a data recovery point of view, recovery does not require 3 copies; one copy is enough. Moreover, for both secure deletion and data recovery, the user should have the right to decide what happens to their own file data rather than having the outcome imposed on them. In addition, after the cloud hard disk is deleted (i.e., the RBD is deleted), CEPH directly deletes all data related to the RBD and leaves no possibility of recovery.
Aiming at the problem that the deleted data in the cloud hard disk cannot be effectively processed in the prior art, an effective solution is not provided at present.
Disclosure of Invention
The invention provides a data processing method and device based on a cloud hard disk, which at least solve the technical problem that deleted data in the cloud hard disk cannot be processed according to the type of a file system in the prior art.
In one aspect, the invention provides a data processing method based on a cloud hard disk, which comprises the following steps: determining the type of a file system in a cloud hard disk, and determining a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned; scanning the file system according to the corresponding scanning strategy, and determining the data blocks of deleted files in the file system; and processing the data blocks of the deleted files according to the received operation instructions.
Further, when the file system is a first type of file system, scanning the file system according to the corresponding scanning policy to determine a data block of a deleted file in the file system, including: scanning preset bytes of the file name of the first type of file system in a file directory table, and recording the file name and a data block pointer corresponding to the file name; and determining the data block of the deleted file according to the file name and the data block pointer corresponding to the file name.
Further, when the file system is a second type of file system, scanning the file system according to the corresponding scanning policy to determine a data block of a deleted file in the file system, including: scanning an operation log of the second type file system or scanning a storage area of metadata corresponding to an existing file in the second type file system to obtain a storage area of the metadata corresponding to the deleted file; and determining the data block of the deleted file according to the storage area of the metadata corresponding to the deleted file.
Further, in a case that the operation instruction is a delete instruction, processing the data block of the deleted file according to the received operation instruction includes: determining the fragments corresponding to the data blocks of the deleted files; and recording the data information of the fragments and deleting the fragments.
Further, in a case where the operation instruction is a recovery instruction, processing the data block of the deleted file according to the received operation instruction includes: acquiring a predetermined path for storing recovery data; determining a deleted file to be restored according to the pointer relation and the incomplete metadata of the data block of the deleted file; and restoring the deleted file to be restored to the predetermined path.
Further, after deleting the fragment, the method further includes: and establishing a mapping relation between the fragments and the data blocks in the file system according to the data information of the fragments.
On the other hand, the invention also provides a data processing device based on the cloud hard disk, which comprises: a first determining module, configured to determine the type of a file system in the cloud hard disk and determine a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned; a second determining module, configured to scan the file system according to the corresponding scanning policy, and determine a data block of a deleted file in the file system; and a processing module, configured to process the data blocks of the deleted files according to the received operation instructions.
Further, when the file system is a first type of file system, the second determining module includes: the recording submodule is used for scanning preset bytes of the file name of the first type of file system in a file directory table and recording the file name and a data block pointer corresponding to the file name; and the first determining submodule is used for determining the data block of the deleted file according to the file name and the data block pointer corresponding to the file name.
Further, when the file system is a second type of file system, the second determining module further includes: the scanning submodule is used for scanning the operation log of the second type file system or scanning the storage area of the metadata corresponding to the existing file in the second type file system to obtain the storage area of the metadata corresponding to the deleted file; and the second determining submodule is used for determining the data blocks of the deleted files according to the storage areas of the metadata corresponding to the deleted files.
In another aspect, the present invention further provides a storage medium, where the storage medium includes a stored program, where the program performs the following method steps: determining the type of a file system in a cloud hard disk, and determining a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned; scanning the file system according to the corresponding scanning strategy, and determining the data blocks of deleted files in the file system; and processing the data blocks of the deleted files according to the received operation instructions.
In another aspect, the present invention further provides a processor, where the processor is configured to execute a program, where the program executes the following method steps: determining the type of a file system in a cloud hard disk, and determining a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned; scanning the file system according to the corresponding scanning strategy, and determining the data blocks of deleted files in the file system; and processing the data blocks of the deleted files according to the received operation instructions.
In the invention, the type of a file system in a cloud hard disk is determined, and a scanning strategy corresponding to the type is determined, wherein the scanning strategy is used for determining the storage position of data to be scanned; scanning the file system according to the corresponding scanning strategy, and determining the data blocks of deleted files in the file system; the data blocks of the deleted files are processed according to the received operation instructions, and the purpose of effectively processing the deleted data in the cloud hard disk according to the type of the file system in the cloud hard disk is achieved, so that the technical effect that a user can select safe deletion and data recovery of the cloud hard disk data is achieved, and the technical problem that the deleted data in the cloud hard disk cannot be processed according to the type of the file system in the prior art is solved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
Fig. 1 is a flowchart of the steps of a data processing method based on a cloud hard disk according to an embodiment of the present invention;
Fig. 2 is a flowchart of the steps of an optional data processing method based on a cloud hard disk according to an embodiment of the present invention;
Fig. 3 is a flowchart of the steps of another optional data processing method based on a cloud hard disk according to an embodiment of the present invention;
Fig. 4 is a flowchart of the steps of an optional process for handling a cloud hard disk according to an embodiment of the present invention; and
Fig. 5 is a schematic structural diagram of a data processing apparatus based on a cloud hard disk according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, system, article, or apparatus.
First, in order to facilitate understanding of the embodiments of the present invention, some terms or nouns referred to in the present invention will be explained as follows:
CEPH file system (CEPH FS): a POSIX-compatible distributed file system that stores its data in a CEPH storage cluster and adds replication and fault tolerance while maintaining POSIX compatibility.
RBD (RADOS Block Device): a block storage protocol provided externally on top of RADOS. Its basic principle is that data is organized and managed as volumes in a CEPH storage pool; once a volume is mapped, a user can operate it like a local hard disk, and the RBD library can directly call the relevant RADOS interfaces to perform data I/O and management inside the cluster.
FAT (File Allocation Table) file system: a generic name for the file systems commonly used by Microsoft in the DOS/Windows family of operating systems; FAT12, FAT16, and FAT32 are all FAT file systems.
ext (extended file system) file system: a file system implemented on Linux on top of the virtual file system layer.
NTFS (New Technology File System) file system: a file system with access-restriction features that replaces the older FAT file system.
XFS file system: a high-performance journaling file system that excels at handling large files while providing smooth data transfer; even after a power failure or an operating system crash, it can still ensure the consistency of the file system.
Fragment (shard): a form of database partitioning that divides a large database into smaller, faster, and more easily managed parts, i.e., data shards.
Example 1
According to an embodiment of the present invention, a data processing method based on a cloud hard disk is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
Fig. 1 is a flowchart of steps of a data processing method based on a cloud disk according to an embodiment of the present invention, and as shown in fig. 1, the method includes the following steps:
step S102, determining the type of a file system in a cloud hard disk, and determining a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned;
step S104, scanning the file system according to the corresponding scanning strategy, and determining the deleted file data block in the file system;
and step S106, processing the data blocks of the deleted files according to the received operation instructions.
In the embodiment of the invention, the type of a file system in a cloud hard disk is determined, and a scanning strategy corresponding to the type is determined, wherein the scanning strategy is used for determining the storage position of data to be scanned; scanning the file system according to the corresponding scanning strategy, and determining the data blocks of deleted files in the file system; the data blocks of the deleted files are processed according to the received operation instructions, and the purpose of effectively processing the deleted data in the cloud hard disk according to the type of the file system in the cloud hard disk is achieved, so that the technical effect that a user can select safe deletion and data recovery of the cloud hard disk data is achieved, and the technical problem that the deleted data in the cloud hard disk cannot be processed according to the type of the file system in the prior art is solved.
It should be noted that although different file systems store data in different ways, they are broadly similar and differ only in detail. Each can be divided into two main parts: a metadata part, which contains the basic information of a file and a pointer to the corresponding data blocks, and a data block part.
In addition, to make successful data recovery as likely as possible, the cloud hard disk should be powered off as early as possible, since the chance of recovery is highest at that point. If the cloud hard disk is not powered off in time and new data is read or written, the data blocks of the data to be recovered may be overwritten, which reduces the chance of recovery.
In the embodiment of the application, when the file system of the cloud hard disk to be processed is detected, the type of the file system on the cloud hard disk may be determined, and the metadata part and the data block part are scanned according to that type; the operation instruction may be a secure deletion instruction or a data recovery instruction input by a user.
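As an illustration of step S102, the following is a minimal Python sketch of detecting the file system type and choosing a scanning strategy. It is not the patented implementation. It assumes that the byte string `image` starts at the beginning of the volume (no partition table), and the function names and strategy labels are hypothetical. The signatures it checks are the standard on-disk ones: the ext magic number 0xEF53 at byte offset 1080, the FAT type strings in the boot sector, the "NTFS" OEM ID, and the XFS "XFSB" magic.

```python
from typing import Optional


def detect_fs_type(image: bytes) -> str:
    """Guess the file system type from well-known on-disk signatures."""
    if image[0:4] == b"XFSB":              # XFS superblock magic
        return "xfs"
    if image[3:11] == b"NTFS    ":         # NTFS OEM ID in the boot sector
        return "ntfs"
    if image[1080:1082] == b"\x53\xef":    # ext2/3/4 magic 0xEF53, little-endian
        return "ext"
    has_boot_signature = image[510:512] == b"\x55\xaa"
    # FAT12/FAT16 keep the type string at offset 54, FAT32 at offset 82.
    if has_boot_signature and (image[54:57] == b"FAT" or image[82:87] == b"FAT32"):
        return "fat"
    return "unknown"


# Hypothetical strategy labels: each one names where deleted-file data is sought.
SCAN_STRATEGIES = {
    "fat": "scan-directory-table",
    "ext": "scan-journal-or-inode-table",
}


def choose_scan_strategy(fs_type: str) -> Optional[str]:
    """Return the scanning strategy for a file system type, or None if unsupported."""
    return SCAN_STRATEGIES.get(fs_type)
```

A table lookup keeps the per-type scanning logic separate from detection, so support for further file systems such as NTFS or XFS could be added without changing the detection code.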
As an optional implementation manner, fig. 2 is a flowchart of steps of an optional data processing method based on a cloud hard disk according to an embodiment of the present invention, and as shown in fig. 2, when the file system is a first type of file system, the file system is scanned according to the corresponding scanning policy, and a data block of a deleted file in the file system is determined, including the following steps:
step S202, scanning preset bytes of the file name of the first type file system in a file directory table, and recording the file name and a data block pointer corresponding to the file name;
step S204, determining the data block of the deleted file according to the file name and the data block pointer corresponding to the file name.
In an alternative embodiment, the first type of file system may be, but is not limited to, a FAT (FAT12, FAT16, or FAT32) file system. In a FAT file system, the first byte of a deleted file's name is marked as hexadecimal E5 in the File Directory Table (FDT); the scan therefore looks for such entries and records each file name together with its corresponding data block pointer.
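The FAT scan described above can be sketched in Python as follows. This is an illustrative sketch, not the patented implementation. It assumes the raw bytes of one directory region have already been read into `directory`, and the function name is hypothetical. The field offsets are those of the standard 32-byte FAT short directory entry (attributes at byte 11, first cluster high word at 20, low word at 26, file size at 28).

```python
import struct
from typing import Dict, List

ENTRY_SIZE = 32        # every FAT directory entry is 32 bytes
DELETED_MARK = 0xE5    # first name byte of a deleted entry
ATTR_LONG_NAME = 0x0F  # attribute value of long-file-name entries


def find_deleted_entries(directory: bytes) -> List[Dict[str, object]]:
    """Return the name, first cluster and size of each deleted short entry."""
    found = []
    for off in range(0, len(directory) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = directory[off:off + ENTRY_SIZE]
        if entry[0] == 0x00:          # 0x00 marks the end of the directory
            break
        if entry[0] != DELETED_MARK:  # live entry, not of interest here
            continue
        if (entry[11] & 0x3F) == ATTR_LONG_NAME:
            continue                  # skip deleted long-name fragments
        # The first character was overwritten by 0xE5, so show it as "?".
        base = "?" + entry[1:8].decode("ascii", "replace").rstrip()
        ext = entry[8:11].decode("ascii", "replace").rstrip()
        (high,) = struct.unpack_from("<H", entry, 20)
        (low,) = struct.unpack_from("<H", entry, 26)
        (size,) = struct.unpack_from("<I", entry, 28)
        found.append({
            "name": base + ("." + ext if ext else ""),
            "first_cluster": (high << 16) | low,  # the data block pointer
            "size": size,
        })
    return found
```

The recorded first cluster is the data block pointer from which the deleted file's data blocks can then be located.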
As an optional implementation manner, fig. 3 is a flowchart of steps of an optional data processing method based on a cloud hard disk according to an embodiment of the present invention, and as shown in fig. 3, when the file system is a second type file system, the file system is scanned according to the corresponding scanning policy, and a data block of a deleted file in the file system is determined, including the following steps:
step S302, scanning the operation log of the second type file system, or scanning the storage area of the metadata corresponding to the existing file in the second type file system, to obtain the storage area of the metadata corresponding to the deleted file;
step S304, determining the data block of the deleted file according to the storage area of the metadata corresponding to the deleted file.
In an alternative embodiment, the second type of file system may be, but is not limited to, an ext file system. If the file system is an ext (ext3, ext4, etc.) file system, the operation log is scanned to find the storage area of the metadata (e.g., inode information records) corresponding to the deleted file. Alternatively, the storage area of the existing metadata can be scanned in order to infer, in reverse, the data blocks that no longer have an inode pointing to them.
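The reverse derivation mentioned above can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the patented implementation. It assumes that the set of block numbers referenced by existing inodes (`live_blocks`) has already been collected by scanning the metadata area, and that a block containing only zero bytes holds nothing worth recovering. The function name is hypothetical.

```python
from typing import List, Sequence, Set


def candidate_deleted_blocks(blocks: Sequence[bytes], live_blocks: Set[int]) -> List[int]:
    """Return numbers of non-empty blocks that no existing inode references."""
    return [
        block_no
        for block_no, data in enumerate(blocks)
        if block_no not in live_blocks and any(data)  # any(data): has a non-zero byte
    ]
```

Blocks returned by this function are only candidates; attributing them to a particular deleted file still requires the journal or leftover metadata.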
In addition, the file system may be, but not limited to, the first type file system (i.e., FAT file system) and the second type file system (i.e., ext file system), but may also be an NTFS file system, an XFS file system, or the like.
In an optional embodiment, when the operation instruction is a delete instruction, processing the data block of the deleted file according to the received operation instruction includes: determining the fragments corresponding to the data blocks of the deleted files; and recording the data information of the fragments and deleting the fragments.
In the alternative embodiment provided by the present application, but not limited to, an instruction selection interface or an instruction input interface may be provided for the user to select or input the above operation instruction.
Optionally, the delete instruction may be a secure deletion instruction; secure deletion means that the deleted file data can no longer be recovered. The data information of a fragment may include, but is not limited to: the fragment's object number (object-no), object offset, length, etc.
In this application, based on the deleted data block portion obtained in the above optional embodiment, the fragment (i.e., stripe) of the CEPH bottom layer corresponding to the data block of the deleted file may be calculated, where it should be noted that one data block may correspond to one or more fragments.
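A minimal Python sketch of this calculation follows. It assumes a simple layout in which the block device is cut into fixed-size objects, with no additional striping across objects. The 4 MiB object size is an assumption (it is a common RBD default), and the type and function names are hypothetical rather than CEPH APIs.

```python
from typing import List, NamedTuple


class Extent(NamedTuple):
    object_no: int  # sequence number of the underlying object (fragment)
    offset: int     # byte offset inside that object
    length: int     # number of bytes covered inside that object


def block_range_to_extents(offset: int, length: int,
                           object_size: int = 4 * 1024 * 1024) -> List[Extent]:
    """Split a byte range of the block device into per-object extents."""
    extents = []
    end = offset + length
    while offset < end:
        object_no = offset // object_size
        in_object = offset % object_size
        # Take bytes up to the end of this object or the end of the range.
        chunk = min(object_size - in_object, end - offset)
        extents.append(Extent(object_no, in_object, chunk))
        offset += chunk
    return extents
```

For example, a 300-byte range starting 100 bytes before a 4 MiB boundary yields two extents, which matches the statement that one data block may correspond to one or more fragments.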
To implement secure deletion, the fragments corresponding to the data blocks of the deleted file may be deleted. Before secure deletion is executed, the data information of each fragment may be recorded in advance; the bits of the fragment are then set to zero, i.e., the fragment is deleted, and its address is cleared and released.
It should be noted that, in the embodiment of the present application, the fragment data may be recorded in an ObjectExtent data structure; this is only an exemplary description, and the present application is not specifically limited thereto.
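The record-then-zero sequence can be sketched in Python as follows. The object store is modelled as an in-memory dictionary of byte arrays purely for illustration; a real system would issue writes to the storage cluster instead. All names here are hypothetical.

```python
from typing import Dict, List, Tuple

# (object_no, offset, length), mirroring the fragment data information above
ExtentInfo = Tuple[int, int, int]


def secure_delete(objects: Dict[int, bytearray],
                  extents: List[ExtentInfo]) -> List[ExtentInfo]:
    """Record each fragment's data information, then overwrite it with zeros."""
    recorded = []
    for object_no, offset, length in extents:
        recorded.append((object_no, offset, length))  # record before deleting
        obj = objects.get(object_no)
        if obj is None:
            continue  # nothing stored for this object
        end = min(offset + length, len(obj))
        if end > offset:
            obj[offset:end] = bytes(end - offset)  # zero-fill in place
    return recorded
```

Returning the recorded information keeps it available for the later step of re-establishing the mapping between fragments and file system data blocks.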
In an optional embodiment, in a case that the operation instruction is a recovery instruction, processing the data block of the deleted file according to the received operation instruction includes: acquiring a predetermined path for storing recovery data; determining a deleted file to be restored according to the pointer relation and the incomplete metadata of the data block of the deleted file; and restoring the deleted file to be restored to the predetermined path.
Optionally, the recovery instruction is an instruction for recovering deleted data; the predetermined path may be: a saving path which is selected or set by a user in advance and is used for storing recovery data; the predetermined path may be a storage pool established separately, an object storage, or a storage path local to the distributed file system.
In an optional embodiment provided by the present application, in a case that an operation instruction input or selected by a user is a restore instruction, a save path for storing restore data that is selected or set in advance by the user may be obtained, and then the deleted file to be restored is restored to the specified save path according to pointer relationships of all data block portions of the deleted file and the incomplete metadata portion.
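The recovery step can be illustrated with the following Python sketch. It assumes that the ordered list of data block pointers and the file size have already been reconstructed from the pointer relationships and the incomplete metadata, and that the predetermined path is a local directory. The function and parameter names are hypothetical.

```python
import os
from typing import List, Sequence


def recover_deleted_file(blocks: Sequence[bytes], block_pointers: List[int],
                         file_size: int, dest_dir: str, file_name: str) -> str:
    """Reassemble a deleted file from its data blocks and save it to dest_dir."""
    # Concatenate the blocks in pointer order; the last block may contain
    # padding, so truncate to the size recorded in the metadata.
    data = b"".join(blocks[n] for n in block_pointers)[:file_size]
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, file_name)
    with open(dest, "wb") as out:
        out.write(data)
    return dest
```

Writing to a separate destination rather than back into the scanned file system avoids overwriting other data blocks that are still awaiting recovery.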
In addition, once the deleted file has been restored to the predetermined path, the deleted data blocks no longer serve any purpose, so they can be deleted and released promptly. The fragments corresponding to the data blocks of the deleted file may be deleted: before secure deletion is performed, the data information of each fragment may be recorded in advance, and the bits of the fragment are then set to zero, i.e., the fragment is deleted, and its address is cleared and released.
As an optional embodiment, after deleting the fragment, the method further includes: and establishing a mapping relation between the fragments and the data blocks in the file system according to the data information of the fragments.
Optionally, after the fragment is deleted, a mapping relationship between the fragments and the data blocks in the file system may be established according to the fragment data information recorded before the secure deletion was performed. For example, the function file_to_extents() may be used to perform the remapping, so that the mapping relationship between the fragments and the data blocks in the file system is established and the discarded data blocks are deleted, thereby optimizing the storage space of the cloud hard disk.
In addition, in the embodiment of the present application, besides secure deletion and timely recovery of data in a cloud hard disk, secure deletion and recovery of the cloud hard disk itself can also be performed. Fig. 4 is a flowchart of the steps of an optional process for handling a cloud hard disk according to an embodiment of the present invention; as shown in Fig. 4, secure deletion and recovery of the cloud hard disk proceed as follows:
step S401: adding a field of a deletion mark to an object in the RBD in advance;
step S403: it is determined whether the RBD is to be completely deleted.
In the step S403, if the determination result is that the user needs to completely delete the RBD, step S405 is executed, otherwise, step S407 is executed.
Step S405: and completely deleting the RBD in the cloud hard disk, and emptying all data of the RBD.
Step S407: setting the deletion flag to 0 to represent a normal state, and restoring the number of copies to the number originally stored in the storage pool.
In an alternative embodiment, based on the above steps S401 to S407, a deletion flag field may be added in advance to the objects in the RBD. If the RBD is deleted, then no matter how many copies are stored in the storage pool (POOL) corresponding to the RBD, only one copy is retained, and the deletion flag is set to 1, which indicates that the RBD has been deleted. The administrator can also be reminded regularly to decide whether the RBD should be deleted completely, so that storage space of the file system is not wasted unnecessarily. In another optional embodiment, if the user needs to delete the RBD in the cloud hard disk completely, all data of the RBD is actually emptied; and if the user needs to recover the RBD in the cloud hard disk, the deletion flag is set to 0 to represent a normal state, and the number of copies is restored to the number originally stored in the storage pool.
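The flag-based flow of steps S401 to S407 can be sketched in Python as follows. This models only the state transitions on an in-memory record; it does not call any CEPH interface, and the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RbdImage:
    name: str
    pool_copies: int                 # copy count configured on the storage pool
    copies: int = 0                  # copies currently kept for this image
    deleted: int = 0                 # deletion flag: 0 = normal, 1 = deleted
    objects: Dict[int, bytes] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.copies == 0:
            self.copies = self.pool_copies  # start with the pool's copy count


def soft_delete(image: RbdImage) -> None:
    """User deletes the RBD: keep a single copy and set the flag to 1."""
    image.deleted = 1
    image.copies = 1


def restore(image: RbdImage) -> None:
    """Step S407: clear the flag and return to the pool's copy count."""
    image.deleted = 0
    image.copies = image.pool_copies


def purge(image: RbdImage) -> None:
    """Step S405: completely delete the RBD by emptying all of its data."""
    image.objects.clear()
    image.copies = 0
```

Keeping one copy after a soft delete preserves the possibility of recovery while releasing the space held by the redundant copies.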
Example 2
An embodiment of the present invention further provides an apparatus for implementing the above data processing method based on a cloud hard disk. Fig. 5 is a schematic structural diagram of a data processing apparatus based on a cloud hard disk according to an embodiment of the present invention. As shown in Fig. 5, the apparatus includes a first determining module 50, a second determining module 52, and a processing module 54, wherein:
the first determining module 50 is configured to determine the type of a file system in a cloud hard disk and to determine a scanning strategy corresponding to the type, where the scanning strategy is used for determining the storage position of data to be scanned; the second determining module 52 is configured to scan the file system according to the corresponding scanning strategy and to determine the data blocks of deleted files in the file system; and the processing module 54 is configured to process the data blocks of the deleted files according to the received operation instruction.
It should be noted that the first determining module 50, the second determining module 52, and the processing module 54 correspond to steps S102 to S106 in embodiment 1. The examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to the disclosure of embodiment 1. The modules described above may be implemented in a computer terminal as part of an apparatus.
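As an illustration of the first determining module, the following Python sketch picks a scanning strategy from the file-system type. The patent does not name the concrete file systems, so the sketch assumes, purely for illustration, that the "first type" is FAT32 (identified by the `FAT32   ` string at byte offset 82 of the boot sector) and the "second type" is an ext-family journaling file system (identified by the superblock magic 0xEF53 at byte offset 1080). The function names and strategy labels are hypothetical.

```python
import struct

# Hypothetical labels for the two scanning strategies in the description.
STRATEGIES = {
    "first": "scan directory table",   # first-type file system
    "second": "scan operation log",    # second-type file system
}


def detect_fs_type(volume: bytes) -> str:
    """Classify a raw volume image as 'first' or 'second' type.

    Assumption: first type = FAT32, second type = ext2/3/4.
    """
    if len(volume) >= 90 and volume[82:90] == b"FAT32   ":
        return "first"
    if len(volume) >= 1082:
        # ext superblock starts at byte 1024; s_magic is at offset 56 in it.
        (magic,) = struct.unpack_from("<H", volume, 1080)
        if magic == 0xEF53:
            return "second"
    raise ValueError("unsupported or unrecognised file system")


def select_strategy(volume: bytes) -> str:
    """Return the scanning strategy that corresponds to the detected type."""
    return STRATEGIES[detect_fs_type(volume)]
```

Keeping the strategies in a lookup table means that supporting another file-system type would only require a new detection rule and a new table entry.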
In an optional embodiment, when the file system is a first type of file system, the second determining module includes: the recording submodule is used for scanning preset bytes of the file name of the first type of file system in a file directory table and recording the file name and a data block pointer corresponding to the file name; and the first determining submodule is used for determining the data block of the deleted file according to the file name and the data block pointer corresponding to the file name.
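The directory-table scan can be sketched as follows, assuming the first-type file system uses FAT32-style 32-byte directory entries, in which a first name byte of 0xE5 marks a deleted entry and the starting cluster is split across offsets 20 and 26. This layout is an assumption about what "predetermined byte" refers to; the patent does not specify it, and `find_deleted_entries` is a hypothetical helper name.

```python
import struct
from typing import List, Tuple

ENTRY_SIZE = 32        # size of one FAT directory entry in bytes
DELETED_MARK = 0xE5    # first name byte of a deleted entry
ATTR_LONG_NAME = 0x0F  # attribute value of long-file-name entries


def find_deleted_entries(directory: bytes) -> List[Tuple[str, int]]:
    """Return (file name, first cluster) for each deleted short-name entry."""
    found = []
    for off in range(0, len(directory) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = directory[off:off + ENTRY_SIZE]
        if entry[0] == 0x00:
            break  # 0x00 marks the end of the directory
        if entry[0] != DELETED_MARK or entry[11] == ATTR_LONG_NAME:
            continue  # live entry or long-name fragment
        # The first character was overwritten by the deletion mark.
        base = "?" + entry[1:8].decode("ascii", "replace").rstrip()
        ext = entry[8:11].decode("ascii", "replace").rstrip()
        (high,) = struct.unpack_from("<H", entry, 20)
        (low,) = struct.unpack_from("<H", entry, 26)
        name = f"{base}.{ext}" if ext else base
        found.append((name, (high << 16) | low))
    return found
```

The recorded first cluster plays the role of the data block pointer: it is the starting point from which the deleted file's data blocks can be located.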
In an optional embodiment, when the file system is a second type file system, the second determining module further includes: the scanning submodule is used for scanning the operation log of the second type file system or scanning the storage area of the metadata corresponding to the existing file in the second type file system to obtain the storage area of the metadata corresponding to the deleted file; and the second determining submodule is used for determining the data blocks of the deleted files according to the storage areas of the metadata corresponding to the deleted files.
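For the log-based alternative, a minimal sketch is given below. The record format (`LogRecord` with `op`, `inode`, and `blocks`) is entirely hypothetical and far simpler than a real journal; it only illustrates the idea of replaying an operation log to find metadata, and from it the data blocks, of files that were deleted.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class LogRecord(NamedTuple):
    """Hypothetical simplified journal record."""
    op: str                  # "create" or "unlink"
    inode: int               # identifier of the file's metadata
    blocks: Tuple[int, ...]  # data blocks recorded for the file


def deleted_blocks_from_log(records: Iterable[LogRecord]) -> Dict[int, List[int]]:
    """Replay the log and return {inode: data blocks} for deleted files."""
    known: Dict[int, List[int]] = {}
    deleted: Dict[int, List[int]] = {}
    for rec in records:
        if rec.op == "create":
            known[rec.inode] = list(rec.blocks)
            # A reused inode no longer describes the earlier deleted file.
            deleted.pop(rec.inode, None)
        elif rec.op == "unlink":
            deleted[rec.inode] = known.pop(rec.inode, [])
    return deleted
```

With the records `create 12`, `create 13`, `unlink 12`, only inode 12 and its blocks are reported as deleted.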
It should be noted that the above modules may be implemented by software or by hardware. In the latter case, for example, the modules may all be located in the same processor, or they may be distributed across different processors in any combination.
It should be noted that, for alternative or preferred implementations of this embodiment, reference may be made to the relevant description in embodiment 1, and details are not repeated here. Any optional or preferred data processing method based on a cloud hard disk in embodiment 1 may be executed or implemented in the data processing apparatus based on a cloud hard disk provided in this embodiment.
The cloud hard disk-based data processing apparatus may further include a processor and a memory, where the first determining module 50, the second determining module 52, the processing module 54, and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to implement corresponding functions.
The processor includes one or more cores, and a core calls the corresponding program unit from the memory. The memory may include computer-readable media in the form of volatile memory, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
The embodiment of the application also provides a storage medium. Optionally, in this embodiment, the storage medium includes a stored program, and when the program runs, the device on which the storage medium is located is controlled to execute any one of the data processing methods based on the cloud hard disk.
Optionally, in this embodiment, the storage medium may be located in any one of computer terminals in a computer terminal group in a computer network, or in any one of mobile terminals in a mobile terminal group.
The embodiment of the application also provides a processor. Optionally, in this embodiment, the processor is configured to execute a program, where the program executes any one of the data processing methods based on the cloud hard disk when running.
An embodiment of the present application provides a device that includes a processor, a memory, and a program stored in the memory and executable on the processor. When the processor executes the program, the following steps are implemented: determining the type of a file system in a cloud hard disk, and determining a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned; scanning the file system according to the corresponding scanning strategy, and determining the data blocks of deleted files in the file system; and processing the data blocks of the deleted files according to the received operation instructions.
Optionally, when the processor executes a program, it may further scan a predetermined byte of a file name of the first type file system in a file directory table, and record the file name and a data block pointer corresponding to the file name; and determining the data block of the deleted file according to the file name and the data block pointer corresponding to the file name.
Optionally, when the processor executes a program, the processor may further scan an operation log of the second type file system, or scan a storage area of metadata corresponding to an existing file in the second type file system, to obtain a storage area of the metadata corresponding to the deleted file; and determining the data block of the deleted file according to the storage area of the metadata corresponding to the deleted file.
Optionally, when the processor executes a program, it may further determine a fragment corresponding to the data block of the deleted file; and recording the data information of the fragments and deleting the fragments.
Optionally, when the processor executes the program, it may further acquire a predetermined path for storing the recovered data; determine a deleted file to be restored according to the pointer relation and the incomplete metadata of the data blocks of the deleted file; and restore the deleted file to be restored to the predetermined path.
Optionally, when the processor executes a program, a mapping relationship between the fragments and the data blocks in the file system may be established according to the data information of the fragments.
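The mapping between fragments and file-system data blocks can be sketched with simple integer arithmetic. The sketch assumes that the cloud hard disk is split into fixed-size fragments (4 MiB is used here as a default, which matches the usual RBD object size), that the file system uses 4096-byte blocks, and that the file system starts at offset 0 of the disk. None of these values comes from the patent, and `fragments_for_blocks` is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def fragments_for_blocks(
    blocks: Iterable[int],
    block_size: int = 4096,
    fragment_size: int = 4 * 1024 * 1024,
) -> Dict[int, List[int]]:
    """Map each fragment index to the deleted data blocks it contains."""
    if fragment_size % block_size != 0:
        raise ValueError("fragment size must be a multiple of the block size")
    blocks_per_fragment = fragment_size // block_size
    mapping: Dict[int, List[int]] = defaultdict(list)
    for block in sorted(set(blocks)):
        mapping[block // blocks_per_fragment].append(block)
    return dict(mapping)
```

With the default sizes, each fragment holds 1024 blocks, so block 1023 falls in fragment 0 and block 1024 in fragment 1. A practical implementation would presumably also need to check whether a fragment still holds blocks of live files before deleting it; that check is outside this sketch.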
The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: determining the type of a file system in a cloud hard disk, and determining a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned; scanning the file system according to the corresponding scanning strategy, and determining the data blocks of deleted files in the file system; and processing the data blocks of the deleted files according to the received operation instructions.
Optionally, when the computer program product executes a program, it may further scan a predetermined byte of a file name of the first type file system in a file directory table, and record the file name and a data block pointer corresponding to the file name; and determine the data block of the deleted file according to the file name and the data block pointer corresponding to the file name.
Optionally, when the computer program product executes a program, the computer program product may further scan an operation log of the second type file system, or scan a storage area of metadata corresponding to an existing file in the second type file system, to obtain a storage area of the metadata corresponding to the deleted file; and determining the data block of the deleted file according to the storage area of the metadata corresponding to the deleted file.
Optionally, when the computer program product executes a program, a fragment corresponding to the data block of the deleted file may also be determined; and recording the data information of the fragments and deleting the fragments.
Optionally, when the computer program product executes a program, it may further acquire a predetermined path for storing the recovered data; determine a deleted file to be restored according to the pointer relation and the incomplete metadata of the data blocks of the deleted file; and restore the deleted file to be restored to the predetermined path.
Optionally, when the computer program product executes a program, a mapping relationship between the fragments and the data blocks in the file system may be established according to the data information of the fragments.
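The recovery step described above (following the pointer relation of the data blocks and writing the result to a predetermined path) might look like the following Python sketch. The in-memory `blocks` and `next_block` dictionaries stand in for the real on-disk structures, and the function name and parameters are hypothetical; the file size is assumed to come from the incomplete metadata.

```python
import os
from typing import Dict, Optional


def restore_file(
    blocks: Dict[int, bytes],
    next_block: Dict[int, Optional[int]],
    first: int,
    size: int,
    out_dir: str,
    name: str,
) -> str:
    """Rebuild a deleted file from its block chain and write it to out_dir."""
    data = bytearray()
    seen = set()
    current: Optional[int] = first
    # Follow the pointer chain until it ends, loops, or enough data is read.
    while current is not None and current not in seen and len(data) < size:
        seen.add(current)
        data += blocks[current]
        current = next_block.get(current)
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, name)
    with open(path, "wb") as handle:
        handle.write(bytes(data[:size]))  # trim padding in the last block
    return path
```

Writing the recovered file to a separate predetermined path, rather than back into the scanned file system, avoids overwriting other deleted data that has not yet been recovered.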
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (11)

1. A data processing method based on a cloud hard disk, characterized by comprising:
determining the type of a file system in a cloud hard disk, and determining a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned;
scanning the file system according to the corresponding scanning strategy, and determining the data blocks of deleted files in the file system;
processing the data blocks of the deleted files according to a received operation instruction;
wherein the method further implements secure deletion and recovery of the cloud hard disk as follows: adding a deletion-flag field to the objects in an RBD in advance; determining whether the RBD is to be completely deleted; if the determination result is that the user needs to completely delete the RBD, completely deleting the RBD and emptying all data of the RBD; otherwise, setting the deletion flag to 0, indicating a normal state, and restoring the number of copies to the number of copies originally stored in the storage pool.

2. The method according to claim 1, characterized in that, when the file system is a first-type file system, scanning the file system according to the corresponding scanning strategy and determining the data blocks of deleted files in the file system comprises:
scanning a predetermined byte of a file name of the first-type file system in a file directory table, and recording the file name and a data block pointer corresponding to the file name;
determining the data block of the deleted file according to the file name and the data block pointer corresponding to the file name.

3. The method according to claim 1, characterized in that, when the file system is a second-type file system, scanning the file system according to the corresponding scanning strategy and determining the data blocks of deleted files in the file system comprises:
scanning an operation log of the second-type file system, or scanning the storage area of the metadata corresponding to existing files in the second-type file system, to obtain the storage area of the metadata corresponding to the deleted file;
determining the data block of the deleted file according to the storage area of the metadata corresponding to the deleted file.

4. The method according to claim 1, characterized in that, when the operation instruction is a deletion instruction, processing the data blocks of the deleted files according to the received operation instruction comprises:
determining the fragment corresponding to the data block of the deleted file;
recording the data information of the fragment, and deleting the fragment.

5. The method according to claim 1, characterized in that, when the operation instruction is a restoration instruction, processing the data blocks of the deleted files according to the received operation instruction comprises:
acquiring a predetermined path for storing recovered data;
determining a deleted file to be restored according to the pointer relation and the incomplete metadata of the data blocks of the deleted file;
restoring the deleted file to be restored to the predetermined path.

6. The method according to claim 4, characterized in that, after deleting the fragment, the method further comprises:
establishing a mapping relationship between the fragment and the data blocks in the file system according to the data information of the fragment.

7. A data processing apparatus based on a cloud hard disk, characterized by comprising:
a first determining module, configured to determine the type of a file system in a cloud hard disk and to determine a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned;
a second determining module, configured to scan the file system according to the corresponding scanning strategy and to determine the data blocks of deleted files in the file system;
a processing module, configured to process the data blocks of the deleted files according to a received operation instruction;
wherein the apparatus further implements secure deletion and recovery of the cloud hard disk as follows: adding a deletion-flag field to the objects in an RBD in advance; determining whether the RBD is to be completely deleted; if the determination result is that the user needs to completely delete the RBD, completely deleting the RBD and emptying all data of the RBD; otherwise, setting the deletion flag to 0, indicating a normal state, and restoring the number of copies to the number of copies originally stored in the storage pool.

8. The apparatus according to claim 7, characterized in that, when the file system is a first-type file system, the second determining module comprises:
a recording submodule, configured to scan a predetermined byte of a file name of the first-type file system in a file directory table, and to record the file name and a data block pointer corresponding to the file name;
a first determining submodule, configured to determine the data block of the deleted file according to the file name and the data block pointer corresponding to the file name.

9. The apparatus according to claim 7, characterized in that, when the file system is a second-type file system, the second determining module further comprises:
a scanning submodule, configured to scan an operation log of the second-type file system, or to scan the storage area of the metadata corresponding to existing files in the second-type file system, to obtain the storage area of the metadata corresponding to the deleted file;
a second determining submodule, configured to determine the data block of the deleted file according to the storage area of the metadata corresponding to the deleted file.

10. A storage medium, characterized in that the storage medium comprises a stored program, wherein the program executes the following method steps: determining the type of a file system in a cloud hard disk, and determining a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned; scanning the file system according to the corresponding scanning strategy, and determining the data blocks of deleted files in the file system; processing the data blocks of the deleted files according to a received operation instruction; wherein secure deletion and recovery of the cloud hard disk are further implemented as follows: adding a deletion-flag field to the objects in an RBD in advance; determining whether the RBD is to be completely deleted; if the determination result is that the user needs to completely delete the RBD, completely deleting the RBD and emptying all data of the RBD; otherwise, setting the deletion flag to 0, indicating a normal state, and restoring the number of copies to the number of copies originally stored in the storage pool.

11. A processor, characterized in that the processor is configured to run a program, wherein the program, when running, executes the following method steps: determining the type of a file system in a cloud hard disk, and determining a scanning strategy corresponding to the type, wherein the scanning strategy is used for determining the storage position of data to be scanned; scanning the file system according to the corresponding scanning strategy, and determining the data blocks of deleted files in the file system; processing the data blocks of the deleted files according to a received operation instruction; wherein secure deletion and recovery of the cloud hard disk are further implemented as follows: adding a deletion-flag field to the objects in an RBD in advance; determining whether the RBD is to be completely deleted; if the determination result is that the user needs to completely delete the RBD, completely deleting the RBD and emptying all data of the RBD; otherwise, setting the deletion flag to 0, indicating a normal state, and restoring the number of copies to the number of copies originally stored in the storage pool.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711298448.6A CN108170372B (en) 2017-12-08 2017-12-08 Data processing method and device based on cloud hard disk

Publications (2)

Publication Number Publication Date
CN108170372A CN108170372A (en) 2018-06-15
CN108170372B true CN108170372B (en) 2021-02-19

Family

ID=62525628

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711298448.6A Active CN108170372B (en) 2017-12-08 2017-12-08 Data processing method and device based on cloud hard disk

Country Status (1)

Country Link
CN (1) CN108170372B (en)

Also Published As

Publication number Publication date
CN108170372A (en) 2018-06-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Data processing method and device based on cloud disk

Granted publication date: 20210219

Pledgee: Xiamen Bank Co.,Ltd.

Pledgor: XIAMEN JIWEI TECHNOLOGY CO.,LTD.

Registration number: Y2024980052444
