CN114416690A - Data migration device, method and storage medium for file storage to object storage - Google Patents


Info

Publication number
CN114416690A
Authority
CN
China
Prior art keywords
migration
file
storage
migration task
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111585376.XA
Other languages
Chinese (zh)
Inventor
罗成凤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd
Priority to CN202111585376.XA
Publication of CN114416690A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/214: Database migration support
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/16: File or folder operations, e.g. details of user interfaces specifically adapted to file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data migration device, method, and storage medium for migrating file storage to object storage, addressing the low efficiency of such migration. A remote NFS server is mounted to an NFS proxy on the local storage side; under the control of a migration task, the object storage gateway reads the contents of the files to be migrated directly from the NFS proxy, converts them into object storage format, and stores them in the target object storage.

Description

Data migration device, method and storage medium for file storage to object storage
Technical Field
The present invention relates to the field of communications technologies, and in particular to a data migration apparatus, a data migration method, and a storage medium for migrating file storage to object storage.
Background
Simple Storage Service (S3) is an Internet-oriented object storage service. As cloud infrastructure becomes widespread, enterprises need to convert large volumes of existing data from traditional file storage to cloud storage, typified by S3 object storage, so that the data sources of a data center are shared and unified.
At present, migrating file storage to object storage requires third-party software such as Rclone or Rsync. Client software must be installed externally for the migration, a conventional read-write mode is used, the Input/Output (IO) path is very long, and incremental migration is not supported.
When third-party software migrates data, the source file storage data must first be read to the host where the software runs, and that host then writes it to the target S3 object storage, so both the read and the write consume host resources and network bandwidth. Such software usually reads and writes data over a general-purpose protocol and does not record migration task data; migration is slow and cannot meet business requirements.
Disclosure of Invention
In view of the above, the present invention provides a data migration apparatus, method, and storage medium for migrating file storage to object storage, which address the low efficiency of data migration from file storage to object storage.
Fig. 1 is a schematic structural diagram of a data migration apparatus for migrating file storage to object storage according to the present invention, where the apparatus 100 includes:
a migration task module 110, configured to customize and execute a migration task, where the migration task calls the local file storage agent module 130 through an interface provided by the file system abstraction module 120 to read source file storage data, and calls the object storage gateway module 140 through an interface provided by the file system abstraction module 120 to write the read source file storage data into a local object storage;
a file system abstraction module 120, configured to provide a file system abstraction layer access interface for the upper layer;
a file storage agent module 130, configured to provide the file system abstraction module 120 with an access interface to the source file storage data, mount the Network File System (NFS) directory of the source file storage under the control of the migration task, and read and cache the source file storage data;
and an object storage gateway module 140, configured to provide an object storage access interface for the file system abstraction module 120, read the file storage data cached in the file storage agent module 130 under the control of the migration task, convert the read data into object storage format, and write it into the object storage.
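The division of labor among the four modules can be sketched in a few lines. This is only an illustration under stated assumptions: the class names, method names, and the in-memory dictionaries standing in for the NFS source and the object store are all hypothetical and do not come from the patent or from NFS-Ganesha.

```python
from typing import Dict, List


class InMemoryAgent:
    """Hypothetical stand-in for the file storage agent module (130)."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files               # pretend this is the mounted NFS directory
        self.cache: Dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        # Cache on first read so later readers are served locally.
        if path not in self.cache:
            self.cache[path] = self._files[path]
        return self.cache[path]


class InMemoryGateway:
    """Hypothetical stand-in for the object storage gateway module (140)."""

    def __init__(self) -> None:
        self.buckets: Dict[str, Dict[str, bytes]] = {}

    def put(self, bucket: str, key: str, data: bytes) -> None:
        self.buckets.setdefault(bucket, {})[key] = data


class Abstraction:
    """Hypothetical stand-in for the file system abstraction module (120)."""

    def __init__(self, agent: InMemoryAgent, gateway: InMemoryGateway) -> None:
        self._agent = agent
        self._gateway = gateway

    def read_source(self, path: str) -> bytes:
        return self._agent.read(path)

    def write_object(self, bucket: str, key: str, data: bytes) -> None:
        self._gateway.put(bucket, key, data)


def run_migration_task(fsal: Abstraction, paths: List[str], bucket: str) -> int:
    """Hypothetical stand-in for the migration task module (110)."""
    for path in paths:
        data = fsal.read_source(path)
        # Assumption: the object key is the file path without its leading slash.
        fsal.write_object(bucket, path.lstrip("/"), data)
    return len(paths)
```

The migration task only ever talks to the abstraction layer, which mirrors the description that both the agent and the gateway are reached through interfaces provided by module 120.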
Further, the migration task module 110 is further configured to create a migration thread pool, and a migration thread in the migration thread pool executes the migration task; the migration task module 110 executes the migration task based on the preconfigured IP address of the source file storage and Bucket name of the destination object storage.
Further, the execution modes of the migration task include a full migration mode and an incremental migration mode. When the full migration mode is used, a list of all files in the mounted NFS directory is first obtained, all files in the list are then added to the migration task, and during execution of the migration task the contents of all files in the mounted NFS directory of the source file storage are read in full through the file storage agent module.
Further, when the migration task uses the incremental migration mode, a list of all files in the mounted NFS directory is first obtained and filtered according to the modification time of each file in the list and the execution time of the last migration task, the files modified after the last migration are added to the migration task, and during execution of the migration task the contents of the files in the mounted NFS directory of the source file storage that have changed since the last migration are read incrementally through the file storage agent module.
Based on the embodiment of the invention, the invention also provides a data migration method from file storage to object storage, which comprises the following steps:
S1, customizing and executing a migration task;
S2, under the control of the migration task, the migration task calls a local file storage agent module through an interface provided by a file system abstraction module, and the file storage agent module mounts the Network File System (NFS) directory of the source file storage and reads and caches the source file storage data;
and S3, under the control of the migration task, the migration task calls an object storage gateway module through an interface provided by the file system abstraction module, and the object storage gateway module reads the file storage data cached in the file storage agent module, converts it into object storage format, and writes it into the object storage.
Further, the method also includes creating a migration thread pool for the migration task, a migration thread in the migration thread pool executing the migration task; and executing the migration task based on the IP address of the source file storage and the Bucket name of the destination object storage.
The invention provides a data migration device and method from file storage to object storage, addressing the low efficiency of migrating file storage to object storage. A remote NFS server is mounted to an NFS proxy on the local storage side; under the control of the migration task, the object storage gateway reads the contents of the files to be migrated directly from the NFS proxy, converts them into object storage format, and stores them in the target object storage.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments of the present invention or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present invention, and for those skilled in the art, other drawings may be obtained according to the drawings of the embodiments of the present invention.
FIG. 1 is a schematic structural diagram of a data migration apparatus for migrating file storage to object storage according to the present invention;
FIG. 2 is a block diagram illustrating a framework for data migration from file storage to object storage according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device for implementing a data migration method from file storage to object storage according to an embodiment of the present invention.
Detailed Description
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the invention. As used in this embodiment of the invention, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used to describe various information in embodiments of the present invention, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, the first information may also be referred to as second information, and similarly, the second information may also be referred to as first information, without departing from the scope of embodiments of the present invention. Moreover, depending on the context, the word "if" as used may be interpreted as "at the time of …", "when …", or "in response to a determination".
The invention aims to provide a data migration method, apparatus, device, and storage medium for migrating file storage to object storage that reduce the network bandwidth occupied by data migration and improve migration efficiency. The basic idea of the invention is to mount the remote NFS server to an independent software module/component on the local storage side, so that data is read and written between internal modules/components and the IO path of the data flow is shortened, ultimately migrating NFS file data to S3 object storage.
According to the technical scheme provided by the embodiment of the invention, the data flow is carried out between the two software modules/components in the local storage, so that the IO path is shortened and the migration speed is increased; in addition, in the embodiment of the invention, the information of each migration task can be recorded, and the incremental migration is supported. The invention can improve the migration speed and efficiency from the file storage to the object storage.
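Recording each migration task is what makes incremental migration possible, because the next run needs to know when the previous one executed. The following is a minimal sketch of such a record, assuming a JSON-lines log file; the file format, field names, and function names are hypothetical and are not described in the patent.

```python
import json
from typing import Optional


def record_task(log_path: str, started_at: float, mode: str, file_count: int) -> None:
    """Append one migration task record as a JSON line (hypothetical format)."""
    record = {"started_at": started_at, "mode": mode, "file_count": file_count}
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")


def last_run_time(log_path: str) -> Optional[float]:
    """Return the start time of the most recent recorded task, or None if none exists."""
    try:
        with open(log_path, encoding="utf-8") as log:
            times = [json.loads(line)["started_at"] for line in log if line.strip()]
    except FileNotFoundError:
        return None
    return max(times) if times else None
```

An incremental run would call `last_run_time` before listing files and `record_task` once the run has been scheduled.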
Fig. 2 is a schematic structural diagram of a framework for data migration from file storage to object storage according to an embodiment of the present invention. In this example, the data migration apparatus 100 is implemented by extending NFS-Ganesha: Migrationd corresponds to the migration task module 110, the File System Abstraction Layer (FSAL) corresponds to the file system abstraction module 120, NFS Proxy (Proxy for short) corresponds to the file storage agent module 130, and FSAL_RGW corresponds to the object storage gateway module 140.
NFS-Ganesha is a file service software module/component based on the Network File System (NFS) that can run in user mode on most Unix/Linux-based operating systems. Through the File System Abstraction Layer (FSAL), NFS-Ganesha abstracts back-end storage into a uniform application programming interface (API) and provides the Ganesha server with a storage access interface. RGW is the object storage access gateway inside Ceph, and FSAL_RGW is an object storage access gateway that supports FSAL, enabling FSAL to access Ceph clusters through a standard object storage API. RGW supports the S3 API, the Swift API, and others. The NFS Proxy is an NFS protocol proxy used to cache data transmitted over the NFS protocol so that other modules can read it quickly.
In this embodiment, a Migrationd background daemon module is added to NFS-Ganesha; this process is responsible for handling all migration tasks. When data needs to be migrated, a migration thread pool (DP Thread Pools) is started to execute the migration task and perform the data reads and writes. The DP Thread Pools read the source file storage data by calling Proxy through the interface provided by FSAL, and write the data into the local S3 object storage through FSAL_RGW.
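The thread-pool idea can be illustrated with Python's standard `concurrent.futures` module. This is a sketch, not the NFS-Ganesha implementation: `migrate_with_pool` and the `migrate_one` callback are hypothetical names, and the callback is assumed to perform the read-then-write for a single file and return the number of bytes moved.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def migrate_with_pool(paths: Iterable[str],
                      migrate_one: Callable[[str], int],
                      workers: int = 4) -> Dict[str, int]:
    """Run migrate_one(path) on a pool of worker threads.

    Returns a mapping from each path to the value migrate_one returned for it.
    """
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        results = list(pool.map(migrate_one, path_list))
    return dict(zip(path_list, results))
```

Because the work is dominated by IO, threads are a reasonable fit here even under Python's global interpreter lock; the worker count of 4 is an arbitrary default.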
Before the migration task starts, the following must be configured in advance: 1) the IP address of the source file storage, whose NFS shared directory must be accessible from the local storage; 2) the Bucket name of the destination object storage.
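The preconfigured values could be captured in a small validated structure. The sketch below is an assumption about how such a configuration might look; the class name, field names, and validation rules are hypothetical and not part of the patent.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationConfig:
    """Hypothetical container for the two preconfigured items plus the export path."""
    source_ip: str     # IP address of the source file storage
    nfs_export: str    # NFS shared directory on the source (assumed absolute path)
    bucket: str        # Bucket name of the destination object storage

    def __post_init__(self) -> None:
        ipaddress.ip_address(self.source_ip)  # raises ValueError if malformed
        if not self.nfs_export.startswith("/"):
            raise ValueError("nfs_export must be an absolute path")
        if not self.bucket:
            raise ValueError("bucket name must not be empty")

    def mount_source(self) -> str:
        """Return the conventional host:/path form used when mounting an NFS export."""
        return f"{self.source_ip}:{self.nfs_export}"
```

For example, `MigrationConfig("192.0.2.10", "/export/data", "migrated").mount_source()` yields `192.0.2.10:/export/data`; the address comes from the documentation range and is purely illustrative.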
After a migration task starts, the NFS shared directory to be migrated is mounted from the source file storage onto the local storage server, based on the configured source file storage IP address. The migration task then obtains a list of all files in the mounted NFS shared directory of the remote NFS device; the file list includes attribute information for each file, such as file name, modification time, and file size.
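Once the export is mounted, building the file list amounts to walking a directory tree and collecting per-file attributes. The following sketch assumes the mount is visible as an ordinary local directory; `FileEntry` and `list_files` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileEntry:
    rel_path: str   # path relative to the mount point, with "/" separators
    mtime: float    # last modification time, seconds since the epoch
    size: int       # file size in bytes


def list_files(root: str) -> List[FileEntry]:
    """Walk the directory tree under root and collect name, mtime, and size."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            entries.append(FileEntry(rel, st.st_mtime, st.st_size))
    # Sort for a deterministic order; the patent does not prescribe one.
    return sorted(entries, key=lambda e: e.rel_path)
```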
This embodiment provides two execution modes for the migration task: a full migration mode and an incremental migration mode. The full migration mode migrates all files in the mounted NFS shared directory to the destination object storage, regardless of whether some or all of them were already migrated by the last migration task. In full migration mode, after Migrationd starts the migration task, the task instructs Proxy through FSAL to mount the remote storage, namely the source NFS directory, and reads the file list generated from the information of all files in the mounted NFS shared directory. It then adds every file in the list to the migration task, which is executed by the migration threads in the migration thread pool. During execution, the task instructs Proxy through FSAL to read and cache the content of each file to be migrated; it then instructs FSAL_RGW through FSAL to read the file data cached by Proxy over a Socket connection and to write the migrated file data into the local object storage in the object storage format.
The incremental migration mode migrates to the destination object storage only the files in the mounted NFS shared directory that have changed since the last migration; the files in the directory are filtered by file modification time against the execution time of the last migration task. In incremental migration mode, a list of all files in the mounted NFS directory is first obtained and filtered by comparing each file's modification time with the execution time of the last migration task, and the files modified after the last migration are added to the migration task. During execution, Proxy incrementally reads the content of the files in the mounted NFS directory of the source file storage that have changed since the last migration; the migration task then instructs FSAL_RGW through FSAL to read the file data cached by Proxy over a Socket connection and to write the migrated file data into the local object storage in the object storage format.
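The difference between the two modes comes down to one filter over the file list. A minimal sketch follows; the names `Entry` and `select_files` are hypothetical, and the fallback to a full migration when no previous run exists is an assumption that the patent does not state.

```python
from typing import Iterable, List, NamedTuple, Optional


class Entry(NamedTuple):
    name: str
    mtime: float   # modification time, seconds since the epoch


def select_files(entries: Iterable[Entry], mode: str,
                 last_run: Optional[float] = None) -> List[str]:
    """Pick the files to add to a migration task.

    mode "full": every listed file.
    mode "incremental": only files modified after the last task executed.
    """
    entries = list(entries)
    if mode == "full":
        return [e.name for e in entries]
    if mode == "incremental":
        if last_run is None:
            # Assumption: with no recorded previous run, migrate everything.
            return [e.name for e in entries]
        return [e.name for e in entries if e.mtime > last_run]
    raise ValueError(f"unknown migration mode: {mode!r}")
```

Comparing modification times is simple but depends on the clocks of the source storage and the migrating host agreeing closely enough; that caveat applies to any mtime-based filter.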
As shown in Fig. 2, the Proxy version used in this example may be v3 or v4. Proxy mounts the NFS directory of the source file storage and reads the source files, which are written in Object form into the corresponding bucket of the destination object storage. If a file is larger than 4M, its content is read 4M at a time; otherwise the content is read at the actual file size.
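The 4M read rule is a standard fixed-size chunked read. The sketch below assumes that "4M" means 4 MiB (4 × 1024 × 1024 bytes), which the source does not spell out; `read_chunks` is a hypothetical helper name.

```python
from typing import BinaryIO, Iterator

CHUNK_SIZE = 4 * 1024 * 1024  # assumption: "4M" is read as 4 MiB


def read_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the stream's content in pieces of at most chunk_size bytes.

    A file no larger than chunk_size comes back as a single piece of its
    actual size; a larger file comes back as full chunks plus a remainder.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk
```

Reading in bounded chunks keeps memory use per migration thread capped at the chunk size regardless of how large the source file is.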
This scheme builds on the capabilities of NFS-Ganesha: it adds a Proxy v3/v4 component to mount NFS data locally and automatically, and writes Object data into an object bucket through the FSAL_RGW interface. The IO path is shortened throughout, and data flows between the two storage types.
Based on the structural example of fig. 2, there is also provided a data migration method from a file storage to an object storage, where the method is applied to a local storage device located on an object storage side, and the method is performed by a plurality of software modules located in NFS-Ganesha in cooperation with each other, where the method includes:
B1. A user customizes a migration task through a management platform or management terminal and issues it to NFS-Ganesha on the local storage, where Migrationd in NFS-Ganesha is responsible for executing the migration task;
B2. After the migration task starts, it controls the execution of each migration step: it calls the local file storage agent module, namely Proxy, through the interface provided by the file system abstraction module, namely FSAL; Proxy mounts the Network File System (NFS) directory of the source file storage, and reads and caches the source file storage data;
B3. Under the control of the migration task, the task calls the object storage gateway module (FSAL_RGW) through the interface provided by FSAL; FSAL_RGW reads the file storage data cached in Proxy through a protocol such as Socket, converts it into object storage format, and writes it into the object storage.
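The hand-off in step B3, where one local component reads cached data from another over a socket, can be illustrated with a connected socket pair inside a single process. This is only a sketch of the idea: the patent does not specify the wire protocol, and the function names and the close-to-signal-end convention are assumptions.

```python
import socket
import threading


def serve_cached(sock: socket.socket, data: bytes) -> None:
    """Proxy side (hypothetical): send the cached bytes, then close to signal the end."""
    with sock:
        sock.sendall(data)


def read_all(sock: socket.socket, bufsize: int = 65536) -> bytes:
    """Gateway side (hypothetical): read until the peer closes the connection."""
    parts = []
    with sock:
        while True:
            part = sock.recv(bufsize)
            if not part:
                break
            parts.append(part)
    return b"".join(parts)


def handoff(data: bytes) -> bytes:
    """Move data from a 'proxy' thread to the calling 'gateway' over a local socket pair."""
    proxy_end, gateway_end = socket.socketpair()
    sender = threading.Thread(target=serve_cached, args=(proxy_end, data))
    sender.start()
    received = read_all(gateway_end)
    sender.join()
    return received
```

Because both ends live on the same host, the data never crosses the network between the read from NFS and the write to the object store, which is the shortening of the IO path that the description emphasizes.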
Fig. 3 is a schematic structural diagram of an electronic device for implementing a data migration method from file storage to object storage according to an embodiment of the present invention, where the device 300 includes: a processor 310, such as a Central Processing Unit (CPU), a communication bus 320, a communication interface 340, and a storage medium 330. Wherein the processor 310 and the storage medium 330 may communicate with each other through a communication bus 320. The storage medium 330 stores a computer program which, when executed by the processor 310, implements the functions of the steps of the method provided by the present invention.
The storage medium may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. In addition, the storage medium may be at least one memory device located remotely from the processor. The Processor may be a general-purpose Processor including a Central Processing Unit (CPU), a Network Processor (NP), etc.; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components.
It should be recognized that embodiments of the present invention can be realized and implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory memory. The method may be implemented in a computer program using standard programming techniques, including a non-transitory storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose. Further, operations of processes described herein may be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described herein (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) collectively executed on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method may be implemented in any type of computing platform operatively connected to a suitable interface, including but not limited to a personal computer, mini computer, mainframe, workstation, networked or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and the like. Aspects of the invention may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optically read and/or write storage medium, RAM, ROM, or the like, such that it may be read by a programmable computer, which when read by the storage medium or device, is operative to configure and operate the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described herein includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques described herein.
The above description is only an example of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A data migration apparatus for file storage to object storage, the apparatus comprising:
the migration task module is used for customizing and executing a migration task, wherein the migration task calls a local file storage agent module through an interface provided by the file system abstraction module to read source file storage data, and calls an object storage gateway module through the interface provided by the file system abstraction module to write the read source file storage data into a local object storage;
the file system abstraction module is used for providing a file system abstraction layer access interface for the upper layer;
the file storage agent module is used for providing the file system abstraction module with an access interface to the source file storage data, mounting the Network File System (NFS) directory of the source file storage under the control of the migration task, and reading and caching the source file storage data;
and the object storage gateway module is used for providing an object storage access interface for the file system abstraction module, reading the file storage data cached in the file storage agent module under the control of the migration task, converting the read file storage data into object storage format, and writing it into the object storage.
2. The apparatus of claim 1, wherein
the migration task module is further used for creating a migration thread pool, the migration task being executed by a migration thread in the migration thread pool;
and the migration task module executes the migration task based on the preconfigured IP address of the source file storage and Bucket name of the destination object storage.
3. The apparatus of claim 2, wherein
when the full migration mode is used, a list of all files in the mounted NFS directory is first obtained, all files in the list are then added to the migration task, and during execution of the migration task the contents of all files in the mounted NFS directory of the source file storage are read in full through the file storage agent module.
4. The apparatus according to claim 3, wherein when the migration task uses an incremental migration mode, a list of all files in the mounted NFS directory is first obtained and filtered according to the modification time of each file in the list and the execution time of the last migration task, the files modified after the last migration are added to the migration task, and during execution of the migration task the contents of the files in the mounted NFS directory of the source file storage that have changed since the last migration are read incrementally through the file storage agent module.
5. A data migration method from file storage to object storage, characterized by comprising the following steps:
customizing and executing a migration task;
under the control of the migration task, the migration task calls a local file storage agent module through an interface provided by a file system abstraction module, and the file storage agent module mounts the Network File System (NFS) directory of the source file storage and reads and caches the source file storage data;
under the control of the migration task, the migration task calls an object storage gateway module through an interface provided by the file system abstraction module, and the object storage gateway module reads the file storage data cached in the file storage agent module, converts it into object storage format, and writes it into the object storage.
6. The method of claim 5, wherein
the method further comprises creating a migration thread pool for the migration task, a migration thread in the migration thread pool executing the migration task; and executing the migration task based on the IP address of the source file storage and the Bucket name of the destination object storage.
7. The method of claim 6, wherein
when the full migration mode is used, a list of all files in the mounted NFS directory is first obtained, all files in the list are then added to the migration task, and during execution of the migration task the contents of all files in the mounted NFS directory of the source file storage are read in full through the file storage agent module.
8. The method according to claim 7, wherein when the migration task uses an incremental migration mode, a list of all files in the mounted NFS directory is first obtained and filtered according to the modification time of each file in the list and the execution time of the last migration task, the files modified after the last migration are added to the migration task, and during execution of the migration task the contents of the files in the mounted NFS directory of the source file storage that have changed since the last migration are read incrementally through the file storage agent module.
9. An electronic device is characterized by comprising a processor, a communication interface, a storage medium and a communication bus, wherein the processor, the communication interface and the storage medium are communicated with each other through the communication bus;
a storage medium for storing a computer program;
a processor for performing the method steps of any of claims 5-8 when executing a computer program stored on a storage medium.
10. A storage medium on which a computer program is stored which, when being executed by a processor, carries out the method steps of any one of claims 5 to 8.
CN202111585376.XA 2021-12-22 2021-12-22 Data migration device, method and storage medium for file storage to object storage Pending CN114416690A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111585376.XA CN114416690A (en) 2021-12-22 2021-12-22 Data migration device, method and storage medium for file storage to object storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111585376.XA CN114416690A (en) 2021-12-22 2021-12-22 Data migration device, method and storage medium for file storage to object storage

Publications (1)

Publication Number Publication Date
CN114416690A true CN114416690A (en) 2022-04-29

Family

ID=81266821

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111585376.XA Pending CN114416690A (en) 2021-12-22 2021-12-22 Data migration device, method and storage medium for file storage to object storage

Country Status (1)

Country Link
CN (1) CN114416690A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116700842A (en) * 2023-08-04 2023-09-05 长扬科技(北京)股份有限公司 Data object reading and writing method and device, computing equipment and storage medium
CN116700842B (en) * 2023-08-04 2023-10-13 长扬科技(北京)股份有限公司 Data object reading and writing method and device, computing equipment and storage medium

Similar Documents

Publication Publication Date Title
US11681441B2 (en) Input/output processing in a distributed storage node with RDMA
US9557928B2 (en) Autonomic reclamation processing on sequential storage media
US9419899B2 (en) Automated service interface optimization
CN103607428A (en) Method of accessing shared memory and apparatus thereof
CN111708738B (en) Method and system for realizing interaction of hadoop file system hdfs and object storage s3 data
US20150112934A1 (en) Parallel scanners for log based replication
CN110851082B (en) Method for storing container butt-jointed optical fiber network
CN112835524A (en) Storage resource allocation method, storage resource controller and scheduling system
CN113032099B (en) Cloud computing node, file management method and device
US11029932B2 (en) Hydration of applications
CN115686932B (en) Backup set file recovery method and device and computer equipment
CN112230857B (en) Hybrid cloud system, hybrid cloud disk application method and data storage method
CN114564339A (en) Disk image file cross-platform migration method and system
CN114416690A (en) Data migration device, method and storage medium for file storage to object storage
CN104516687A (en) Windows remote mapping method for Linux block device
CN112764830B (en) Data migration method and system applied to localization substitution
CN106598502B (en) Data storage method and system
CN113110918A (en) Read-write rate control method and device, node equipment and storage medium
US11297147B2 (en) Managed data export to a remote network from edge devices
WO2018119662A1 (en) Kernel update method and apparatus, and computer device
CN111767169A (en) Data processing method and device, electronic equipment and storage medium
CN111443992A (en) Docker mirror image difference derivation method, computer storage medium and electronic device
CN112948336B (en) Data acceleration method, cache unit, electronic device and storage medium
CN115469807A (en) Disk function configuration method, device, equipment and storage medium
US10891226B1 (en) Virtual address space dump in a computer system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination