CN114138197A - Online cross-pool data migration method and electronic equipment - Google Patents

Online cross-pool data migration method and electronic equipment

Info

Publication number
CN114138197A
Authority
CN
China
Prior art keywords
pool
volume
data migration
master
volumes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111426913.6A
Other languages
Chinese (zh)
Other versions
CN114138197B (en)
Inventor
刘爱贵
董冠军
阮薛平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dadao Yunxing Technology Co ltd
Original Assignee
Beijing Dadao Yunxing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dadao Yunxing Technology Co ltd
Priority to CN202111426913.6A
Publication of CN114138197A
Application granted
Publication of CN114138197B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data migration, and in particular to an online cross-pool data migration method and electronic equipment. By adopting a globally unique volume ID and using reference relationships to manage both the relationship between volumes and storage pools and the export of volumes, the invention achieves online data migration of volumes and keeps the underlying data blocks freely movable. The mapping table is maintained by a distributed cache module, which ensures strong consistency and access performance. Parallel execution by the distributed job module improves the performance and controllability of data migration.

Description

Online cross-pool data migration method and electronic equipment
Technical Field
The invention relates to the technical field of data migration, in particular to an online cross-pool data migration method and electronic equipment.
Background
In a distributed block storage system, a cluster is divided into multiple physically isolated storage pools, each containing disks on several different nodes. All data blocks of a volume are located in one storage pool. As shown in FIG. 1, the storage pool spans several nodes, each with its own disks, and the two copies of each of the data blocks C1, C2 and C3 of a two-copy volume are placed on disks of different nodes. The placement of the copies of a data block must satisfy the fault domain rule: different copies of the same block cannot fall into the same fault domain (a node-level fault domain in this example).
If a cluster has multiple storage pools, it is desirable to map a volume dynamically to different storage pools, for example to migrate data between a high-performance storage pool and a high-capacity storage pool. Data migration has two modes: offline and online. Offline migration requires disconnecting all client connections to a volume and re-establishing them after the migration completes. Online migration moves a volume across pools without affecting client IO.
The prior art has the following defects. First, the volume ID contains storage pool information. Although this simplifies obtaining the storage pool ID, it couples the volume to the storage pool, which makes the relationship hard to modify dynamically and makes online cross-pool data migration difficult.
Second, the tree structure of storage pools and volumes is coupled with volume export, so a Target can only export volumes that belong to the same storage pool, and the export policy cannot be flexibly customized to export volumes from different storage pools.
Disclosure of Invention
To address the defects of the prior art, the invention discloses an online cross-pool data migration method and electronic equipment.
The invention is realized by the following technical scheme:
In a first aspect, the invention provides an online cross-pool data migration method in which a central controller generates monotonically increasing natural numbers as volume IDs, and a tree structure on a global configuration server manages the relationship between storage pools and volumes. When a volume is migrated between storage pools, the volume is unmapped from the source storage pool and remapped into the target storage pool; volumes from different storage pools are exported through the same Target, and the distributed data migration is completed online and asynchronously.
Further, in the method, when a volume is created, an initial pool_id attribute is recorded in the metadata of the volume.
Further, in the method, when the Client calls the lookup(volume_id) interface to query the local Range Controller for the storage pool in which the volume is located, a caching scheme is adopted: the Master is not queried directly; instead the cache on the Range Controller is queried, and the latest value is loaded from the Master only on a cache miss.
Further, in the method, when the volume_id:pool_id mapping is updated through the Master, the Client initiates an RPC call to the Master; after receiving the message, the Master notifies all Range Controllers by a broadcast event that the cache entry is invalid, and then returns an RPC success message to the Client.
Further, in the method, data blocks are allocated on demand during the IO process, and the allocation process depends on pool_id.
Further, in the method, the corresponding storage pool information is specified when a data block is allocated:
allocate:=diskmap(pool_id,REPNUM)
where REPNUM is the number of copies requested. Because the data block ID itself does not carry pool_id information, the volume_id:pool_id mapping is used to determine the storage pool.
Further, in the method, for the export and access of volumes, export management establishes another reference relationship with the volume, and this relationship can be dynamically established and removed.
Further, in the method, the asynchronously completed distributed data migration is executed concurrently as distributed jobs under a master/worker distributed job architecture.
Further, in the method, the master node periodically polls a job queue and initiates jobs on all nodes as required; worker threads on the worker nodes execute the specific tasks, and a QoS policy is applied to the jobs.
In a second aspect, the invention provides an electronic device comprising a processor and a memory storing execution instructions, wherein when the processor executes the execution instructions stored in the memory, the processor performs the online cross-pool data migration method according to the first aspect.
The invention has the following beneficial effects:
By adopting a globally unique volume ID and using reference relationships to manage both the relationship between volumes and storage pools and the export of volumes, the invention achieves online data migration of volumes and keeps the underlying data blocks freely movable without interrupting client IO. The volume_id:pool_id mapping table is maintained by the distributed cache module, which ensures strong consistency and access performance. Parallel execution by the distributed job module improves the performance and controllability of data migration.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a diagram illustrating a distribution of data block copies in a storage pool according to the background art of the present invention;
FIG. 2 is a diagram illustrating an organization of volumes within a cluster, in accordance with an embodiment of the present invention;
FIG. 3 is a diagram of the volume_id to pool_id mapping used by an embodiment of the present invention;
FIG. 4 is a diagram illustrating the updating of the volume_id to pool_id mapping according to an embodiment of the present invention;
FIG. 5 is a diagram of an IO process under the iSCSI protocol of an embodiment of the invention;
FIG. 6 is a diagram of a distributed job processing framework according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating a migration process of data blocks according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
This embodiment provides an online cross-pool data migration method in which a central controller generates monotonically increasing natural numbers as volume IDs, and a tree structure on a global configuration server manages the relationship between storage pools and volumes. When a volume is migrated between storage pools, the mapping of the volume is removed from the source storage pool and the volume is remapped into the target storage pool; volumes from different storage pools are exported by the same Target, and the distributed data migration is completed online and asynchronously.
Referring to FIG. 2, this embodiment provides an organization of volumes in a cluster, where storage pool A includes volume 1 and volume 2, and storage pool B includes volume 3 and volume 4. Volume 1 and volume 3 are exported by Target A, and volume 2 and volume 4 are exported by Target B. Migrating a volume between storage pools does not change the volume itself; the basic process is to unmap the volume from the source storage pool and remap it to the target storage pool. Meanwhile, volumes from different storage pools can be exported by the same Target.
This embodiment adopts a globally unique volume ID that no longer depends on any storage pool information, and uses two reference relationships: one to manage the relationship between storage pools and volumes, and one to manage the export of volumes.
Example 2
At the implementation level, this embodiment provides a specific application of the online cross-pool data migration method.
This embodiment provides a 1:M dynamic mapping relationship between storage pools and volumes. The generation rule for volume IDs is that the central controller generates monotonically increasing natural numbers.
On the global configuration server, the relationship between storage pools and volumes is managed by a tree structure; the key point is that the volume ID must not contain storage pool information. In this way, remapping at the metadata level can be accomplished by a simple rename operation. The operation that remaps a volume to another storage pool is: remap(/pool1/volume1, /pool2/volume1).
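The ID rule and the rename-style remap can be sketched as follows. This is a minimal illustration only: the class names `CentralController` and `ConfigTree`, their methods, and the use of an in-memory dictionary in place of the global configuration server are all assumptions, not details taken from the patent.

```python
import itertools
import threading
from typing import Dict


class CentralController:
    """Hypothetical ID source: hands out monotonically increasing natural numbers."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def next_volume_id(self) -> int:
        with self._lock:
            return next(self._counter)


class ConfigTree:
    """Toy stand-in for the global configuration server: path -> volume ID."""

    def __init__(self) -> None:
        self._entries: Dict[str, int] = {}
        self._lock = threading.Lock()

    def create(self, pool: str, name: str, volume_id: int) -> None:
        with self._lock:
            self._entries[f"/{pool}/{name}"] = volume_id

    def remap(self, src: str, dst: str) -> None:
        """Rename src to dst; the volume ID stored at the entry is untouched."""
        with self._lock:
            if dst in self._entries:
                raise FileExistsError(dst)
            self._entries[dst] = self._entries.pop(src)

    def lookup(self, path: str) -> int:
        return self._entries[path]

    def exists(self, path: str) -> bool:
        return path in self._entries


controller = CentralController()
tree = ConfigTree()
vid = controller.next_volume_id()
tree.create("pool1", "volume1", vid)
tree.remap("/pool1/volume1", "/pool2/volume1")  # same volume ID, new parent pool
```

Because the ID carries no pool information, the rename changes only the tree path; anything that refers to the volume by ID remains valid.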
This embodiment manages the mapping from volume_id to pool_id. When a volume is created, an initial pool_id attribute is recorded in the metadata of the volume. Because the volume_id to pool_id mapping can be adjusted dynamically, each controller must be able to observe an update in real time and obtain the latest value, and the update itself must be atomic. The update process must therefore be handled reliably.
As shown in FIG. 3, the Client calls the interface lookup(volume_id) to query the local Range Controller for the storage pool in which the volume is located. To reduce the load on the Master, a caching scheme is adopted: the Master is not queried directly; instead the cache in the Range Controller is queried, and the latest value is loaded from the Master only on a cache miss.
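The lookup path described here is a read-through cache. The sketch below illustrates that pattern under stated assumptions: `Master`, `RangeController` and `lookup` follow the names in the text, while `get_pool`, the `queries` counter and the in-process method calls (instead of network requests) are hypothetical.

```python
class Master:
    """Authoritative volume_id -> pool_id table (simplified, in-process)."""

    def __init__(self, mapping):
        self._mapping = dict(mapping)
        self.queries = 0  # hypothetical counter: how often the Master is asked

    def get_pool(self, volume_id):
        self.queries += 1
        return self._mapping[volume_id]


class RangeController:
    """Serves lookups from a local cache and loads from the Master on a miss."""

    def __init__(self, master):
        self._master = master
        self._cache = {}

    def lookup(self, volume_id):
        pool_id = self._cache.get(volume_id)
        if pool_id is None:  # cache miss: fall back to the Master
            pool_id = self._master.get_pool(volume_id)
            self._cache[volume_id] = pool_id
        return pool_id


master = Master({1: "pool-A", 2: "pool-B"})
rc = RangeController(master)
first = rc.lookup(1)   # miss: loaded from the Master
second = rc.lookup(1)  # hit: served from the local cache
```

Only the first lookup reaches the Master, which is the load reduction the caching scheme is meant to provide.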
In addition, after the volume_id:pool_id mapping is updated through the Master, the cached entries in the Range Controllers become stale, so the problem of distributed cache consistency must be solved. As shown in FIG. 4, the Client initiates an RPC call to the Master; after receiving the message, the Master notifies all Range Controllers by a broadcast event that the cache entry is invalid, and then returns an RPC success message to the Client.
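The ordering that matters in this update flow is that the Master acknowledges the Client only after every Range Controller has dropped the stale entry. A minimal single-process sketch, assuming synchronous method calls in place of the RPC and the broadcast event; `register`, `invalidate` and `update_mapping` are hypothetical names:

```python
class RangeController:
    def __init__(self, master):
        self._master = master
        self._cache = {}
        master.register(self)

    def lookup(self, volume_id):
        if volume_id not in self._cache:
            self._cache[volume_id] = self._master.get_pool(volume_id)
        return self._cache[volume_id]

    def invalidate(self, volume_id):
        # Drop the stale entry; the next lookup reloads it from the Master.
        self._cache.pop(volume_id, None)


class Master:
    def __init__(self, mapping):
        self._mapping = dict(mapping)
        self._controllers = []

    def register(self, controller):
        self._controllers.append(controller)

    def get_pool(self, volume_id):
        return self._mapping[volume_id]

    def update_mapping(self, volume_id, new_pool_id):
        """RPC handler (assumed shape): update, invalidate everywhere, then acknowledge."""
        self._mapping[volume_id] = new_pool_id
        for controller in self._controllers:  # stands in for the broadcast event
            controller.invalidate(volume_id)
        return "OK"  # success is returned only after all caches are invalidated


master = Master({1: "pool-A"})
rc1, rc2 = RangeController(master), RangeController(master)
before = (rc1.lookup(1), rc2.lookup(1))
reply = master.update_mapping(1, "pool-B")
after = (rc1.lookup(1), rc2.lookup(1))
```

A real deployment would also have to handle a Range Controller that is unreachable during the broadcast; the source does not describe that case, so it is left out here.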
This embodiment provides an IO process. The IO process for a volume tracks copy location information through metadata, so it does not itself need pool_id. However, to support thin provisioning, data blocks are allocated on demand during IO, and the allocation process depends on pool_id. An IO that triggers allocation therefore follows the same pool_id handling as the allocation process.
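The split between the metadata-only IO path and the allocating path can be illustrated with a toy thin-provisioned volume. Everything named below (`ThinVolume`, `lookup_pool`, `allocate_block`, the location strings) is a hypothetical illustration rather than the patent's data layout.

```python
class ThinVolume:
    """Toy thin-provisioned volume: blocks get replica locations on first write."""

    def __init__(self, volume_id, lookup_pool, allocate_block):
        self.volume_id = volume_id
        self._lookup_pool = lookup_pool        # volume_id -> pool_id
        self._allocate_block = allocate_block  # pool_id -> replica locations
        self._block_map = {}                   # block index -> replica locations

    def write(self, index):
        """Return replica locations for a write, allocating on first touch."""
        if index not in self._block_map:
            # Only this branch needs pool_id.
            pool_id = self._lookup_pool(self.volume_id)
            self._block_map[index] = self._allocate_block(pool_id)
        return self._block_map[index]

    def read(self, index):
        """Reads use metadata only; an unallocated block has no locations."""
        return self._block_map.get(index)


lookups = []  # records every pool lookup so the example can count them


def lookup_pool(volume_id):
    lookups.append(volume_id)
    return "pool-A"


def allocate_block(pool_id):
    return [f"{pool_id}/node1/disk0", f"{pool_id}/node2/disk0"]


vol = ThinVolume(1, lookup_pool, allocate_block)
unwritten = vol.read(0)  # nothing allocated yet
first = vol.write(0)     # allocates, so the pool is looked up once
again = vol.write(0)     # already allocated: no pool lookup
```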
The present embodiment provides an allocation procedure of data blocks. When allocating data blocks, corresponding storage pool information needs to be specified:
allocate:=diskmap(pool_id,REPNUM)
where REPNUM is the number of copies requested. Since the data block ID itself does not carry pool_id information, the volume_id:pool_id mapping must be used to determine the storage pool.
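One possible reading of diskmap(pool_id, REPNUM) is a placement function that picks REPNUM disks from the given pool while honoring the node-level fault domain rule from the background section. The sketch below is a guess at that behavior: the `Disk` tuple, the extra `disks` and `mapping` parameters, and the first-fit selection are all assumptions.

```python
from collections import namedtuple

# Hypothetical disk descriptor: which pool and node a disk belongs to.
Disk = namedtuple("Disk", ["pool_id", "node", "name"])


def diskmap(disks, pool_id, repnum):
    """Pick repnum disks from pool_id, at most one per node (fault domain)."""
    chosen, used_nodes = [], set()
    for disk in disks:
        if disk.pool_id == pool_id and disk.node not in used_nodes:
            chosen.append(disk)
            used_nodes.add(disk.node)
            if len(chosen) == repnum:
                return chosen
    raise RuntimeError(f"pool {pool_id!r} cannot hold {repnum} replicas")


def allocate(disks, mapping, volume_id, repnum):
    """Resolve the pool through volume_id:pool_id, then place the replicas."""
    return diskmap(disks, mapping[volume_id], repnum)


DISKS = [
    Disk("pool-A", "node1", "d1"),
    Disk("pool-A", "node1", "d2"),
    Disk("pool-A", "node2", "d1"),
    Disk("pool-B", "node3", "d1"),
    Disk("pool-B", "node4", "d1"),
]
MAPPING = {7: "pool-A"}  # volume_id -> pool_id

replicas = allocate(DISKS, MAPPING, volume_id=7, repnum=2)
```

Here the second disk on node1 is skipped, so the two replicas land on node1 and node2. After a migration updates `MAPPING[7]`, new allocations for the same volume would land in the new pool without any change to the data block IDs.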
This embodiment provides for the export and access of volumes. Client connections are made using the volume ID, which is globally unique and independent of storage pool information, so the movement of underlying data blocks between storage pools does not affect established client connections. Like the reference relationship between storage pools and volumes, export management amounts to establishing another reference relationship with a volume, and this relationship can be dynamically established and removed.
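The independence of the two reference relationships can be shown with two separate tables keyed by volume ID. This is an illustrative sketch using the layout of FIG. 2; `ExportManager` and its methods are hypothetical names.

```python
class ExportManager:
    """Tracks which volume IDs each Target exports (hypothetical helper)."""

    def __init__(self):
        self._exports = {}  # Target name -> set of volume IDs

    def export(self, target, volume_id):
        self._exports.setdefault(target, set()).add(volume_id)

    def unexport(self, target, volume_id):
        self._exports.get(target, set()).discard(volume_id)

    def volumes(self, target):
        return sorted(self._exports.get(target, set()))


# Layout of FIG. 2: pool A holds volumes 1 and 2, pool B holds volumes 3 and 4.
volume_pool = {1: "A", 2: "A", 3: "B", 4: "B"}

exports = ExportManager()
for target, vid in [("Target A", 1), ("Target A", 3),
                    ("Target B", 2), ("Target B", 4)]:
    exports.export(target, vid)

before = exports.volumes("Target A")
volume_pool[1] = "B"  # migrate volume 1 from pool A to pool B
after = exports.volumes("Target A")
```

The pool reference changes while the export reference does not, which is why an established client connection to Target A is unaffected by the migration.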
This embodiment provides an asynchronously completed distributed data migration process. To improve performance, the whole migration is executed concurrently as distributed jobs under a master/worker distributed job framework. The master node periodically polls the job queue and initiates jobs on all nodes as required; worker threads on the worker nodes execute the specific tasks, and a QoS policy can be applied to the jobs.
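A single-process sketch of the master/worker pattern follows, with a thread pool standing in for the worker nodes. The QoS policy is assumed here to be a cap on concurrent workers and on blocks per task; the patent does not say what its QoS policy controls, and `copy_batch` and `master_poll_once` are hypothetical names.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def copy_batch(volume_id, blocks, dst_pool):
    """Worker task: 'copy' one batch of blocks and report what was moved."""
    return [(volume_id, block, dst_pool) for block in blocks]


def master_poll_once(jobs, pool, batch_size):
    """Drain the job queue once and fan each job out as batched tasks."""
    futures = []
    while True:
        try:
            volume_id, blocks, dst_pool = jobs.get_nowait()
        except queue.Empty:
            break
        for start in range(0, len(blocks), batch_size):
            batch = blocks[start:start + batch_size]
            futures.append(pool.submit(copy_batch, volume_id, batch, dst_pool))
    # Collect results in submission order.
    return [moved for future in futures for moved in future.result()]


jobs = queue.Queue()
jobs.put((1, ["C1", "C2", "C3"], "pool-B"))  # migrate volume 1's blocks to pool-B

# Assumed QoS knobs: at most 2 concurrent worker threads, 2 blocks per task.
with ThreadPoolExecutor(max_workers=2) as workers:
    migrated = master_poll_once(jobs, workers, batch_size=2)
```

In a real cluster the master would call a function like `master_poll_once` on a timer and dispatch tasks to remote nodes; both caps could then be tuned to limit the impact of migration on client IO.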
Example 3
This embodiment provides an electronic device comprising a processor and a memory, wherein the memory stores execution instructions, and when the processor executes the execution instructions stored in the memory, the processor performs the online cross-pool data migration method described above.
In summary, the present invention adopts a globally unique volume ID and uses reference relationships to manage both the relationship between volumes and storage pools and the export of volumes, thereby achieving online data migration of volumes and keeping the underlying data blocks freely movable without interrupting client IO. The volume_id:pool_id mapping table is maintained by the distributed cache module, which ensures strong consistency and access performance. Parallel execution by the distributed job module improves the performance and controllability of data migration.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An online cross-pool data migration method, characterized in that a central controller generates monotonically increasing natural numbers; the relationship between storage pools and volumes is managed on a global configuration server through a tree structure; when a volume is migrated between different storage pools, the mapping of the volume is removed from the source storage pool and the volume is remapped into the target storage pool; the volumes from different storage pools are exported through the same Target; and the distributed data migration is completed online and asynchronously.
2. The online cross-pool data migration method according to claim 1, wherein in the method, when a volume is created, an initial pool_id attribute is recorded in the metadata of the volume.
3. The online cross-pool data migration method according to claim 1, wherein in the method, when a Client calls the lookup(volume_id) interface to query the local Range Controller for the storage pool in which the volume is located, a caching scheme is adopted: instead of querying the Master directly, the cache in the Range Controller is queried, and the latest value is loaded from the Master on a cache miss.
4. The online cross-pool data migration method according to claim 1, wherein in the method, when the volume_id:pool_id mapping is updated through the Master, the Client initiates an RPC call to the Master; after receiving the message, the Master notifies all Range Controllers by a broadcast event that the cache entry is invalid, and then returns an RPC success message to the Client.
5. The online cross-pool data migration method according to claim 1, wherein in the method, data blocks are allocated on demand during the IO process, and the allocation process depends on pool_id.
6. The online cross-pool data migration method according to claim 5, wherein the corresponding storage pool information is specified when a data block is allocated:
allocate:=diskmap(pool_id,REPNUM)
where REPNUM is the number of copies requested; because the data block ID itself does not carry pool_id information, the volume_id:pool_id mapping is used.
7. The online cross-pool data migration method according to claim 1, wherein for the export and access of volumes, export management establishes another reference relationship with the volume, and this relationship is dynamically established and removed.
8. The online cross-pool data migration method according to claim 1, wherein in the method, the asynchronously completed distributed data migration process is executed concurrently as distributed jobs, and a master/worker distributed job architecture is adopted.
9. The online cross-pool data migration method according to claim 8, wherein in the method, a master node periodically polls a job queue and initiates jobs on all nodes as required, worker threads on the worker nodes execute the specific tasks, and a QoS policy is applied to the jobs.
10. An electronic device comprising a processor and a memory storing execution instructions, the processor executing the online cross-pool data migration method of any of claims 1-9 when the processor executes the execution instructions stored by the memory.
CN202111426913.6A 2021-11-28 2021-11-28 Online cross-pool data migration method and electronic equipment Active CN114138197B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111426913.6A CN114138197B (en) 2021-11-28 2021-11-28 Online cross-pool data migration method and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111426913.6A CN114138197B (en) 2021-11-28 2021-11-28 Online cross-pool data migration method and electronic equipment

Publications (2)

Publication Number Publication Date
CN114138197A (en) 2022-03-04
CN114138197B (en) 2022-10-18

Family

ID=80388361

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111426913.6A Active CN114138197B (en) 2021-11-28 2021-11-28 Online cross-pool data migration method and electronic equipment

Country Status (1)

Country Link
CN (1) CN114138197B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117389713A (en) * 2023-12-13 2024-01-12 苏州元脑智能科技有限公司 Storage system application service data migration method, device, equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105573679A (en) * 2015-12-18 2016-05-11 国云科技股份有限公司 Method suitable for storage pool resource mapping rule of distributed storage system
CN106375463A (en) * 2016-09-14 2017-02-01 郑州云海信息技术有限公司 Data migration method and device based on storage virtualization
CN111984370A (en) * 2020-07-30 2020-11-24 苏州浪潮智能科技有限公司 Method and device for online migration of multi-disk virtual machine to different storage pools

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117389713A (en) * 2023-12-13 2024-01-12 苏州元脑智能科技有限公司 Storage system application service data migration method, device, equipment and medium
CN117389713B (en) * 2023-12-13 2024-02-23 苏州元脑智能科技有限公司 Storage system application service data migration method, device, equipment and medium

Also Published As

Publication number Publication date
CN114138197B (en) 2022-10-18

Similar Documents

Publication Publication Date Title
US10768820B2 (en) On-demand storage provisioning using distributed and virtual namespace management
US11113158B2 (en) Rolling back kubernetes applications
US10579364B2 (en) Upgrading bundled applications in a distributed computing system
US20200310915A1 (en) Orchestration of Heterogeneous Multi-Role Applications
US11847098B2 (en) Metadata control in a load-balanced distributed storage system
US11392363B2 (en) Implementing application entrypoints with containers of a bundled application
US20200042454A1 (en) System and method for facilitating cluster-level cache and memory space
US11347684B2 (en) Rolling back KUBERNETES applications including custom resources
US10628235B2 (en) Accessing log files of a distributed computing system using a simulated file system
US11262912B2 (en) File operations in a distributed storage system
US11205244B2 (en) Resiliency schemes for distributed storage systems
CN105095094A (en) Memory management method and equipment
US20230273859A1 (en) Storage system spanning multiple failure domains
US12038871B2 (en) Data migration in a distributive file system
CN111651286A (en) Data communication method, device, computing equipment and storage medium
CN114138197B (en) Online cross-pool data migration method and electronic equipment
US11093444B2 (en) Access redirection in a distributive file system
US20190215281A1 (en) Fenced Clone Applications
US20220318042A1 (en) Distributed memory block device storage
CN114518962A (en) Memory management method and device
US12013787B2 (en) Dual personality memory for autonomous multi-tenant cloud environment
CN117874101A (en) Elastic distributed data storage system, method and device and electronic equipment
CN118796766A (en) File mapping method and device suitable for microkernel operating system and electronic equipment
CN113031852A (en) Data processing method and device, electronic equipment and storage medium
JP2000347913A (en) Distributed data management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: An online cross pool data migration method and electronic device

Effective date of registration: 20230906

Granted publication date: 20221018

Pledgee: Zhongguancun Branch of Bank of Beijing Co.,Ltd.

Pledgor: BEIJING DADAO YUNXING TECHNOLOGY Co.,Ltd.

Registration number: Y2023980055521