WO2018113533A1 - Metadata migration method, apparatus, system, and device - Google Patents

Metadata migration method, apparatus, system, and device

Info

Publication number
WO2018113533A1
Authority
WO
WIPO (PCT)
Prior art keywords
migration
migrated
metadata
migration task
task
Prior art date
Application number
PCT/CN2017/115190
Inventor
吕鹏程
姚文辉
刘俊峰
黄硕
朱家稷
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2018113533A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • G06F16/1824Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/11File system administration, e.g. details of archiving or snapshots
    • G06F16/119Details of migration of file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5015Service provider selection

Definitions

  • the present application relates to the field of computer technologies, and in particular, to a metadata migration method, apparatus, system, and device.
  • In a distributed storage system, metadata servers manage the file system namespace and file attributes and provide file access permissions and file storage locations, while data storage servers store the files.
  • Clients issue various read and write requests for file data.
  • In most distributed storage systems, the metadata servers provide services in multiple groups; each group of servers constitutes a metadata server group (i.e., a volume).
  • A distributed storage system usually configures multiple metadata server groups so that data pressure is spread across them.
  • Metadata is continuously stored in each metadata server group. As the running time increases, the metadata stored in the groups gradually becomes unbalanced. To balance the load of the metadata server groups, metadata must be migrated between them.
  • Metadata migration between metadata server groups is usually performed as follows: the source directory is split on the metadata server group to obtain the entries to be migrated, migration tasks are generated according to the number of files included in each entry to be migrated, and the migration tasks are then distributed to the execution servers for metadata migration.
  • The inventors found that, because migration tasks are generated based on the number of files included in the entries to be migrated and file granularity is uneven, the amount of data contained in the generated migration tasks is not uniform. As a result, some execution servers are assigned migration tasks with a large amount of data and need a long time to complete them, which lowers the overall efficiency of metadata migration.
  • The purpose of the embodiments of the present application is to provide a metadata migration method, apparatus, system, and device, so that the amount of data included in the generated migration tasks is relatively balanced and the overall efficiency of metadata migration is improved.
  • An embodiment of the present application provides a metadata migration method, where the method includes:
  • An embodiment of the present application provides a metadata migration apparatus, where the apparatus includes:
  • a to-be-migrated entry determining module, configured to determine the entries to be migrated in the directory to be migrated;
  • a migration task generating module, configured to generate migration tasks according to the number of data blocks corresponding to the entries to be migrated, and to divide the entries to be migrated among the migration tasks;
  • a migration task allocation module, configured to allocate the migration tasks to execution servers to perform metadata migration on the entries to be migrated.
  • the embodiment of the present application provides a metadata migration system, including the metadata migration device, the execution server, and the plurality of metadata server groups provided by the foregoing embodiments, where:
  • the execution server is configured to receive a migration task assigned by the metadata migration device, and perform data migration between the metadata server groups according to the migration task.
  • the embodiment of the present application provides a metadata migration device, where the metadata migration device includes:
  • a memory arranged to store computer executable instructions that, when executed, cause the processor to:
  • The embodiments of the present application determine the entries to be migrated in the directory to be migrated, generate migration tasks according to the number of data blocks corresponding to the entries to be migrated, divide the entries to be migrated among the migration tasks, and assign the migration tasks to execution servers to perform metadata migration on the entries to be migrated. Because the migration tasks are generated from the number of data blocks corresponding to the entries to be migrated, the amount of data contained in each migration task is relatively balanced. This avoids assigning an execution server a migration task with a large amount of data, shortens the time the execution servers need to complete their migration tasks, and thereby improves the overall efficiency of metadata migration.
  • FIG. 2 is a schematic diagram of a directory tree where a directory to be migrated is located in the present application
  • FIG. 3 is a schematic diagram of migration of a migration task according to the present application.
  • FIG. 8 is a schematic diagram of a metadata migration device according to the present application.
  • The embodiments of the present application provide a metadata migration method, apparatus, system, and device.
  • The execution entity of the method may be a primary server (also referred to as the Master Server) that controls metadata migration. The primary server controls multiple execution servers (also referred to as Work Servers) to perform metadata migration between multiple metadata server groups, so as to balance the load of each metadata server group.
  • the method may specifically include the following steps:
  • Step S101: Determine the entries to be migrated in the directory to be migrated.
  • The directory to be migrated may be a directory that requires metadata migration, that is, the directory in which the metadata to be migrated is located.
  • Metadata is information that describes data attributes, and is commonly used to support functions such as indicating the storage location of data, searching historical data or resources, and recording files.
  • The entries to be migrated may include directories to be migrated and/or files to be migrated. A file to be migrated may be a file of a certain format, such as a text file in txt format or a journal file in jnt format.
  • the entire file system is divided into multiple parts according to actual needs, and the partitioning process can generate a corresponding directory tree, and each directory tree can be allocated to a different metadata server group.
  • The primary server can detect the access status, data output, and similar indicators of each metadata server, either periodically (for example, every 12 hours or every 24 hours) or in real time, and can analyze the corresponding metadata server group from what it detects to determine whether that group is overloaded. When a metadata server group is overloaded, some subdirectories in the directory tree of that group (that is, the directories to be migrated) need to be reassigned to other metadata server groups. Reassigning these subdirectories to other metadata server groups requires metadata migration for them.
  • The primary server can traverse the directory to be migrated to obtain the entries to be migrated in it. For example, if the directory tree consists of A/B/C, A/B/D, and A/E, and the directory to be migrated is B, then traversing directory B yields directory C and the file c.txt in it, and directory D and the file d.txt in it. The entries to be migrated therefore include the directories to be migrated (directory C and directory D) and the files to be migrated (c.txt and d.txt).
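The traversal in this example can be sketched as follows. This is only an illustration under stated assumptions: the nested-dictionary representation of a directory and the function name `collect_entries` are hypothetical and do not come from the application.

```python
def collect_entries(directory, path):
    """Return (kind, path) pairs for every entry below `directory`.

    Assumed representation (hypothetical): a directory is a dict mapping
    child names to either another dict (subdirectory) or None (file).
    """
    entries = []
    for name, child in directory.items():
        child_path = f"{path}/{name}"
        if isinstance(child, dict):
            entries.append(("dir", child_path))
            # Descend so that files inside the subdirectory are found too.
            entries.extend(collect_entries(child, child_path))
        else:
            entries.append(("file", child_path))
    return entries


# Directory B from the example: it holds C (with c.txt) and D (with d.txt).
directory_b = {"C": {"c.txt": None}, "D": {"d.txt": None}}
entries_to_migrate = collect_entries(directory_b, "B")
```

Here `entries_to_migrate` contains the two directories and the two files named in the example.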
  • Step S102: Generate migration tasks according to the number of data blocks corresponding to the entries to be migrated, and divide the entries to be migrated among the migration tasks.
  • A data block may be one or more groups of sequentially arranged records for storing data, and may be the unit of data transferred between a storage device and an input/output device; that is, data is transmitted one or more data blocks at a time.
  • the amount of data that can be stored in the data block can be determined according to actual conditions. For example, the amount of data that can be stored in the data block can be 32 MB or 64 MB.
  • the migration task can be an instruction task that is used to instruct the execution server to perform metadata migration.
  • The amount of data that each data block can store can be set in advance. The number of data blocks corresponding to an entry to be migrated can then be determined from the amount of data corresponding to that entry and the amount of data that each data block can store.
  • Continuing the example of step S101, if the amount of data corresponding to the files to be migrated c.txt and d.txt is 100 MB and each data block can store 32 MB, then 100/32 gives 3.125, which is rounded up, so the number of data blocks corresponding to the entries to be migrated is determined to be 4.
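The arithmetic in this example is a ceiling division. A minimal sketch, in which the function name `blocks_needed` is hypothetical:

```python
def blocks_needed(data_mb, block_mb):
    """Number of data blocks needed to hold data_mb megabytes (rounded up)."""
    # -(-a // b) is integer ceiling division and avoids floating-point error.
    return -(-data_mb // block_mb)


example_blocks = blocks_needed(100, 32)  # 100 / 32 = 3.125, rounded up
```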
  • the maximum number of data blocks that can be accommodated can be set in advance for the migration task. Then, the number of migration tasks that need to be generated can be calculated according to the number of data blocks corresponding to the entry to be migrated. In addition to the above methods, the number of migration tasks can be freely set according to actual conditions. After the migration task is generated, the items to be migrated can be divided into each migration task in a relatively balanced manner.
  • For example, suppose the entries to be migrated are A, B, C, D, E, and F, with 10, 20, 30, 10, 10, and 50 corresponding data blocks respectively, and the maximum number of data blocks that a migration task can accommodate is 70. The entries to be migrated correspond to 130 data blocks in total, so two migration tasks are generated: one holding 70 data blocks (Migration Task 1) and the other holding 60 data blocks (Migration Task 2). The entries to be migrated are then assigned to these two migration tasks.
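The task count and task sizes in this example can be reproduced with a short calculation. The sketch below assumes that tasks are filled to the maximum first and the remainder goes to the last task; the function name `plan_task_sizes` is hypothetical.

```python
def plan_task_sizes(block_counts, max_blocks):
    """Return the number of data blocks planned for each migration task."""
    total = sum(block_counts.values())
    n_tasks = -(-total // max_blocks)  # ceiling division
    sizes = []
    remaining = total
    for _ in range(n_tasks):
        size = min(max_blocks, remaining)  # fill to the maximum, then remainder
        sizes.append(size)
        remaining -= size
    return sizes


block_counts = {"A": 10, "B": 20, "C": 30, "D": 10, "E": 10, "F": 50}
sizes = plan_task_sizes(block_counts, max_blocks=70)
```

With these inputs the plan is two tasks of 70 and 60 data blocks, matching Migration Task 1 and Migration Task 2.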
  • Step S103: Assign the migration tasks to execution servers to perform metadata migration on the entries to be migrated.
  • An execution server may be a server that controls the metadata server groups to perform a migration task. It may be a single server or a server cluster (or server group) composed of multiple servers.
  • The primary server may obtain data such as the current remaining bandwidth and/or resource utilization of each execution server, may evaluate the metadata migration capability of each execution server based on that data, and may allocate appropriate migration tasks to each execution server according to its metadata migration capability.
  • Each migration task can then be delivered to the corresponding execution server.
  • The execution server may analyze the received migration tasks and perform them one by one, migrating the metadata corresponding to the entries to be migrated from the metadata server group that contains them to the target metadata server group. The target metadata server group can be a relatively lightly loaded metadata server group.
  • An embodiment of the present application provides a metadata migration method that determines the entries to be migrated in the directory to be migrated, generates migration tasks according to the number of data blocks corresponding to the entries to be migrated, divides the entries to be migrated among the migration tasks, and finally assigns the migration tasks to execution servers to perform metadata migration on the entries to be migrated. Because the migration tasks are generated from the number of data blocks corresponding to the entries to be migrated, the amount of data contained in each migration task is relatively balanced. This avoids assigning an execution server a migration task with a large amount of data, shortens the time the execution servers need to complete their migration tasks, and thereby improves the overall efficiency of metadata migration.
  • the processing manner of the S101 in the foregoing Embodiment 1 can be various.
  • the specific processing manner is also provided below, and the following steps S201 to S203 may be included.
  • Step S201: Traverse the directory tree in which the directory to be migrated is located, starting from its root directory, and record the entry identifier of the currently traversed entry during the traversal.
  • An entry may be a directory or a file. A directory may be the root directory or a subdirectory, and the root directory may be the top-level directory of the file storage. For example, in ac/file1/B.txt, ac/ may be the root directory, file1/ may be a subdirectory, and B.txt may be a file.
  • the directory tree can be made up of a root directory and subdirectories.
  • The primary server can periodically detect the access or usage of the metadata stored in each metadata server group, and from what it detects can determine which metadata server group needs metadata migration and which metadata server group is the migration target. The primary server can then determine the directory to be migrated according to the amount of data to be migrated or according to actual conditions, determine the directory tree in which the directory to be migrated is located, and construct the initial traversal stack based on that directory tree. The directories in the initial traversal stack can be numbered, and the numbers used as the entry identifiers of the directories. The primary server can then traverse from the root directory of the directory tree, determine the files contained in each directory, and record the entry identifier of each entry as it is traversed.
  • As shown in FIG. 2, the directory tree in which the entry to be migrated is located includes a root directory "/" and subdirectories "A/", "B/", "C/", and so on, and the entry to be migrated may be "B/". The root directory and each subdirectory have corresponding entry identifiers; for example, the entry identifier of the root directory in FIG. 2 is 14, and the entry identifier of the subdirectory "A/" is 6.
  • The primary server can record the traversed entry identifiers to form traversal stacks, such as 14-6-1, 14-9-8, and the like.
  • the load status of the metadata server group can be detected by the primary server.
  • Alternatively, a management server may be set up in each metadata server group. The management server may detect the access or usage of the metadata stored in its metadata server group and report it to the primary server, so that the primary server can adjust the load between the metadata server groups in a timely manner.
  • During the traversal, the primary server may encounter a failure such as downtime or a system crash. In this case, the primary server needs to be restarted, and the traversal needs to be continued after the restart. For the corresponding processing, refer to steps S202 and S203 below.
  • Step S202: If downtime occurs locally during the traversal, obtain the last recorded entry identifier when the downtime is recovered, and reconstruct the traversal stack from it.
  • Before the restart, the entry identifiers of the entries whose traversal has been completed may be stored, and the restart operation may then be performed.
  • After the restart, the directory tree in which the directory to be migrated is located may be obtained, and the last recorded entry identifier may be obtained from the previously recorded and stored entry identifiers. The traversal stack that contains this entry identifier can then be found, and the traversal stack can be reconstructed based on it.
  • Step S203: Traverse the directory tree according to the reconstructed traversal stack, and determine the entries to be migrated included in the directory to be migrated.
  • The reconstructed traversal stack usually includes the root directory and the upper-level subdirectory of the subdirectory corresponding to the last recorded entry identifier. Therefore, after obtaining the reconstructed traversal stack, the primary server only needs to continue traversing downward from the subdirectory corresponding to the last recorded entry identifier. By traversing the directory tree it obtains the files included in each directory, and can thus determine the entries to be migrated included in the directory to be migrated.
  • For example, if the last recorded entry identifier is 8, the previously recorded and stored traversal stack 14-9-8 can be found according to the entry identifier 8, and the traversal stack can be rebuilt based on 14-9-8. The primary server can then continue traversing downward from 14-9-8 without traversing the directory tree in which the directory to be migrated is located again from the root directory, which saves traversal time.
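The recovery described in steps S202 and S203 can be sketched as a preorder traversal that resumes from a stored root-to-entry path. This is an illustration only: the tree shape, the identifiers 2, 3, and 7, and the function name `walk` are hypothetical (FIG. 2 is not reproduced here); only the identifiers 14, 6, 9, 8, and 1 appear in the text.

```python
def walk(children, path, resume=None):
    """Yield root-to-entry paths (traversal stacks) in preorder.

    children: dict mapping an entry identifier to its child identifiers.
    resume:   last recorded traversal stack, or None for a fresh traversal.
    Entries on the resume path were already recorded, so they are skipped.
    """
    node = path[-1]
    kids = children.get(node, [])
    if resume is None:
        yield tuple(path)  # record the entry currently being traversed
    elif len(resume) > len(path):
        nxt = resume[len(path)]
        # Follow the stored stack downward, then skip already-visited siblings.
        yield from walk(children, path + [nxt], resume)
        kids = kids[kids.index(nxt) + 1:]
    # Otherwise node is the last recorded entry: visit all of its children.
    for kid in kids:
        yield from walk(children, path + [kid], None)


children = {14: [6, 9], 6: [1, 2], 9: [8, 7], 8: [3]}
full = list(walk(children, [14]))
resumed = list(walk(children, [14], resume=(14, 9, 8)))
```

Resuming from the stack 14-9-8 yields only the entries that a full traversal would visit after 8, so the part of the tree under 6 is not traversed again.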
  • the processing manner of S102 in the foregoing Embodiment 1 can be various.
  • the following further provides a specific processing manner, which may include the following steps S204 and S205.
  • Step S204: Generate migration tasks according to the number of data blocks corresponding to each entry to be migrated, and determine the data block number threshold that each migration task can accommodate.
  • The data block number threshold may be set according to actual conditions; for example, it may be 10 or 30.
  • For the processing of step S204, refer to the related content of step S102 in Embodiment 1, which is not repeated here.
  • Step S205: According to a preset allocation policy, allocate entries to be migrated to each migration task within the data block number threshold that the migration task can accommodate, until all entries to be migrated have been allocated.
  • The preset allocation policy may be set according to the actual situation.
  • The embodiment of the present application provides an optional preset allocation policy: for the current migration task, if the unallocated entries to be migrated include an entry whose number of data blocks is smaller than the number of free data blocks of the current migration task, that entry is allocated to the current migration task.
  • Through the processing of step S204, the primary server can calculate the number of data blocks corresponding to each entry to be migrated and compare it with the data block number threshold that the migration task can accommodate. If the number of data blocks corresponding to the entry to be migrated is smaller than the data block number threshold, the entry may be assigned to the migration task.
  • The primary server may then calculate the difference between the data block number threshold and the number of data blocks corresponding to the entries already allocated to the migration task, and use this difference as the number of free data blocks of the migration task. The number of data blocks corresponding to the next entry to be migrated is compared with the number of free data blocks; if it is smaller, the entry may be assigned to the migration task.
  • Each entry to be migrated is allocated to a corresponding migration task in turn in the above manner.
  • The amount of data corresponding to the entries allocated to each migration task can be counted in real time, and the amount of data in each migration task can be adjusted in real time so that the amounts of data in the migration tasks tend to be equal, thereby balancing the load between the metadata server groups.
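The free-data-block policy of step S205 can be sketched as a first-fit allocation. Two points are assumptions rather than statements of the application: an entry that exactly fills the remaining free blocks is accepted (the text says "smaller than"), and an entry larger than the threshold is given a task of its own so that the loop always terminates. The function name `allocate` is hypothetical.

```python
def allocate(block_counts, threshold):
    """Divide entries among migration tasks using a first-fit policy.

    block_counts: dict mapping entry name to its number of data blocks.
    threshold:    data block number threshold of one migration task.
    """
    pending = list(block_counts.items())
    tasks = []
    while pending:
        task, free, rest = [], threshold, []
        for name, blocks in pending:
            oversized = not task and blocks > threshold  # assumption: own task
            if blocks <= free or oversized:
                task.append(name)
                free -= blocks  # free data blocks left in the current task
            else:
                rest.append(name_blocks := (name, blocks))
        tasks.append(task)
        pending = rest
    return tasks


block_counts = {"A": 10, "B": 20, "C": 30, "D": 10, "E": 10, "F": 50}
tasks = allocate(block_counts, threshold=70)
```

For the entries A to F used earlier and a threshold of 70, this produces one task with 70 data blocks and one with 60.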
  • The migration tasks for the entries to be migrated may also be generated quickly in the following manner.
  • The number of entries contained in each subdirectory of the above directory tree is counted as the subdirectory itself (counted as 1) plus the number of subdirectories it includes plus the number of files it includes.
  • If the number of entries of a target subdirectory is smaller than a predetermined number threshold (such as 500 or 800), the entry corresponding to the target subdirectory, the entries corresponding to the subdirectories it contains, and the entries corresponding to the files it contains can all be assigned to the same migration task, so that subdirectories with a small number of entries are traversed and assigned to migration tasks quickly.
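The entry count used for this shortcut can be sketched as below. The nested-dictionary representation, the recursive counting of nested subdirectories, and the names `entry_count` and `fits_in_one_task` are assumptions made for illustration.

```python
def entry_count(subdir):
    """Count the subdirectory itself plus every subdirectory and file below it.

    Assumed representation (hypothetical): a dict maps child names to a dict
    (subdirectory) or None (file).
    """
    count = 1  # the subdirectory itself
    for child in subdir.values():
        count += entry_count(child) if isinstance(child, dict) else 1
    return count


def fits_in_one_task(subdir, number_threshold=500):
    """True if the whole subdirectory may go into a single migration task."""
    return entry_count(subdir) < number_threshold


directory_b = {"C": {"c.txt": None}, "D": {"d.txt": None}}
b_count = entry_count(directory_b)
```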
  • Step S206: Allocate the migration tasks to the execution servers by delivering them one by one, such that the number of migration tasks currently owned by each execution server is smaller than a preset value.
  • The preset value can be set according to actual conditions, such as 3 or 4.
  • While an execution server performs one migration task, the other migration tasks assigned to it wait to be executed. To prevent too many migration tasks from waiting in an execution server, the maximum number of migration tasks that each execution server can own (that is, the preset value) may be set.
  • The primary server may allocate the generated migration tasks to the corresponding execution servers by delivering them one by one. For details, refer to the related content of step S103 in Embodiment 1, which is not repeated here.
  • The primary server can also detect the number of migration tasks in each execution server periodically or in real time. When the number of migration tasks owned by an execution server is smaller than a predetermined task threshold, the primary server can deliver a migration task to that execution server.
  • For example, suppose the task threshold for the number of migration tasks owned by an execution server is 2. When the primary server detects that the number of migration tasks in an execution server is less than 2, it can deliver a migration task to that execution server. This ensures that each execution server owns at most two migration tasks. When an execution server has fewer than two migration tasks, the primary server can deliver a new migration task as soon as possible so that the execution server is not idle. Migration tasks are thus delivered, as far as possible, to the execution servers with relatively low task processing pressure, which reduces the number of migration tasks waiting to be executed.
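The one-by-one delivery under a task threshold can be sketched as follows. Choosing the execution server with the fewest owned tasks is an assumption about how "relatively low processing pressure" is measured; the function name `dispatch` and the server names are hypothetical.

```python
from collections import deque


def dispatch(pending, owned, task_threshold=2):
    """Deliver pending migration tasks one by one.

    pending: deque of task identifiers waiting on the primary server.
    owned:   dict mapping each execution server to the list of tasks it owns.
    Returns the (task, server) deliveries that were made.
    """
    delivered = []
    while pending:
        # Assumption: the least-loaded server is the one owning fewest tasks.
        server = min(owned, key=lambda s: len(owned[s]))
        if len(owned[server]) >= task_threshold:
            break  # every execution server already owns the maximum
        task = pending.popleft()
        owned[server].append(task)
        delivered.append((task, server))
    return delivered


pending = deque(["t1", "t2", "t3", "t4", "t5"])
owned = {"w1": ["t0"], "w2": []}
delivered = dispatch(pending, owned, task_threshold=2)
```

Tasks that cannot be delivered stay in `pending` until a later call, for example after an execution server reports a completed task.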
  • Failures of metadata server groups are inevitable. During the execution of a migration task, an execution server may therefore be unable to execute the task quickly and smoothly because a metadata server group is slow or even unable to provide service.
  • For this reason, the execution status of the migration tasks in each execution server can be monitored, and the migration task currently being performed can be handled according to its state. For details, refer to steps S207 and S208 below.
  • Step S207: Receive the migration task status information fed back by the execution server.
  • The migration task status information may include one or more parameters, such as the data migration speed of the migration task, the data volume (or number of data blocks) included in the migration task, the migration time already consumed, and/or the remaining migration time.
  • a feedback period can be set in the execution server, such as 30 seconds or 1 minute.
  • the execution server may obtain information about the current migration task, and may generate migration task status information based on the related information and send the information to the primary server, where the primary server may receive the migration task status information sent by the execution server.
  • the processing manner for obtaining the migration task status information may also adopt other processing manners, for example, the manner in which the primary server actively pulls.
  • Step S208: Determine, according to the migration task status information, whether the execution server has a blocked migration task. If the execution server has a blocked migration task, control the execution server to perform accelerated migration processing.
  • The primary server may analyze the migration task status information to obtain target parameter data of the migration task that can be used to identify whether the execution server is blocked. For example, if the migration task status information includes the data migration speed of the migration task and the amount of data (or number of data blocks) included in the migration task, the primary server can calculate the length of time required for the migration task to complete, and this duration can be used as the target parameter data of the migration task of the execution server.
  • A target parameter threshold of the migration task may be preset. If the target parameter data obtained above is greater than the target parameter threshold, it may be determined that the execution server has a blocked migration task. In this case, the primary server may send an accelerated migration instruction to the execution server. After receiving the instruction, the execution server can perform accelerated migration processing on the migration task it is currently executing, so that the migration of the entries to be migrated in that task completes quickly. If the target parameter data obtained above is smaller than the target parameter threshold, it may be determined that the execution server has no blocked migration task. In this case, the execution server may continue to perform the migration task until it is completed.
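The check described above can be sketched with the estimated completion time as the target parameter data. The status field names `data_blocks` and `blocks_per_second`, the function names, and the 600-second threshold are hypothetical; the application does not define a message format.

```python
def estimated_duration(data_blocks, blocks_per_second):
    """Estimated time in seconds for a migration task to complete."""
    if blocks_per_second <= 0:
        return float("inf")  # no progress at all counts as never finishing
    return data_blocks / blocks_per_second


def is_blocked(status, target_threshold_s):
    """True if the estimated duration exceeds the target parameter threshold."""
    duration = estimated_duration(status["data_blocks"],
                                  status["blocks_per_second"])
    return duration > target_threshold_s


slow = {"data_blocks": 60, "blocks_per_second": 0.05}  # about 1200 s
fast = {"data_blocks": 60, "blocks_per_second": 0.5}   # about 120 s
```

With a threshold of 600 seconds, `is_blocked(slow, 600)` holds and `is_blocked(fast, 600)` does not, so only the first task would trigger an accelerated migration instruction.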
  • Besides the manner described above, step S208 may be processed in several other ways. One optional manner is provided below and may include the following steps 1 to 3.
  • Step 1: According to the migration task status information of the execution server, obtain the migration time that the execution server has already spent on metadata migration of the current entry to be migrated.
  • When migration task status information is received, the task identifier of the ongoing migration task and the entry identifier of the entry being migrated may be extracted from it, and the current time recorded.
  • When migration task status information is received again, the task identifier of the ongoing migration task and the entry identifier of the entry being migrated may be extracted again. If the task identifier and the entry identifier are the same both times, the migration time already spent on that entry can be calculated.
  • Step 2 If the migration time exceeds a preset threshold, determine that the execution server has a blocked migration task.
  • the preset threshold may be set according to actual conditions, and may be 10 minutes or 20 minutes.
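Steps 1 and 2 amount to comparing consecutive status reports from the same execution server. A minimal sketch follows, assuming a hypothetical `StatusReport` record and a caller-supplied clock value `now` in seconds; neither name appears in the application.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class StatusReport:
    """Hypothetical shape of the migration task status information."""
    server: str
    task_id: str
    entry_id: str


class BlockDetector:
    def __init__(self, threshold_s: float = 600.0) -> None:
        self.threshold_s = threshold_s  # e.g. 10 minutes
        # server -> (task_id, entry_id, time this pair was first seen)
        self._first_seen: Dict[str, Tuple[str, str, float]] = {}

    def observe(self, report: StatusReport, now: float) -> bool:
        """Return True when the same entry has been migrating longer than the threshold."""
        key = (report.task_id, report.entry_id)
        previous = self._first_seen.get(report.server)
        if previous is None or previous[:2] != key:
            # New task or new entry: record the time and start counting from here.
            self._first_seen[report.server] = (*key, now)
            return False
        return now - previous[2] > self.threshold_s
```

A report with a different task or entry identifier resets the timer, so only an entry that stays unchanged across reports can be flagged.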
  • The preset threshold may also be determined according to the ratio between the data volume of the entry to be migrated and that of the other entries to be migrated.
  • This may specifically include the following: if the entries to be migrated include a first entry to be migrated whose number of data blocks is greater than a predetermined number, the preset threshold is adjusted according to the number of data blocks corresponding to the first entry to be migrated and the predetermined number, to obtain the preset threshold corresponding to the first entry to be migrated.
  • For example, if one entry to be migrated (the first entry to be migrated) corresponds to 50 data blocks, the other entries to be migrated each correspond to 10 data blocks, and the preset threshold of the other entries is 10 minutes, then the preset threshold of the first entry to be migrated may be 50 minutes or the like.
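The 50-minute figure is consistent with scaling the threshold linearly by the block-count ratio. The sketch below assumes that linear rule; the application only says the threshold is adjusted "according to" the two numbers, and the function and parameter names are illustrative.

```python
def adjusted_threshold_min(
    blocks: int,
    predetermined_blocks: int = 10,
    base_threshold_min: float = 10.0,
) -> float:
    """Scale the blocking threshold for entries larger than the predetermined size."""
    if blocks <= predetermined_blocks:
        return base_threshold_min
    # Assumed rule: threshold grows in proportion to the block count.
    return base_threshold_min * blocks / predetermined_blocks
```

With the defaults above, an entry of 50 data blocks gets a threshold five times the base value.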
  • The entries to be migrated may have replicas on different data storage servers.
  • When metadata migration is performed on an entry to be migrated, the metadata of all its replicas must be migrated successfully. If the data storage server holding one replica is slow or unable to serve, the migration task becomes blocked and the efficiency of the overall migration drops.
  • This embodiment therefore accelerates the blocked migration task by switching the execution server to the fast migration mode (that is, Fast Mode). Details are as follows:
  • Step 3: Send an accelerated migration instruction to the execution server that has the blocked migration task, so that the execution server switches to the fast migration mode to migrate the metadata corresponding to the entry currently being migrated.
  • After receiving the accelerated migration instruction sent by the primary server, the execution server switches its working mode to the fast migration mode.
  • In the fast migration mode, for the replicas of the entry currently being migrated, the execution server stops migrating the metadata of any replica whose migration has taken longer than the preset threshold, generates a metadata recovery file for the entry currently undergoing metadata migration, and migrates the metadata of the remaining replicas, together with the metadata recovery file, to the target metadata server group.
  • The primary server may then restore, in the target metadata server, the metadata of the replica whose migration time exceeded the preset threshold, according to the metadata of the remaining replicas and the metadata recovery file.
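The replica handling in the fast migration mode can be sketched as a partition of replicas by elapsed migration time. The application does not describe the contents of the metadata recovery file, so the dictionary below is a purely hypothetical placeholder, as are the function and field names.

```python
from typing import Dict, List, Tuple


def plan_fast_migration(
    replica_elapsed_s: Dict[str, float], threshold_s: float
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split replicas into those still migrated and those stopped as too slow."""
    slow = [r for r, t in replica_elapsed_s.items() if t > threshold_s]
    remaining = [r for r in replica_elapsed_s if r not in slow]
    # Hypothetical recovery record: which replicas were skipped and which
    # migrated replicas their metadata could later be restored from.
    recovery_file = {"skipped_replicas": slow, "recover_from": remaining}
    return remaining, recovery_file


remaining, recovery = plan_fast_migration(
    {"r1": 30.0, "r2": 45.0, "r3": 900.0}, threshold_s=600.0
)
```

Here `remaining` lists the replicas whose metadata is still migrated to the target metadata server group alongside the recovery record.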
  • For example, the entry to be migrated in a migration task has three replicas.
  • Normally, the metadata of all three replicas must be migrated before the migration task is determined to be complete.
  • The execution time of a task therefore depends on the replica with the slowest migration speed.
  • The primary server can record the execution time of each migration task. When the execution time of a migration task is too long (for example, more than 10 minutes), the primary server can direct the execution server to switch modes
  • so that the metadata corresponding to the entry is migrated in the fast migration mode, thus avoiding a long tail.
  • The migration task can be performed in a multi-threaded, concurrent manner.
  • The execution server may receive a migration task that has already been performed. Therefore, before executing a migration task, the execution server may first determine whether the task has already been migrated, which saves the time of repeated migration.
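Concurrent execution combined with the "already migrated" check might look like the sketch below. The `Worker` class and its in-memory set of completed task identifiers are assumptions; a real execution server would presumably keep this state somewhere durable.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List, Set


class Worker:
    """Hypothetical execution server that skips tasks it has already handled."""

    def __init__(self) -> None:
        self._done: Set[str] = set()
        self._lock = threading.Lock()
        self.migrated: List[str] = []

    def run(self, task_id: str) -> bool:
        # Check and mark under a lock so two threads cannot both claim one task.
        with self._lock:
            if task_id in self._done:
                return False
            self._done.add(task_id)
        self.migrated.append(task_id)  # stands in for the real metadata migration
        return True


worker = Worker()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(worker.run, ["t1", "t2", "t1", "t3"]))
```

The duplicate delivery of `t1` is rejected by whichever thread reaches it second.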
  • The embodiment of the present application provides a metadata migration method that determines the entries to be migrated in a directory to be migrated, generates one or more migration tasks according to the number of data blocks corresponding to those entries, divides the entries to be migrated among the migration tasks, and finally assigns the migration tasks to execution servers to perform metadata migration on the entries.
  • Because the migration tasks are generated from the number of data blocks corresponding to the entries to be migrated, the amount of data contained in each migration task is relatively balanced. This avoids any execution server being assigned a migration task with a very large amount of data, shortens the time execution servers need to complete their migration tasks, and improves the overall efficiency of metadata migration.
  • An individual metadata server group of the distributed file system may respond very slowly, or even be unable to serve, because of aging devices or excessive load.
  • In that case, a migration task in which the entry currently undergoing metadata migration on the execution server has taken longer than the preset threshold stops its metadata migration, and the other migration tasks continue. After the other migration tasks are completed, the stopped migration task is migrated in the fast migration mode, so that a long tail is avoided and the overall efficiency of metadata migration is further improved.
  • the embodiment of the present application further provides a metadata migration device, as shown in FIG. 4 .
  • The metadata migration apparatus includes: a to-be-migrated entry determining module 401, a migration task generation module 402, and a migration task assignment module 403, wherein:
  • the to-be-migrated entry determining module 401 is configured to determine the entries to be migrated in the directory to be migrated;
  • the migration task generation module 402 is configured to generate a migration task according to the number of data blocks corresponding to the entries to be migrated, and to divide the entries to be migrated among the migration tasks;
  • the migration task assignment module 403 is configured to assign the migration task to an execution server to perform metadata migration on the entries to be migrated.
  • the device further includes:
  • the receiving module 404 is configured to receive migration task status information that is sent by the execution server;
  • the blocking processing module 405 is configured to determine, according to the migration task status information, whether the execution server has a blocked migration task, and if the execution server has a blocked migration task, control the execution server to perform an accelerated migration process.
  • the blocking processing module 405 includes:
  • a migration time obtaining unit, configured to obtain, according to the migration task status information of the execution server, the migration time already consumed by the entry whose metadata is currently being migrated in the execution server;
  • a blocking task determining unit, configured to determine that the execution server has a blocked migration task if the migration time exceeds a preset threshold;
  • a sending unit, configured to send an accelerated migration instruction to the execution server that has the blocked migration task, so that the execution server switches to the fast migration mode to migrate the metadata corresponding to the entry currently being migrated.
  • the device further includes:
  • the threshold adjustment module 406 is configured to: if the entries to be migrated include a first entry to be migrated whose number of data blocks is greater than a predetermined number, adjust the preset threshold according to the number of data blocks corresponding to the first entry to be migrated and the predetermined number, to obtain the preset threshold corresponding to the first entry to be migrated.
  • the item to be migrated determining module 401 includes:
  • the traversing operation unit is configured to traverse the root directory of the directory tree where the directory to be migrated is located, and record the current traversal entry identifier during the traversal process;
  • the traversing operation unit is further configured to continue traversing the directory tree according to the reconstructed traversal stack, and determine an entry to be migrated included in the directory to be migrated.
  • the device further includes:
  • the entry count obtaining module 407 is configured to obtain, during the traversal, the number of entries contained in each subdirectory in the directory tree;
  • the entry allocation module 408 is configured to allocate the entries contained in a target subdirectory to the same migration task if the directory tree includes a target subdirectory whose number of entries is less than a predetermined number threshold.
  • the migration task generation module 402 is configured to generate a migration task according to the number of data blocks corresponding to each entry to be migrated and determine the data block count threshold that the migration task can hold; and, according to a preset allocation policy, to allocate entries to be migrated to the migration task within that threshold until all entries to be migrated have been allocated;
  • the preset allocation policy is: for the current migration task, if the unallocated entries to be migrated include an entry whose number of data blocks is smaller than the number of idle data blocks of the current migration task, select from the unallocated entries an entry whose number of data blocks is smaller than the number of idle data blocks of the current migration task and allocate it to the current migration task.
  • the migration task assignment module 403 is configured to assign the migration tasks to the execution server one by one, and to keep the number of migration tasks currently held by the execution server below a preset value.
  • The embodiment of the present application provides a metadata migration apparatus that determines the entries to be migrated in a directory to be migrated, generates migration tasks according to the number of data blocks corresponding to those entries, divides the entries to be migrated among the migration tasks, and finally assigns the migration tasks to execution servers to perform metadata migration on the entries.
  • Because the migration tasks are generated from the number of data blocks corresponding to the entries to be migrated, the amount of data contained in each migration task is relatively balanced. This avoids any execution server being assigned a migration task with a very large amount of data, shortens the time execution servers need to complete their migration tasks, and improves the overall efficiency of metadata migration.
  • An individual metadata server group of the distributed file system may respond very slowly, or even be unable to serve, because of aging devices or excessive load.
  • In that case, a migration task in which the entry currently undergoing metadata migration on the execution server has taken longer than the preset threshold stops its metadata migration, and the other migration tasks continue. After the other migration tasks are completed, the stopped migration task is migrated in the fast migration mode, so that a long tail is avoided and the overall efficiency of metadata migration is further improved.
  • the embodiment of the present application further provides a metadata migration system, as shown in FIG. 7 .
  • the metadata migration system includes: a metadata migration device 701, an execution server 702, and a plurality of metadata server groups 703, wherein:
  • the execution server 702 is configured to receive a migration task assigned by the metadata migration device 701, and perform data migration between the metadata server groups 703 according to the migration task.
  • the metadata migration device 701 may be the primary server in the first embodiment or the second embodiment.
  • the metadata server group 703 may be composed of one server or may be composed of a plurality of servers.
  • the execution server 702 is further configured to switch to the fast migration mode after receiving the accelerated migration instruction sent by the metadata migration device 701; and, in the fast migration mode, for the replicas of the entry currently being migrated, to stop migrating the metadata of any replica whose migration duration exceeds the preset threshold, generate a metadata recovery file for the entry currently undergoing metadata migration, and migrate the metadata of the remaining replicas, together with the metadata recovery file, to the target metadata server group;
  • the metadata migration device 701 is further configured to restore, in the target metadata server, the metadata of the replica whose migration duration exceeded the preset threshold, according to the metadata of the remaining replicas in the target metadata server and the metadata recovery file.
  • The embodiment of the present application provides a metadata migration system that determines the entries to be migrated in a directory to be migrated, generates migration tasks according to the number of data blocks corresponding to those entries, divides the entries to be migrated among the migration tasks, and finally assigns the migration tasks to execution servers to perform metadata migration on the entries.
  • Because the migration tasks are generated from the number of data blocks corresponding to the entries to be migrated, the amount of data contained in each migration task is relatively balanced. This avoids any execution server being assigned a migration task with a very large amount of data, shortens the time execution servers need to complete their migration tasks, and improves the overall efficiency of metadata migration.
  • An individual metadata server group of the distributed file system may respond very slowly, or even be unable to serve, because of aging devices or excessive load.
  • In that case, a migration task in which the entry currently undergoing metadata migration on the execution server has taken longer than the preset threshold stops its metadata migration, and the other migration tasks continue. After the other migration tasks are completed, the stopped migration task is migrated in the fast migration mode, so that a long tail is avoided and the overall efficiency of metadata migration is further improved.
  • the embodiment of the present application further provides a metadata migration device, as shown in FIG. 8.
  • the metadata migration device may be a primary server (also referred to as a Master server) that controls metadata migration provided by the above embodiments.
  • The metadata migration device may vary considerably depending on configuration or performance, and may include one or more processors 801 and a memory 802, in which one or more applications or data may be stored.
  • The memory 802 can be transient or persistent storage.
  • An application stored in the memory 802 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the metadata migration device.
  • The processor 801 can be configured to communicate with the memory 802 and to execute, on the metadata migration device, the series of computer-executable instructions in the memory 802.
  • The metadata migration device may also include one or more power sources 803, one or more wired or wireless network interfaces 804, one or more input and output interfaces 805, one or more keyboards 806, and the like.
  • Specifically, the metadata migration device includes a memory and one or more programs, wherein the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer-executable instructions for the metadata migration device, and the one or more programs are configured to be executed by one or more processors.
  • the one or more programs are included for performing the following computer executable instructions:
  • the computer executable instructions when executed, may also cause the processor to:
  • the computer executable instructions when executed, may also cause the processor to:
  • controlling the execution server to perform an accelerated migration process includes:
  • the computer executable instructions when executed, may also cause the processor to:
  • the preset threshold is adjusted according to the number of data blocks corresponding to the first entry to be migrated and the predetermined number, to obtain the preset threshold corresponding to the first entry to be migrated.
  • the computer executable instructions when executed, may also cause the processor to:
  • the directory tree is traversed according to the reconstructed traversal stack, and the items to be migrated included in the directory to be migrated are determined.
  • the processor may further: traverse from the root directory of the directory tree in which the directory to be migrated is located, and record the identifier of the currently traversed entry during the traversal; and obtain, during the traversal, the number of entries contained in each subdirectory in the directory tree;
  • if the directory tree contains a target subdirectory whose number of entries is less than a predetermined number threshold, the entries contained in the target subdirectory are allocated to the same migration task.
  • the computer executable instructions when executed, may also cause the processor to:
  • the preset allocation policy is: for the current migration task, if the unallocated entries to be migrated include an entry whose number of data blocks is smaller than the number of idle data blocks of the current migration task, select from the unallocated entries an entry whose number of data blocks is smaller than the number of idle data blocks of the current migration task and allocate it to the current migration task.
  • the computer executable instructions when executed, may also cause the processor to:
  • the migration tasks are assigned to the execution server one by one, and the number of migration tasks currently held by the execution server is kept below a preset value.
  • The embodiment of the present application provides a metadata migration device that determines the entries to be migrated in a directory to be migrated, generates migration tasks according to the number of data blocks corresponding to those entries, divides the entries to be migrated among the migration tasks, and finally assigns the migration tasks to execution servers to perform metadata migration on the entries.
  • Because the migration tasks are generated from the number of data blocks corresponding to the entries to be migrated, the amount of data contained in each migration task is relatively balanced. This avoids any execution server being assigned a migration task with a very large amount of data, shortens the time execution servers need to complete their migration tasks, and improves the overall efficiency of metadata migration.
  • PLD Programmable Logic Device
  • FPGA Field Programmable Gate Array
  • HDL Hardware Description Language
  • ABEL Advanced Boolean Expression Language
  • AHDL Altera Hardware Description Language
  • HDCal JHDL
  • Lava Lola
  • MyHDL PALASM
  • RHDL Ruby Hardware Description Language
  • VHDL Very-High-Speed Integrated Circuit Hardware Description Language
  • Verilog Verilog
  • The controller can be implemented in any suitable manner. For example, it can take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor.
  • Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the memory's control logic.
  • The method steps can also be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, ASICs, programmable logic controllers, embedded microcontrollers, and the like.
  • Such a controller can therefore be considered a hardware component, and the means for implementing various functions included therein can also be considered as a structure within the hardware component.
  • Alternatively, a means for implementing various functions can be regarded both as a software module that implements the method and as a structure within a hardware component.
  • the system, device, module or unit illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function.
  • a typical implementation device is a computer.
  • The computer can be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • embodiments of the present application can be provided as a method, system, or computer program product.
  • the present application can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment in combination of software and hardware.
  • Moreover, the application can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • The computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory.
  • Memory is an example of a computer readable medium.
  • Computer-readable media include both persistent and non-persistent, removable and non-removable media.
  • Information storage can be implemented by any method or technology.
  • the information can be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disk read only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic tape cartridges, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by a computing device.
  • As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
  • the application can be described in the general context of computer-executable instructions executed by a computer, such as a program module.
  • program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types.
  • the present application can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are connected through a communication network.
  • program modules can be located in both local and remote computer storage media including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

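A metadata migration method, apparatus, system, and device. The method includes: determining entries to be migrated in a directory to be migrated (S101); generating migration tasks according to the number of data blocks corresponding to the entries to be migrated, and dividing the entries to be migrated among the migration tasks (S102); and assigning the migration tasks to execution servers to perform metadata migration on the entries to be migrated (S103). The method distributes the entries to be migrated evenly across the migration tasks, which avoids any execution server being assigned a migration task with a very large amount of data, shortens the time execution servers need to complete their migration tasks, and thereby improves the overall efficiency of metadata migration.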

Description

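Metadata migration method, apparatus, system, and device
The present application claims priority to Chinese Patent Application No. 201611199032.4, filed on December 22, 2016 and entitled "Metadata migration method, apparatus, system, and device", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technology, and in particular to a metadata migration method, apparatus, system, and device.
Background
Large-scale distributed storage systems are now widely used. In a distributed storage system, metadata servers manage the file system namespace and the various attributes of files and provide information such as file access permissions and file storage locations, while data storage servers store the files and handle clients' read and write requests for file data. In most distributed storage systems, several metadata servers provide service together as a group, and each such group constitutes a metadata server group (that is, a volume). A distributed storage system is usually configured with multiple metadata server groups so that the metadata load is spread across them.
While the system runs, metadata is continuously stored on the metadata server groups, and as running time increases, the metadata stored on the groups gradually becomes unbalanced. To balance the load of the metadata server groups, metadata must be migrated between them.
At present, metadata is usually migrated between metadata server groups as follows: the source directory is split on the metadata server group to obtain the entries to be migrated, migration tasks are generated according to the number of files contained in each entry to be migrated, and the migration tasks are then distributed to the execution servers for metadata migration.
After studying the prior art, the inventors found that when migration tasks are generated from the number of files contained in the entries to be migrated, the uneven granularity of files makes the amount of data in the generated migration tasks uneven as well. Some execution servers may therefore be assigned migration tasks with a very large amount of data and need a great deal of time to complete them, which lowers the overall efficiency of metadata migration.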
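Summary
The purpose of the embodiments of the present application is to provide a metadata migration method, apparatus, system, and device, so that the amount of data contained in the generated migration tasks is relatively balanced and the overall efficiency of metadata migration is improved.
To solve the above technical problem, the embodiments of the present application are implemented as follows:
An embodiment of the present application provides a metadata migration method, the method including:
determining entries to be migrated in a directory to be migrated;
generating migration tasks according to the number of data blocks corresponding to the entries to be migrated, and dividing the entries to be migrated among the migration tasks;
assigning the migration tasks to execution servers to perform metadata migration on the metadata corresponding to the entries to be migrated.
An embodiment of the present application provides a metadata migration apparatus, the apparatus including:
a to-be-migrated entry determining module, configured to determine entries to be migrated in a directory to be migrated;
a migration task generation module, configured to generate migration tasks according to the number of data blocks corresponding to the entries to be migrated, and to divide the entries to be migrated among the migration tasks;
a migration task assignment module, configured to assign the migration tasks to execution servers to perform metadata migration on the entries to be migrated.
An embodiment of the present application provides a metadata migration system, including the metadata migration apparatus provided in the above embodiment, execution servers, and multiple metadata server groups, wherein:
the execution server is configured to receive the migration tasks assigned by the metadata migration apparatus and, according to the migration tasks, migrate data between the metadata server groups.
An embodiment of the present application provides a metadata migration device, the metadata migration device including:
a processor; and
a memory arranged to store computer-executable instructions that, when executed, cause the processor to:
determine entries to be migrated in a directory to be migrated;
generate migration tasks according to the number of data blocks corresponding to the entries to be migrated, and divide the entries to be migrated among the migration tasks;
assign the migration tasks to execution servers to perform metadata migration on the metadata corresponding to the entries to be migrated.
As can be seen from the technical solutions provided above, the embodiments of the present application determine the entries to be migrated in a directory to be migrated, generate migration tasks according to the number of data blocks corresponding to those entries, divide the entries to be migrated among the migration tasks, and finally assign the migration tasks to execution servers to perform metadata migration on the entries. Because the migration tasks are generated from the number of data blocks corresponding to the entries to be migrated, the amount of data contained in each migration task is relatively balanced. This avoids any execution server being assigned a migration task with a very large amount of data, shortens the time execution servers need to complete their migration tasks, and improves the overall efficiency of metadata migration.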
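Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 shows an embodiment of a metadata migration method of the present application;
FIG. 2 is a schematic diagram of a directory tree in which a directory to be migrated is located;
FIG. 3 is a schematic diagram of the migration of a migration task;
FIG. 4 shows an embodiment of a metadata migration apparatus of the present application;
FIG. 5 shows another embodiment of a metadata migration apparatus of the present application;
FIG. 6 shows yet another embodiment of a metadata migration apparatus of the present application;
FIG. 7 shows an embodiment of a metadata migration system of the present application;
FIG. 8 shows an embodiment of a metadata migration device of the present application.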
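Detailed Description
The embodiments of the present application provide a metadata migration method, apparatus, system, and device.
To help those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the present application without creative effort shall fall within the scope of protection of the present application.
Embodiment 1
As shown in FIG. 1, an embodiment of the present application provides a metadata migration method. The method may be executed by a primary server that controls metadata migration (also referred to as the Master server). By controlling multiple execution servers (also referred to as Work servers), the primary server migrates metadata between multiple metadata server groups to balance their load. The method may specifically include the following steps:
Step S101: Determine the entries to be migrated in the directory to be migrated.
The directory to be migrated may be a directory that requires metadata migration, that is, the directory in which the metadata to be migrated is located. Metadata may be information that describes the attributes of data, and is commonly used to support functions such as indicating the storage location of data, looking up historical data or resources, and keeping file records. The entries to be migrated may include directories to be migrated and/or files to be migrated. A file to be migrated may be a file in some format, for example a text file in txt format or a journal file in jnt format.
In implementation, the whole file system of a distributed storage system is divided into multiple parts according to actual needs. Corresponding directory trees can be generated from this division, and each directory tree can be assigned to a different metadata server group. The primary server may check the access status, data output status, and so on of each metadata server periodically (for example every 12 hours or every 24 hours) or in real time, analyze the corresponding metadata server group on the basis of what it detects, and determine whether that group is overloaded. When a metadata server group is overloaded, some subdirectories in its directory tree (the directories to be migrated) need to be reassigned to other metadata server groups, and reassigning these subdirectories requires migrating their metadata.
When migrating metadata, the entries to be migrated in the directory to be migrated must be determined. This can be done in various ways; one optional way is as follows. The primary server may traverse the directory that requires metadata migration to obtain the entries to be migrated in it. For example, if the directory tree is A-B-C, A-B-D, A-E and the directory to be migrated is B, traversing B yields directory C and the file c.txt in it, as well as directory D and the file d.txt in it. The entries to be migrated thus comprise the directories to be migrated (C and D) and the files to be migrated (c.txt and d.txt).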
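The traversal in this example can be illustrated with a short Python sketch. The nested-dictionary representation of the directory tree and the function name are assumptions made for illustration; the application does not prescribe a data structure.

```python
from typing import Dict, List, Union

# Hypothetical in-memory directory tree: a dict is a directory, None marks a file.
Tree = Dict[str, Union["Tree", None]]


def collect_entries(directory: Tree) -> List[str]:
    """Depth-first walk returning every directory and file below `directory`."""
    entries: List[str] = []
    for name, child in directory.items():
        entries.append(name)
        if child is not None:
            entries.extend(collect_entries(child))
    return entries


tree: Tree = {"A": {"B": {"C": {"c.txt": None}, "D": {"d.txt": None}}, "E": {}}}
entries = collect_entries(tree["A"]["B"])  # traverse the directory to be migrated, B
```

The result contains both the subdirectories and the files found under B.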
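Step S102: Generate migration tasks according to the number of data blocks corresponding to the entries to be migrated, and divide the entries to be migrated among the migration tasks.
A data block may be one or several groups of blocks arranged consecutively in order for storing data, and may be the unit of data transferred between a storage device and an input/output device; that is, data may be transferred in units of one or more data blocks. The amount of data a data block can store may be determined according to actual conditions, for example 32 MB or 64 MB. A migration task may be an instruction task that directs an execution server to perform metadata migration.
In implementation, the aim is to balance the amount of metadata stored across the metadata server groups. Considering that the data volume of an entry to be migrated is roughly proportional to the data volume of its metadata, migration tasks can be divided by the data volume of the entries to be migrated, so that the amount of data in each migration task tends to be equal and the migration workload is balanced. Specifically, the amount of data each data block can store may be preset, and the number of data blocks corresponding to an entry to be migrated may be calculated from the entry's data volume and the per-block capacity.
Following the example in step S101, if the files to be migrated c.txt and d.txt correspond to 100 MB of data and each data block can store 32 MB, then 100/32 gives 3.125, which is rounded up, so the number of data blocks corresponding to the entries to be migrated can be determined to be 4.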
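The block count in this example is a ceiling division. A minimal sketch, with the function name and megabyte units chosen for illustration:

```python
import math


def block_count(data_mb: float, block_mb: float = 32.0) -> int:
    """Number of data blocks needed to hold `data_mb`, rounding partial blocks up."""
    return math.ceil(data_mb / block_mb)


blocks_needed = block_count(100)  # 100 MB of data with 32 MB blocks
```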
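The maximum number of data blocks a migration task can hold may be preset, and the number of migration tasks to generate can then be calculated from the number of data blocks corresponding to the entries to be migrated. Alternatively, the number of migration tasks may be set freely according to actual conditions. After the migration tasks are generated, the entries to be migrated can be divided among them in a relatively balanced way.
For example, the entries to be migrated are A, B, C, D, E, and F, corresponding to 10, 20, 30, 10, 10, and 50 data blocks respectively. If a migration task can hold at most 70 data blocks, then from the total of 130 data blocks it can be determined that two migration tasks are generated: one (migration task 1) containing 70 data blocks and the other (migration task 2) containing 60. Entries can then be allocated to the two tasks. If A, B, C, D, E, and F are arranged in that order, entries A, B, C, and D (70 data blocks in total) can be placed in migration task 1, and entries E and F (60 data blocks in total) in migration task 2.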
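The worked example can be reproduced with a simple in-order split. The sketch below is one possible reading, assuming entries are taken strictly in their listed order and a new task is opened when the next entry would exceed the capacity; the function name is illustrative.

```python
import math
from typing import Dict, List


def split_in_order(blocks: Dict[str, int], capacity: int) -> List[List[str]]:
    """Fill tasks in listing order, opening a new task when capacity would be exceeded."""
    tasks: List[List[str]] = [[]]
    used = 0
    for name, n in blocks.items():
        if used + n > capacity and tasks[-1]:
            tasks.append([])
            used = 0
        tasks[-1].append(name)
        used += n
    return tasks


blocks = {"A": 10, "B": 20, "C": 30, "D": 10, "E": 10, "F": 50}
task_count = math.ceil(sum(blocks.values()) / 70)  # 130 blocks, 70 per task
tasks = split_in_order(blocks, capacity=70)
```

A strict in-order split can need more tasks than the ceiling estimate for other inputs, so the two values agree here only because the example happens to pack cleanly.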
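Step S103: Assign the migration tasks to execution servers to perform metadata migration on the entries to be migrated.
An execution server may be a server that controls the metadata server groups in executing migration tasks. It may be a single server or a server cluster (or server group) composed of multiple servers.
In implementation, the primary server may obtain data such as the current remaining bandwidth and/or resource utilization of each execution server, evaluate each execution server's metadata migration capability from that data, and assign migration tasks to execution servers according to their capability. After determining the assignment, the primary server may deliver each migration task to the corresponding execution server. The execution server may analyze the received migration task and execute it item by item, migrating the metadata corresponding to the entries to be migrated from the metadata server group that holds them to the target metadata server group, where the target metadata server group may be a metadata server group with a relatively light load.
The embodiment of the present application provides a metadata migration method that determines the entries to be migrated in a directory to be migrated, generates migration tasks according to the number of data blocks corresponding to those entries, divides the entries to be migrated among the migration tasks, and finally assigns the migration tasks to execution servers to perform metadata migration on the entries. Because the migration tasks are generated from the number of data blocks corresponding to the entries to be migrated, the amount of data contained in each migration task is relatively balanced. This avoids any execution server being assigned a migration task with a very large amount of data, shortens the time execution servers need to complete their migration tasks, and improves the overall efficiency of metadata migration.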
实施例二
本申请实施例提供了一种元数据迁移方法,该方法的执行主体可以为主服务器,该主服务器通过控制多个执行服务器,在多个元数据服务器组之间进行元数据迁移,以均衡各元数据服务器组的负载。该方法具体可以包括以下步骤:
上述实施例一中S101的处理方式可以多种多样,以下还提供一种具体的处理方式,可以包括以下步骤S201~步骤S203。
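In step S201: traverse from the root directory of the directory tree in which the directory to be migrated is located, recording the identifier of the currently traversed entry during the traversal.
Entries may include directories and files. Directories may include a root directory and subdirectories, and the root directory may be the topmost directory in which files are stored. For example, in ac/file1/B.txt, ac/ may be the root directory, file1/ a subdirectory, and B.txt a file. A directory tree may consist of the root directory and subdirectories.
In implementation, the master server may periodically check how the metadata stored in each metadata server group is accessed or used, and from the results determine the metadata server group that needs metadata migration and the target metadata server group of the migration. The master server may then determine the directory to be migrated, according to the amount of data to be migrated or according to actual conditions, determine the directory tree in which that directory is located, and build an initial traversal stack based on the tree. The directories in the initial traversal stack may be numbered, and the number may serve as the entry identifier of the directory. The traversal can then start from the root directory of the tree, determining the files contained in each directory and recording the entry identifier of each entry as it is traversed.
For example, Figure 2 shows a directory tree in which an entry to be migrated is located. The tree includes the root directory "/" and subdirectories "A/", "B/" and "C/", among others, and the entry to be migrated may be "B/". The root directory and each subdirectory have corresponding entry identifiers; in Figure 2 the entry identifier of the root directory is 14, that of subdirectory "A/" is 6, and so on. While traversing the tree, the master server may record the entry identifiers it traverses to form traversal stacks such as 14-6-1 and 14-9-8.
It should be noted that, besides active detection by the master server, the load of the metadata server groups can be obtained in other ways. For example, a management server may be set up in a metadata server group; it checks how the metadata stored in the group is accessed or used and reports the result to the master server, so that the master server can adjust the load among the metadata server groups in time.
During the traversal, the master server may encounter faults such as a hang or a system crash. The master server then has to restart and, after restarting, continue the traversal. The corresponding handling is described in steps S202 and S203 below.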
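In step S202: if a local crash occurs during the traversal, obtain the most recently recorded entry identifier upon recovery in order to rebuild the traversal stack.
In implementation, if the master server fails and needs to restart while traversing the directory tree in step S201, it may store the entry identifiers whose traversal is complete and then restart. After restarting, the master server may obtain the directory tree in which the directory to be migrated is located, retrieve the most recently recorded entry identifier from the previously recorded and stored identifiers, look up the traversal stack that contains that identifier, and rebuild the traversal stack from it.
In step S203: continue traversing the directory tree according to the rebuilt traversal stack, and determine the entries to be migrated contained in the directory to be migrated.
In implementation, the rebuilt traversal stack normally already contains the root directory and the ancestor subdirectories of the subdirectory corresponding to the most recently recorded entry identifier. After obtaining the rebuilt stack, the master server therefore only needs to continue traversing downward from that subdirectory. Traversing the tree yields the files contained in each directory, from which the entries to be migrated in the directory to be migrated can be determined.
For example, as shown in Figure 2, if the most recently recorded entry identifier is 8, then after restarting the master server can use identifier 8 to find the previously recorded and stored traversal stack 14-9-8 and rebuild the traversal stack from it. The master server can then continue traversing downward from 14-9-8 instead of re-traversing the directory tree from the root directory, which saves traversal time.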
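The recovery idea can be sketched as follows. Figure 2 is not reproduced in the text, so the tree below is hypothetical: only the identifiers 14, 6, 1, 9 and 8 come from the example, while 3, 7 and 13 and the pre-order visiting order are assumptions made for illustration.

```python
def rebuild_stack(last_recorded_id, parent_of):
    # Walk parent links from the last recorded entry up to the root, then reverse.
    stack = []
    node = last_recorded_id
    while node is not None:
        stack.append(node)
        node = parent_of[node]
    stack.reverse()
    return stack

def resume_preorder(stack, children_of):
    """Yield the IDs a pre-order walk would visit after stack[-1]."""
    def walk(node):
        yield node
        for child in children_of.get(node, []):
            yield from walk(child)
    # First finish the subtree below the last recorded entry.
    for child in children_of.get(stack[-1], []):
        yield from walk(child)
    # Then visit the later siblings of each entry on the path, deepest first.
    for depth in range(len(stack) - 1, 0, -1):
        siblings = children_of.get(stack[depth - 1], [])
        for sibling in siblings[siblings.index(stack[depth]) + 1:]:
            yield from walk(sibling)

parent_of = {14: None, 6: 14, 1: 6, 9: 14, 8: 9, 3: 8, 7: 9, 13: 14}
children_of = {14: [6, 9, 13], 6: [1], 9: [8, 7], 8: [3]}
stack = rebuild_stack(8, parent_of)
remaining = list(resume_preorder(stack, children_of))
print(stack, remaining)
```

Nothing under identifier 6 is visited again, which is the saving the example describes.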
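Step S102 in Embodiment 1 above can be handled in many ways. One specific way is provided below, which may include steps S204 and S205.
In step S204: generate migration tasks according to the number of data blocks corresponding to each entry to be migrated, and determine the data block count threshold that a migration task can hold.
The data block count threshold may be set according to actual conditions; for example, it may be 10 or 30.
For the processing in step S204, see the description of step S102 in Embodiment 1 above; it is not repeated here.
In step S205: according to a preset allocation policy, allocate entries to be migrated to a migration task within the data block count threshold that the task can hold, until all entries to be migrated have been allocated.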
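The preset allocation policy may be set according to actual conditions. This embodiment provides one optional policy: for the current migration task, if the unallocated entries to be migrated include an entry whose data block count is less than the number of free data blocks of the current task, select such an entry from the unallocated entries and allocate it to the current task.
In implementation, for a migration task to which no entry has yet been allocated, the master server may obtain the data block count of each entry to be migrated through step S204 and compare it with the data block count threshold of the task. If the entry's data block count is less than the threshold, the entry can be allocated to the task.
For a migration task that already has entries allocated, the master server may compute the difference between the data block count threshold and the data block count of the entries already allocated, and take the difference as the task's number of free data blocks. It then compares this with the data block count of a candidate entry; if the entry's data block count is less than the number of free data blocks, the entry can be allocated to the task. Each entry to be migrated is allocated in turn in this way so that it enters a corresponding migration task.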
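A minimal sketch of this free-block policy is shown below. Several details are assumptions rather than part of the described method: the comparison uses `<=` instead of the strict "less than" so that a task can be filled exactly to its threshold, the largest fitting entry is chosen first, and an entry larger than the threshold is given a task of its own.

```python
def allocate(entries, threshold):
    """entries: {entry_id: block_count}. Returns a list of tasks (lists of entry IDs)."""
    unassigned = dict(entries)
    tasks = []
    while unassigned:
        task, free = [], threshold
        while True:
            # Unallocated entries that still fit into the task's free blocks.
            fitting = [e for e, blocks in unassigned.items() if blocks <= free]
            if not fitting:
                break
            chosen = max(fitting, key=lambda e: unassigned[e])
            free -= unassigned.pop(chosen)
            task.append(chosen)
        if not task:
            # Nothing fits even an empty task: the entry exceeds the threshold.
            chosen = next(iter(unassigned))
            unassigned.pop(chosen)
            task.append(chosen)
        tasks.append(task)
    return tasks

tasks = allocate({"A": 10, "B": 20, "C": 30, "D": 10, "E": 10, "F": 50}, 70)
print(tasks)
```

With these inputs the first task holds 70 blocks and the second 60.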
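It should be noted that, while entries are being allocated to migration tasks in this way, the amount of data allocated to each task can be tallied in real time and adjusted in real time so that the amounts of data in the tasks tend toward equality, thereby achieving load balancing among the metadata server groups.
In addition, to speed up the traversal in steps S201 to S203 and the task generation and allocation computations in steps S204 and S205, migration tasks for the entries to be migrated can be generated quickly in the following way:
While traversing the directory tree in which the directory to be migrated is located, the number of entries contained in each subdirectory can be obtained. This number is the sum of the subdirectory itself (counted as 1), the number of lower-level subdirectories it contains and the number of files it contains. If the directory to be migrated contains a target subdirectory whose entry count is less than a predetermined threshold (such as 500 or 800), the entry for the target subdirectory, the entries for its lower-level subdirectories and the entries for its files can all be allocated to the same migration task, so that subdirectories with few entries are traversed quickly and assigned to a task.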
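The grouping of small subdirectories might look like the sketch below. The nested-dictionary tree layout, the recursive reading of the entry count and the function names are assumptions made for illustration, and the threshold of 4 is chosen only to keep the example small.

```python
def count_entries(tree):
    # The directory itself (1) plus its files plus everything in its subdirectories.
    return 1 + len(tree["files"]) + sum(count_entries(d) for d in tree["dirs"])

def collect(tree, prefix=""):
    path = prefix + tree["name"]
    items = [path] + [path + f for f in tree["files"]]
    for d in tree["dirs"]:
        items += collect(d, path)
    return items

def group_small_subtrees(tree, limit, prefix=""):
    """Returns (groups, loose): small subtrees kept whole, and entries left
    for the normal block-count-based allocation."""
    if count_entries(tree) < limit:
        return [collect(tree, prefix)], []
    path = prefix + tree["name"]
    groups, loose = [], [path] + [path + f for f in tree["files"]]
    for d in tree["dirs"]:
        sub_groups, sub_loose = group_small_subtrees(d, limit, path)
        groups += sub_groups
        loose += sub_loose
    return groups, loose

tree = {"name": "B/", "files": ["x.txt"], "dirs": [
    {"name": "small/", "files": ["a.txt", "b.txt"], "dirs": []},
    {"name": "big/", "files": ["f1", "f2", "f3", "f4"], "dirs": []},
]}
groups, loose = group_small_subtrees(tree, limit=4)
print(groups, loose)
```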
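In step S206: assign the migration tasks to the execution servers one at a time, keeping the number of migration tasks each execution server currently holds below a preset value.
The preset value may be set according to actual conditions, for example 3 or 4.
In implementation, while an execution server executes one migration task, the other tasks assigned to it wait. To prevent an execution server from accumulating more waiting tasks than it can execute, a maximum number of migration tasks per execution server (the preset value) can be set. The master server may deliver the generated migration tasks to the corresponding execution servers one at a time; for details, see step S103 in Embodiment 1 above, which is not repeated here. The master server may also check, periodically or in real time, how many migration tasks each execution server holds, and deliver a migration task to an execution server when the number it holds falls below a predetermined task threshold. For example, if the preset value is 3, an execution server may hold at most 2 migration tasks. When the master server detects that an execution server holds fewer than 2 tasks (the task threshold), it can deliver a migration task to that server. This ensures that each execution server holds at most 2 migration tasks, and when a server holds fewer than 2 the master server can promptly deliver a new one to make up the shortfall. As a result, execution servers do not sit idle, and migration tasks go as far as possible to the execution servers under relatively light processing pressure, which reduces the number of tasks waiting to be executed.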
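The one-at-a-time delivery with a per-server cap can be sketched as below. The function name, the round-based loop and the data structures are assumptions for illustration; the task threshold is derived as the preset value minus one, matching the example in which a preset value of 3 allows at most 2 tasks per server.

```python
from collections import deque

def dispatch(pending, owned, preset_value):
    """pending: deque of task IDs; owned: {server: list of task IDs it holds}."""
    task_threshold = preset_value - 1
    sent = []
    progress = True
    while pending and progress:
        progress = False
        # Serve the least loaded servers first, one task per server per round.
        for server in sorted(owned, key=lambda s: len(owned[s])):
            if pending and len(owned[server]) < task_threshold:
                task = pending.popleft()
                owned[server].append(task)
                sent.append((task, server))
                progress = True
    return sent

pending = deque(["t1", "t2", "t3", "t4", "t5"])
owned = {"exec-1": [], "exec-2": ["t0"]}
sent = dispatch(pending, owned, preset_value=3)
print(sent, list(pending))
```

The tasks left in `pending` would be delivered on a later check, once a server has finished one of its tasks.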
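In a distributed storage system, failures of metadata server groups are unavoidable, so an execution server carrying out a migration task may well encounter a metadata server group that is relatively slow or even unable to serve requests. So that migration tasks complete quickly and smoothly, the execution status of the tasks on the execution servers can be monitored and the currently executing task handled according to its status, as described in steps S207 and S208 below.
In step S207: receive the migration task status information fed back by the execution server.
The migration task status information may include one or more parameters, such as the data migration speed of the task, the amount of data (or number of data blocks) in the task, the migration time already spent and/or the remaining migration time.
In implementation, a feedback period, for example 30 seconds or 1 minute, can be set on the execution server. When the period elapses, the execution server obtains information about the current migration task, generates migration task status information from it and sends it to the master server, which receives it.
It should be noted that, besides setting a feedback period on the execution server, the status information can be obtained in other ways, for example by having the master server actively pull it.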
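In step S208: determine from the migration task status information whether the execution server has a blocked migration task and, if so, control the execution server to perform accelerated migration.
In implementation, after receiving the migration task status information, the master server may analyze it to obtain target parameter data that indicates whether the execution server has a blocked migration task. For example, if the status information includes the task's data migration speed and the amount of data (or number of data blocks) in the task, the master server can calculate the time needed to complete the task and use that time as the target parameter data.
A target parameter threshold can be set in advance for migration tasks. If the target parameter data obtained above exceeds the threshold, it can be determined that the execution server has a blocked migration task. The master server then sends an accelerated migration instruction to the execution server, which, on receiving it, performs accelerated migration on the task it is currently executing so that the entries in the task are migrated quickly. If the target parameter data is below the threshold, it can be determined that the execution server has no blocked migration task, and the execution server continues executing the task until it completes.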
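A small sketch of this check, assuming the status report carries a block count and a speed in blocks per second (the field names `blocks` and `speed` and the 600-second threshold are hypothetical):

```python
def estimated_duration_s(total_blocks, blocks_per_second):
    # A stalled task (speed of zero or less) would never finish.
    if blocks_per_second <= 0:
        return float("inf")
    return total_blocks / blocks_per_second

def is_blocked(status, threshold_s):
    """status: {"blocks": block count of the task, "speed": blocks per second}."""
    return estimated_duration_s(status["blocks"], status["speed"]) > threshold_s

slow = is_blocked({"blocks": 70, "speed": 0.05}, threshold_s=600)
fast = is_blocked({"blocks": 70, "speed": 1.0}, threshold_s=600)
stalled = is_blocked({"blocks": 70, "speed": 0.0}, threshold_s=600)
print(slow, fast, stalled)
```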
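Besides the handling described above, step S208 can be carried out in other ways. One optional way is provided below, which may include the following Steps 1 to 3.
Step 1: according to the execution server's migration task status information, obtain the migration time already spent on the entry whose metadata the execution server is currently migrating.
In implementation, when the master server receives migration task status information from an execution server, it may extract the task identifier of the executing migration task and the entry identifier of the entry being migrated, and record the current time. When it next receives status information from that execution server, it again extracts the task identifier and the entry identifier; if both identifiers match those of the previous report, it can calculate the migration time already spent on that entry.
Step 2: if that migration time exceeds a preset threshold, determine that the execution server has a blocked migration task.
The preset threshold may be set according to actual conditions, for example 10 minutes or 20 minutes.
It should be noted that if an entry's data block count is considerably larger or smaller than that of the other entries, the preset threshold can be determined from the ratio of its data amount to that of the other entries. Specifically, if the entries to be migrated include a first entry whose data block count is greater than a predetermined number, the preset threshold is adjusted according to the first entry's data block count and the predetermined number to obtain the preset threshold for the first entry. For example, if one entry (the first entry to be migrated) corresponds to 50 data blocks, every other entry corresponds to 10 data blocks, and the preset threshold for the other entries is 10 minutes, the preset threshold for that entry may be 50 minutes.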
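The 50-minute figure is consistent with scaling the threshold linearly by the ratio of block counts. The sketch below assumes that linear rule; the text only requires that the adjustment depend on the entry's block count and the predetermined number, so other rules are possible.

```python
def adjusted_threshold(base_threshold_min, entry_blocks, predetermined_blocks):
    # Entries at or below the predetermined size keep the base threshold.
    if entry_blocks <= predetermined_blocks:
        return base_threshold_min
    # Larger entries get proportionally more time (assumed linear scaling).
    return base_threshold_min * entry_blocks / predetermined_blocks

large = adjusted_threshold(10, entry_blocks=50, predetermined_blocks=10)
normal = adjusted_threshold(10, entry_blocks=10, predetermined_blocks=10)
print(large, normal)
```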
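Typically, an entry to be migrated may have replicas on different data storage servers. When metadata migration is performed on the entry, the metadata of all replicas must be migrated successfully. If the data storage server holding one replica is slow or unable to serve requests, the migration task is blocked and the efficiency of the overall migration drops.
For a blocked migration task, this embodiment switches the execution server to a fast migration mode (Fast Mode) to accelerate the blocked task, as follows:
Step 3: send an accelerated migration instruction to the execution server that has the blocked migration task, so that the execution server switches to fast migration mode to migrate the metadata corresponding to the current entry.
In implementation, after receiving the accelerated migration instruction from the master server, the execution server switches its working mode to fast migration mode. In this mode, among the replicas of the entry currently being migrated, it stops migrating the metadata of any replica whose migration time exceeds the preset threshold, generates a metadata recovery file for the entry, and migrates the metadata of the remaining replicas together with the metadata recovery file to the target metadata server group. The master server can then use the metadata of the remaining replicas and the metadata recovery file on the target metadata server to restore, on the target metadata server, the metadata of the replicas whose migration time exceeded the preset threshold.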
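The text does not define the contents of the metadata recovery file, so the following is purely a hypothetical sketch: it assumes the recovery file lists the skipped replicas and names a healthy replica whose metadata can be copied to restore them. The names `fast_migrate` and `recover` and the metadata layout are illustrative only.

```python
def fast_migrate(replicas, elapsed_s, threshold_s):
    """replicas: {replica_id: metadata dict}; elapsed_s: {replica_id: seconds spent}.
    Returns the replicas that are migrated and a recovery file for the rest."""
    slow = [r for r in replicas if elapsed_s[r] > threshold_s]
    healthy = {r: meta for r, meta in replicas.items() if r not in slow}
    # Hypothetical recovery file: which replicas were skipped and where to copy from.
    recovery_file = {"missing_replicas": slow,
                     "source_replica": next(iter(healthy), None)}
    return healthy, recovery_file

def recover(migrated, recovery_file):
    restored = dict(migrated)
    source = recovery_file["source_replica"]
    if recovery_file["missing_replicas"] and source is None:
        raise ValueError("no healthy replica to recover from")
    for replica_id in recovery_file["missing_replicas"]:
        # Assumption: replica metadata can be rebuilt from a healthy copy.
        restored[replica_id] = dict(migrated[source])
    return restored

replicas = {"r1": {"blocks": 4}, "r2": {"blocks": 4}, "r3": {"blocks": 4}}
elapsed = {"r1": 30, "r2": 45, "r3": 900}
migrated, recovery_file = fast_migrate(replicas, elapsed, threshold_s=600)
restored = recover(migrated, recovery_file)
print(sorted(migrated), recovery_file, sorted(restored))
```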
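Steps 1 to 3 above are illustrated below with a specific example:
As shown in Figure 3, an entry to be migrated in a migration task has 3 replicas. During metadata migration, the task is considered complete only when the metadata of all 3 replicas has been migrated, so the task's execution time depends on the slowest replica. The master server can therefore record the execution time of each migration task and, when it finds that a task has been running too long (for example more than 10 minutes), have the execution server switch modes and migrate the metadata of the entry in fast migration mode, thereby avoiding a long tail.
It should be noted that, to speed up metadata migration of the entries, migration tasks can be executed concurrently with multiple threads. In addition, because of scheduling, an execution server may receive a migration task that has already been completed. Before executing a migration task, the execution server can therefore first check whether the task has already been migrated, which saves the time of migrating it again.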
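Both points, concurrent execution and skipping tasks that are already done, can be sketched with Python's standard thread pool. The task layout, the `completed` set and the `migrate_entry` callback are assumptions made for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def run_task(task, completed, migrate_entry, workers=4):
    """task: {"id": ..., "entries": [...]}; completed: set of finished task IDs."""
    if task["id"] in completed:
        # The task was delivered again after it had already finished.
        return "skipped"
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Migrate the entries of the task concurrently and wait for all of them.
        list(pool.map(migrate_entry, task["entries"]))
    completed.add(task["id"])
    return "done"

migrated = []
completed = set()
task = {"id": "task-1", "entries": ["A", "B", "C", "D"]}
first = run_task(task, completed, migrated.append)
second = run_task(task, completed, migrated.append)
print(first, second, sorted(migrated))
```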
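This embodiment of the present application provides a metadata migration method: the entries to be migrated in the directory to be migrated are determined, one or more migration tasks are generated according to the number of data blocks corresponding to those entries, the entries are divided among the migration tasks, and the migration tasks are finally assigned to execution servers to perform metadata migration on the entries. Because the migration tasks are generated from the number of data blocks corresponding to the entries, the amount of data in each task is relatively balanced. This avoids an execution server being assigned a migration task with a very large amount of data, shortens the time the execution servers need to complete their tasks, and thus improves the overall efficiency of metadata migration.
Further, individual metadata server groups in a distributed file system may respond extremely slowly, or fail to serve requests at all, because of aging equipment or excessive load. In this embodiment, for a migration task in which the migration time already spent on the entry currently being migrated by the execution server exceeds the preset threshold, metadata migration of that task is stopped and the other migration tasks continue to be executed. After the other migration tasks complete, the stopped task is handled by accelerated migration in fast migration mode, which avoids a long tail and further improves the overall efficiency of metadata migration.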
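Embodiment 3
The above describes the metadata migration method provided by the embodiments of the present application. Based on the same idea, an embodiment of the present application further provides a metadata migration apparatus, as shown in Figure 4.
The metadata migration apparatus includes a to-be-migrated entry determination module 401, a migration task generation module 402 and a migration task assignment module 403, where:
the to-be-migrated entry determination module 401 is configured to determine the entries to be migrated in a directory to be migrated;
the migration task generation module 402 is configured to generate migration tasks according to the number of data blocks corresponding to the entries to be migrated, and to divide the entries to be migrated among the migration tasks;
the migration task assignment module 403 is configured to assign the migration tasks to execution servers so as to perform metadata migration on the entries to be migrated.
In this embodiment, as shown in Figure 5, the apparatus further includes:
a receiving module 404, configured to receive migration task status information fed back by the execution server;
a blocking handling module 405, configured to determine from the migration task status information whether the execution server has a blocked migration task and, if so, to control the execution server to perform accelerated migration.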
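In this embodiment, the blocking handling module 405 includes:
a migration time obtaining unit, configured to obtain, according to the execution server's migration task status information, the migration time already spent on the entry whose metadata the execution server is currently migrating;
a blocked task determination unit, configured to determine that the execution server has a blocked migration task if the migration time exceeds a preset threshold;
a sending unit, configured to send an accelerated migration instruction to the execution server that has the blocked migration task, so that the execution server switches to fast migration mode to migrate the metadata corresponding to the current entry.
In this embodiment, as shown in Figure 5, the apparatus further includes:
a threshold adjustment module 406, configured to, if the entries to be migrated include a first entry whose data block count is greater than a predetermined number, adjust the preset threshold according to the first entry's data block count and the predetermined number to obtain the preset threshold for the first entry.
In this embodiment, the to-be-migrated entry determination module 401 includes:
a traversal operation unit, configured to traverse from the root directory of the directory tree in which the directory to be migrated is located, recording the identifier of the currently traversed entry during the traversal;
a traversal stack rebuilding unit, configured to, if a local crash occurs during the traversal, obtain the most recently recorded entry identifier upon recovery in order to rebuild the traversal stack;
the traversal operation unit is further configured to continue traversing the directory tree according to the rebuilt traversal stack and determine the entries to be migrated contained in the directory to be migrated.
In this embodiment, as shown in Figure 6, the apparatus further includes:
an entry count obtaining module 407, configured to obtain, during the traversal, the number of entries contained in each subdirectory of the directory tree;
an entry allocation module 408, configured to, if the directory tree contains a target subdirectory whose entry count is less than a predetermined threshold, allocate the entries contained in the target subdirectory to the same migration task.
In this embodiment, the migration task generation module 402 is configured to generate migration tasks according to the number of data blocks corresponding to each entry to be migrated and determine the data block count threshold that a migration task can hold; and, according to a preset allocation policy, to allocate entries to be migrated to the migration task within that threshold until all entries to be migrated have been allocated;
where the preset allocation policy is: for the current migration task, if the unallocated entries to be migrated include an entry whose data block count is less than the number of free data blocks of the current task, select such an entry from the unallocated entries and allocate it to the current task.
In this embodiment, the migration task assignment module 403 is configured to assign the migration tasks to the execution servers one at a time, keeping the number of migration tasks each execution server currently holds below a preset value.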
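This embodiment of the present application provides a metadata migration apparatus: the entries to be migrated in the directory to be migrated are determined, migration tasks are generated according to the number of data blocks corresponding to those entries, the entries are divided among the migration tasks, and the migration tasks are finally assigned to execution servers to perform metadata migration on the entries. Because the migration tasks are generated from the number of data blocks corresponding to the entries, the amount of data in each task is relatively balanced. This avoids an execution server being assigned a migration task with a very large amount of data, shortens the time the execution servers need to complete their tasks, and thus improves the overall efficiency of metadata migration.
Further, individual metadata server groups in a distributed file system may respond extremely slowly, or fail to serve requests at all, because of aging equipment or excessive load. In this embodiment, for a migration task in which the migration time already spent on the entry currently being migrated by the execution server exceeds the preset threshold, metadata migration of that task is stopped and the other migration tasks continue to be executed. After the other migration tasks complete, the stopped task is handled by accelerated migration in fast migration mode, which avoids a long tail and further improves the overall efficiency of metadata migration.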
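Embodiment 4
The above describes the metadata migration apparatus provided by the embodiments of the present application. Based on the same idea, an embodiment of the present application further provides a metadata migration system, as shown in Figure 7.
The metadata migration system includes a metadata migration apparatus 701, an execution server 702 and multiple metadata server groups 703, where:
the execution server 702 is configured to receive the migration tasks assigned by the metadata migration apparatus 701 and, according to the migration tasks, perform metadata migration among the metadata server groups 703.
The metadata migration apparatus 701 may be the master server of Embodiment 1 or Embodiment 2 above. A metadata server group 703 may consist of one server or of multiple servers.
In this embodiment, the execution server 702 is further configured to switch to fast migration mode after receiving an accelerated migration instruction sent by the metadata migration apparatus 701; and, in fast migration mode, among the replicas of the entry currently being migrated, to stop migrating the metadata of any replica whose migration time exceeds the preset threshold, generate a metadata recovery file for the entry currently being migrated, and migrate the metadata of the remaining replicas together with the metadata recovery file to the target metadata server group;
the metadata migration apparatus 701 is further configured to restore, on the target metadata server, the metadata of the replicas whose migration time exceeded the preset threshold, according to the metadata of the remaining replicas and the metadata recovery file on the target metadata server.
This embodiment of the present application provides a metadata migration system: the entries to be migrated in the directory to be migrated are determined, migration tasks are generated according to the number of data blocks corresponding to those entries, the entries are divided among the migration tasks, and the migration tasks are finally assigned to execution servers to perform metadata migration on the entries. Because the migration tasks are generated from the number of data blocks corresponding to the entries, the amount of data in each task is relatively balanced. This avoids an execution server being assigned a migration task with a very large amount of data, shortens the time the execution servers need to complete their tasks, and thus improves the overall efficiency of metadata migration.
Further, individual metadata server groups in a distributed file system may respond extremely slowly, or fail to serve requests at all, because of aging equipment or excessive load. In this embodiment, for a migration task in which the migration time already spent on the entry currently being migrated by the execution server exceeds the preset threshold, metadata migration of that task is stopped and the other migration tasks continue to be executed. After the other migration tasks complete, the stopped task is handled by accelerated migration in fast migration mode, which avoids a long tail and further improves the overall efficiency of metadata migration.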
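Embodiment 5
Based on the same idea, an embodiment of the present application further provides a metadata migration device, as shown in Figure 8. The device may be the master server (also called the Master server) that controls metadata migration in the embodiments above.
Metadata migration devices may differ considerably in configuration or performance. A device may include one or more processors 801 and a memory 802, and the memory 802 may store one or more applications or data. The memory 802 may be transient or persistent storage. An application stored in the memory 802 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the metadata migration device. Further, the processor 801 may be configured to communicate with the memory 802 and execute, on the metadata migration device, the series of computer-executable instructions in the memory 802. The device may also include one or more power supplies 803, one or more wired or wireless network interfaces 804, one or more input/output interfaces 805, one or more keyboards 806, and so on.
Specifically, in this embodiment, the metadata migration device includes a memory and one or more programs. The one or more programs are stored in the memory and may include one or more modules, each of which may include a series of computer-executable instructions for the metadata migration device. The one or more programs are configured to be executed by one or more processors and contain computer-executable instructions for: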
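determining the entries to be migrated in a directory to be migrated;
generating migration tasks according to the number of data blocks corresponding to the entries to be migrated, and dividing the entries to be migrated among the migration tasks;
assigning the migration tasks to execution servers so as to migrate the metadata corresponding to the entries to be migrated.
Optionally, the computer-executable instructions, when executed, may further cause the processor to:
after the migration tasks are assigned to the execution servers, receive migration task status information fed back by the execution server;
determine from the migration task status information whether the execution server has a blocked migration task and, if so, control the execution server to perform accelerated migration.
Optionally, the computer-executable instructions, when executed, may further cause the processor to:
obtain, according to the execution server's migration task status information, the migration time already spent on the entry whose metadata the execution server is currently migrating;
determine that the execution server has a blocked migration task if the migration time exceeds a preset threshold;
correspondingly, controlling the execution server to perform accelerated migration includes:
sending an accelerated migration instruction to the execution server that has the blocked migration task, so that the execution server switches to fast migration mode to migrate the metadata corresponding to the current entry.
Optionally, the computer-executable instructions, when executed, may further cause the processor to:
if the entries to be migrated include a first entry whose data block count is greater than a predetermined number, adjust the preset threshold according to the first entry's data block count and the predetermined number to obtain the preset threshold for the first entry.
Optionally, the computer-executable instructions, when executed, may further cause the processor to:
traverse from the root directory of the directory tree in which the directory to be migrated is located, recording the identifier of the currently traversed entry during the traversal;
if a local crash occurs during the traversal, obtain the most recently recorded entry identifier upon recovery in order to rebuild the traversal stack;
continue traversing the directory tree according to the rebuilt traversal stack and determine the entries to be migrated contained in the directory to be migrated.
Optionally, the computer-executable instructions, when executed, may further cause the processor to: after traversing from the root directory of the directory tree in which the directory to be migrated is located and recording the identifier of the currently traversed entry, obtain, during the traversal, the number of entries contained in each subdirectory of the directory tree;
if the directory tree contains a target subdirectory whose entry count is less than a predetermined threshold, allocate the entries contained in the target subdirectory to the same migration task.
Optionally, the computer-executable instructions, when executed, may further cause the processor to:
generate migration tasks according to the number of data blocks corresponding to each entry to be migrated, and determine the data block count threshold that a migration task can hold;
according to a preset allocation policy, allocate entries to be migrated to the migration task within that threshold until all entries to be migrated have been allocated;
where the preset allocation policy is: for the current migration task, if the unallocated entries to be migrated include an entry whose data block count is less than the number of free data blocks of the current task, select such an entry from the unallocated entries and allocate it to the current task.
Optionally, the computer-executable instructions, when executed, may further cause the processor to:
assign the migration tasks to the execution servers one at a time, keeping the number of migration tasks each execution server currently holds below a preset value.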
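This embodiment of the present application provides a metadata migration device: the entries to be migrated in the directory to be migrated are determined, migration tasks are generated according to the number of data blocks corresponding to those entries, the entries are divided among the migration tasks, and the migration tasks are finally assigned to execution servers to perform metadata migration on the entries. Because the migration tasks are generated from the number of data blocks corresponding to the entries, the amount of data in each task is relatively balanced. This avoids an execution server being assigned a migration task with a very large amount of data, shortens the time the execution servers need to complete their tasks, and thus improves the overall efficiency of metadata migration.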
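In the 1990s, an improvement to a technology could be clearly classified as an improvement in hardware (for example, to circuit structures such as diodes, transistors and switches) or an improvement in software (to a method flow). As technology has developed, however, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented with a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of making integrated circuit chips by hand, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled must likewise be written in a particular programming language, called a hardware description language (HDL). There is not one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language), of which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most widely used. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can easily be obtained simply by lightly programming the method flow in one of these hardware description languages and programming it into an integrated circuit.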
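A controller may be implemented in any suitable way. For example, a controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by that (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicon Labs C8051F320. A memory controller may also be implemented as part of the memory's control logic. Those skilled in the art also know that, besides implementing a controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, ASICs, programmable logic controllers, embedded microcontrollers and the like. Such a controller can therefore be regarded as a hardware component, and the means included in it for implementing various functions can be regarded as structures within the hardware component. Or indeed, the means for implementing various functions can be regarded both as software modules implementing a method and as structures within a hardware component.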
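The systems, apparatuses, modules or units described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementing device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above apparatus is described as divided by function into various units. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system or a computer program product. The present application may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to its embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.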
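In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
Memory may include non-permanent storage in computer-readable media, random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.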
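The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The present application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described progressively; for parts that are the same or similar among the embodiments, reference may be made from one to another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, see the corresponding parts of the method embodiment.
The above are merely embodiments of the present application and are not intended to limit it. Those skilled in the art may make various modifications and changes to the present application. Any modification, equivalent replacement or improvement made within the spirit and principles of the present application shall fall within the scope of its claims.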

Claims (19)

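  1. A metadata migration method, characterized in that the method comprises:
    determining the entries to be migrated in a directory to be migrated;
    generating migration tasks according to the number of data blocks corresponding to the entries to be migrated, and dividing the entries to be migrated among the migration tasks;
    assigning the migration tasks to execution servers so as to migrate the metadata corresponding to the entries to be migrated.
  2. The method according to claim 1, characterized in that, after the migration tasks are assigned to the execution servers, the method further comprises:
    receiving migration task status information fed back by the execution server;
    determining from the migration task status information whether the execution server has a blocked migration task and, if so, controlling the execution server to perform accelerated migration.
  3. The method according to claim 2, characterized in that determining from the migration task status information whether the execution server has a blocked migration task comprises:
    obtaining, according to the execution server's migration task status information, the migration time already spent on the entry whose metadata the execution server is currently migrating;
    determining that the execution server has a blocked migration task if the migration time exceeds a preset threshold;
    correspondingly, controlling the execution server to perform accelerated migration comprises:
    sending an accelerated migration instruction to the execution server that has the blocked migration task, so that the execution server switches to fast migration mode to migrate the metadata corresponding to the current entry.
  4. The method according to claim 3, characterized in that the method further comprises:
    if the entries to be migrated include a first entry whose data block count is greater than a predetermined number, adjusting the preset threshold according to the first entry's data block count and the predetermined number to obtain the preset threshold for the first entry.
  5. The method according to claim 1, characterized in that determining the entries to be migrated contained in the directory to be migrated comprises:
    traversing from the root directory of the directory tree in which the directory to be migrated is located, recording the identifier of the currently traversed entry during the traversal;
    if a local crash occurs during the traversal, obtaining the most recently recorded entry identifier upon recovery in order to rebuild the traversal stack;
    continuing to traverse the directory tree according to the rebuilt traversal stack, and determining the entries to be migrated contained in the directory to be migrated.
  6. The method according to claim 5, characterized in that, after traversing from the root directory of the directory tree in which the directory to be migrated is located and recording the identifier of the currently traversed entry, the method further comprises:
    obtaining, during the traversal, the number of entries contained in each subdirectory of the directory tree;
    if the directory tree contains a target subdirectory whose entry count is less than a predetermined threshold, allocating the entries contained in the target subdirectory to the same migration task.
  7. The method according to claim 1, characterized in that generating migration tasks according to the number of data blocks corresponding to the entries to be migrated and dividing the entries to be migrated among the migration tasks comprises:
    generating migration tasks according to the number of data blocks corresponding to each entry to be migrated, and determining the data block count threshold that a migration task can hold;
    according to a preset allocation policy, allocating entries to be migrated to the migration task within that threshold until all entries to be migrated have been allocated;
    wherein the preset allocation policy is: for the current migration task, if the unallocated entries to be migrated include an entry whose data block count is less than the number of free data blocks of the current task, selecting such an entry from the unallocated entries and allocating it to the current task.
  8. The method according to claim 1, characterized in that assigning the migration tasks to execution servers comprises:
    assigning the migration tasks to the execution servers one at a time, keeping the number of migration tasks each execution server currently holds below a preset value.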
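  9. A metadata migration apparatus, characterized in that the apparatus comprises:
    a to-be-migrated entry determination module, configured to determine the entries to be migrated in a directory to be migrated;
    a migration task generation module, configured to generate migration tasks according to the number of data blocks corresponding to the entries to be migrated, and to divide the entries to be migrated among the migration tasks;
    a migration task assignment module, configured to assign the migration tasks to execution servers so as to perform metadata migration on the entries to be migrated.
  10. The apparatus according to claim 9, characterized in that the apparatus further comprises:
    a receiving module, configured to receive migration task status information fed back by the execution server;
    a blocking handling module, configured to determine from the migration task status information whether the execution server has a blocked migration task and, if so, to control the execution server to perform accelerated migration.
  11. The apparatus according to claim 10, characterized in that the blocking handling module comprises:
    a migration time obtaining unit, configured to obtain, according to the execution server's migration task status information, the migration time already spent on the entry whose metadata the execution server is currently migrating;
    a blocked task determination unit, configured to determine that the execution server has a blocked migration task if the migration time exceeds a preset threshold;
    a sending unit, configured to send an accelerated migration instruction to the execution server that has the blocked migration task, so that the execution server switches to fast migration mode to migrate the metadata corresponding to the current entry.
  12. The apparatus according to claim 11, characterized in that the apparatus further comprises:
    a threshold adjustment module, configured to, if the entries to be migrated include a first entry whose data block count is greater than a predetermined number, adjust the preset threshold according to the first entry's data block count and the predetermined number to obtain the preset threshold for the first entry.
  13. The apparatus according to claim 9, characterized in that the to-be-migrated entry determination module comprises:
    a traversal operation unit, configured to traverse from the root directory of the directory tree in which the directory to be migrated is located, recording the identifier of the currently traversed entry during the traversal;
    a traversal stack rebuilding unit, configured to, if a local crash occurs during the traversal, obtain the most recently recorded entry identifier upon recovery in order to rebuild the traversal stack;
    the traversal operation unit being further configured to continue traversing the directory tree according to the rebuilt traversal stack and determine the entries to be migrated contained in the directory to be migrated.
  14. The apparatus according to claim 13, characterized in that the apparatus further comprises:
    an entry count obtaining module, configured to obtain, during the traversal, the number of entries contained in each subdirectory of the directory tree;
    an entry allocation module, configured to, if the directory tree contains a target subdirectory whose entry count is less than a predetermined threshold, allocate the entries contained in the target subdirectory to the same migration task.
  15. The apparatus according to claim 9, characterized in that the migration task generation module is configured to generate migration tasks according to the number of data blocks corresponding to each entry to be migrated and determine the data block count threshold that a migration task can hold; and, according to a preset allocation policy, to allocate entries to be migrated to the migration task within that threshold until all entries to be migrated have been allocated;
    wherein the preset allocation policy is: for the current migration task, if the unallocated entries to be migrated include an entry whose data block count is less than the number of free data blocks of the current task, selecting such an entry from the unallocated entries and allocating it to the current task.
  16. The apparatus according to claim 9, characterized in that the migration task assignment module is configured to assign the migration tasks to the execution servers one at a time, keeping the number of migration tasks each execution server currently holds below a preset value.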
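  17. A metadata migration system, characterized by comprising the metadata migration apparatus according to claims 9-16, an execution server and multiple metadata server groups, wherein:
    the execution server is configured to receive the migration tasks assigned by the metadata migration apparatus and, according to the migration tasks, perform metadata migration among the metadata server groups.
  18. The system according to claim 17, characterized in that
    the execution server is further configured to switch to fast migration mode after receiving an accelerated migration instruction sent by the metadata migration apparatus; and, in fast migration mode, among the replicas of the entry currently being migrated, to stop migrating the metadata of any replica whose migration time exceeds the preset threshold, generate a metadata recovery file for the entry currently being migrated, and migrate the metadata of the remaining replicas together with the metadata recovery file to the target metadata server group;
    the metadata migration apparatus is further configured to restore, on the target metadata server, the metadata of the replicas whose migration time exceeded the preset threshold, according to the metadata of the remaining replicas and the metadata recovery file on the target metadata server.
  19. A metadata migration device, characterized in that the metadata migration device comprises:
    a processor; and
    a memory arranged to store computer-executable instructions that, when executed, cause the processor to:
    determine the entries to be migrated in a directory to be migrated;
    generate migration tasks according to the number of data blocks corresponding to the entries to be migrated, and divide the entries to be migrated among the migration tasks;
    assign the migration tasks to execution servers so as to migrate the metadata corresponding to the entries to be migrated.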
PCT/CN2017/115190 2016-12-22 2017-12-08 Metadata migration method, apparatus, system and device WO2018113533A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611199032.4A CN108228672B (zh) 2016-12-22 2016-12-22 Metadata migration method, apparatus, system and device
CN201611199032.4 2016-12-22

Publications (1)

Publication Number Publication Date
WO2018113533A1 true WO2018113533A1 (zh) 2018-06-28

Family

ID=62624475

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/115190 WO2018113533A1 (zh) 2016-12-22 2017-12-08 Metadata migration method, apparatus, system and device

Country Status (2)

Country Link
CN (1) CN108228672B (zh)
WO (1) WO2018113533A1 (zh)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110874344B (zh) * 2018-08-10 2023-05-09 阿里巴巴集团控股有限公司 数据迁移方法、装置及电子设备
CN109343793B (zh) * 2018-09-11 2021-09-07 创新先进技术有限公司 数据迁移方法及装置
CN110471774A (zh) * 2019-06-28 2019-11-19 苏宁云计算有限公司 一种基于统一任务调度的数据处理方法及装置
CN110555014B (zh) * 2019-09-06 2022-04-15 中国联合网络通信集团有限公司 一种数据迁移方法和系统、电子设备、存储介质
CN110928860B (zh) * 2019-11-27 2023-06-20 中国银行股份有限公司 数据迁移方法和装置
CN111104404B (zh) * 2019-12-04 2021-10-01 星辰天合(北京)数据科技有限公司 基于分布式对象的数据存储方法及装置
CN113660298A (zh) * 2020-05-12 2021-11-16 北京沃东天骏信息技术有限公司 一种数据迁移方法和装置
CN112347080B (zh) * 2020-11-11 2024-08-16 金蝶云科技有限公司 一种数据迁移方法及相关装置
CN113778982A (zh) * 2021-03-09 2021-12-10 北京沃东天骏信息技术有限公司 一种数据迁移方法和装置
CN113225576B (zh) * 2021-04-30 2023-03-21 广州虎牙科技有限公司 基于直播平台边缘计算场景的服务迁移系统和方法
CN117155759B (zh) * 2023-10-27 2024-01-05 腾讯科技(深圳)有限公司 数据处理方法、装置、计算机设备及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060136691A1 (en) * 2004-12-20 2006-06-22 Brown Michael F Method to perform parallel data migration in a clustered storage environment
CN102857577A (zh) * 2012-09-24 2013-01-02 北京联创信安科技有限公司 一种集群存储自动负载均衡的系统及方法
CN103279568A (zh) * 2013-06-18 2013-09-04 无锡紫光存储系统有限公司 一种元数据管理系统及方法
CN104731888A (zh) * 2015-03-12 2015-06-24 北京奇虎科技有限公司 一种数据迁移的方法、装置和系统
CN105975331A (zh) * 2016-04-26 2016-09-28 浪潮(北京)电子信息产业有限公司 一种数据并行处理方法及装置
CN106020959A (zh) * 2016-05-24 2016-10-12 郑州悉知信息科技股份有限公司 一种数据迁移方法和装置

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009043016A (ja) * 2007-08-08 2009-02-26 Hitachi Ltd ストレージシステム及びストレージシステムのアクセス均等化方法
WO2012056494A2 (en) * 2010-10-26 2012-05-03 Hitachi, Ltd. Storage system and its operation method
CN102495906A (zh) * 2011-12-23 2012-06-13 天津神舟通用数据技术有限公司 一种实现断点续传的增量式数据迁移方法
US10102211B2 (en) * 2014-04-18 2018-10-16 Oracle International Corporation Systems and methods for multi-threaded shadow migration
CN104468521B (zh) * 2014-11-13 2017-12-29 华为技术有限公司 在线迁移方法、装置和系统

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109901925A (zh) * 2018-11-06 2019-06-18 阿里巴巴集团控股有限公司 一种任务处理方法及系统
CN109901925B (zh) * 2018-11-06 2023-08-25 创新先进技术有限公司 一种任务处理方法及系统
CN112559118A (zh) * 2019-09-25 2021-03-26 北京国双科技有限公司 应用数据迁移方法、装置、电子设备及存储介质
CN113157427B (zh) * 2020-01-07 2024-03-15 中科寒武纪科技股份有限公司 任务迁移的方法、装置、计算机设备及可读存储介质
CN113157427A (zh) * 2020-01-07 2021-07-23 中科寒武纪科技股份有限公司 任务迁移的方法、装置、计算机设备及可读存储介质
CN112162963A (zh) * 2020-10-15 2021-01-01 苏州交驰人工智能研究院有限公司 一种数据同步方法、装置、计算机设备及存储介质
CN113608876A (zh) * 2021-08-12 2021-11-05 中国科学技术大学 基于负载类型感知的分布式文件系统元数据负载均衡方法
CN113608876B (zh) * 2021-08-12 2024-03-29 中国科学技术大学 基于负载类型感知的分布式文件系统元数据负载均衡方法
CN116089358A (zh) * 2022-06-02 2023-05-09 荣耀终端有限公司 数据迁移方法及电子设备
CN116089358B (zh) * 2022-06-02 2023-11-24 荣耀终端有限公司 数据迁移方法及电子设备
CN115426251B (zh) * 2022-08-30 2024-02-13 山东海量信息技术研究院 一种云主机的容灾方法、装置及介质
CN115426251A (zh) * 2022-08-30 2022-12-02 山东海量信息技术研究院 一种云主机的容灾方法、装置及介质
CN118069066A (zh) * 2024-04-12 2024-05-24 四川华鲲振宇智能科技有限责任公司 一种提升存储系统性能的存储方法

Also Published As

Publication number Publication date
CN108228672B (zh) 2022-05-03
CN108228672A (zh) 2018-06-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17884484

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17884484

Country of ref document: EP

Kind code of ref document: A1