WO2023241115A1 - Data migration method and related apparatus - Google Patents

Data migration method and related apparatus

Info

Publication number
WO2023241115A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
metadata
storage device
storage
data
Prior art date
Application number
PCT/CN2023/080091
Other languages
English (en)
French (fr)
Inventor
苏毅
兰龙文
周文
程桢
方维
胡刚
肖尧文
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023241115A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers

Definitions

  • This application relates to the fields of information technology (IT) and storage technology, and in particular to data migration methods and related devices.
  • IT: information technology
  • storage device A has a low capacity but high storage performance
  • storage device B has a high capacity but poor storage performance. Therefore, when the business's access performance requirements for certain data stored on device A (for example, DATA1) decrease, DATA1 can be moved to device B through a migration operation to optimize the overall storage cost.
  • Current data migration methods mainly include migration methods based on intermediate devices and migration methods based on replication technology.
  • the first method requires the introduction of an external migration controller, which reads data from the source device and writes it to the target device.
  • the second method is to establish a bidirectional channel between the source device and the destination device, with the source device controlling and executing the process of writing data to the destination device. It can be seen that the above two methods need to realize mutual awareness of devices before migration.
  • the migrator needs to establish connection and access security control with the source device and destination device.
  • connection and access security control need to be established between the source device and the destination device. Because the devices must be mutually aware and subject to security control, the data migration process is complicated, which lowers the efficiency of data migration and easily affects users' normal use of the data.
  • the embodiments of the present application provide a data migration method and device, which can realize state-based data migration, improve the efficiency of data migration, and improve the user's convenience in data use and management.
  • embodiments of the present application provide a data migration method, which method includes:
  • the data of the first file is stored on the first storage device, and the migration task for the first file indicates migrating the data of the first file from the first storage device to the second storage device;
  • the above method is implemented by a migration scheduling device.
  • the embodiment of the present application triggers the migration operation for the file by changing the metadata of the file.
  • when the file belongs to the second storage device and the data of the file is still stored on the first storage device, the first file is migrated from the first storage device to the second storage device.
  • This method can trigger data migration based on file status (attribute status such as file ownership, file storage layout, etc.).
  • as long as a device has the ability to change the metadata of a file, it can change the file status and trigger the migration process.
  • avoiding the security control process between devices improves the efficiency of data migration and makes data use and management more convenient for users. Especially for businesses that span multiple storage devices or multiple data centers, state-based data migration can further decouple the functions of each device, greatly improving the flexibility and scalability of the business system.
  • the storage location of the file data is still indicated by the storage layout information. Therefore, the above data migration method does not affect the normal use of the file data by users and improves the stability of the business system.
  • the metadata of the first file includes ownership information of the first file and storage layout information of the first file.
  • the storage device indicated by the ownership information of the first file is the first storage device, and the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and do not include the second storage device.
  • the ownership information of the first file is the identification of the first storage device
  • the storage layout information of the first file includes the identification of the first storage device and does not include the identification of the second storage device.
  • the first change to the metadata of the first file includes:
  • the migration task for the first file includes an identifier of the first file, an identifier of the first storage device, and an identifier of the second storage device.
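The state-based trigger described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the names `FileMetadata`, `MigrationTask`, `apply_first_change`, and `needs_migration`, and the device identifiers, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FileMetadata:
    """Hypothetical per-file metadata: ownership plus storage layout."""
    file_id: str
    owner: str                                # ownership information: ID of the owning device
    layout: set = field(default_factory=set)  # storage layout: IDs of devices holding the data


@dataclass(frozen=True)
class MigrationTask:
    """Migration task: file identifier, source device, destination device."""
    file_id: str
    source: str
    destination: str


def apply_first_change(meta: FileMetadata, task: MigrationTask) -> None:
    """The scheduler only rewrites the ownership; no file data is moved here."""
    meta.owner = task.destination


def needs_migration(meta: FileMetadata) -> bool:
    """True when the owning device does not yet hold the file's data."""
    return meta.owner not in meta.layout


meta = FileMetadata("file-1", owner="dev-A", layout={"dev-A"})
task = MigrationTask("file-1", source="dev-A", destination="dev-B")
before = needs_migration(meta)
apply_first_change(meta, task)
after = needs_migration(meta)
print(before, after)  # False True
```

Because the layout is left untouched by the ownership change, readers can still locate the data on `dev-A` while the migration is pending.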
  • the first file belongs to the target file system
  • the metadata of the first file is included in the metadata of the target file system
  • the metadata of the target file system is synchronized across multiple devices.
  • synchronization between multiple devices means that the metadata can be modified by any one of the multiple devices, that the modified content can be learned by the multiple devices, and that the metadata of the target file system learned by the multiple devices is consistent.
  • the plurality of devices include a first computing device and a second computing device.
  • the first computing device is located in a first storage device or is connected to the first storage device.
  • the second computing device is located in a second storage device or is connected to the second storage device.
  • because the metadata of the target file system is synchronized among multiple devices, when the metadata of the first file is changed, the multiple devices can read the changes in the metadata of the file system, so that the first computing device and the second computing device can trigger a migration operation based on the changes in metadata.
  • the source device can synchronize the metadata of the file system and learn about changes in the ownership information of the first file, thereby triggering a push operation on the data of the first file.
  • the destination device can synchronize the metadata of the file system and learn about changes in the ownership information of the first file, thereby triggering a pull operation for the data of the first file.
  • the metadata of the target file system is synchronized between multiple devices, so that the multiple devices can use consistent metadata to represent the hierarchical structure and file (and/or directory) information of the target file system. This makes it easy to unify and interoperate file systems to a certain extent and facilitates the management of file systems.
  • the plurality of devices mentioned above may also include a migration scheduling device.
  • the method further includes:
  • the first notification indicates what changes have occurred in the metadata of the first file.
  • the first notification may include the content of the first change, such as the identification of the first file and the attributes (or values of attributes) of the file changed by the first change.
  • the first computing device and the second computing device can obtain the metadata of the first file after the first change based on the metadata before the first change and the content of the first change, and perform the migration task according to the metadata of the first file after the first change.
  • the first notification may include the first changed metadata of the first file.
  • the first computing device and the second computing device may perform the migration task according to the first changed metadata of the first file.
  • the first notification indicates that the first change has occurred, but does not include the specific content of the change and the metadata of the first file after the first change.
  • the first computing device and/or the second computing device may, in response to the first notification, request the first changed metadata of the first file from the migration scheduling device, and perform the migration task according to the first changed metadata of the first file provided by the migration scheduling device.
  • the method further includes:
  • progress monitoring is implemented through file metadata changes, eliminating the need for data interaction with the device performing the migration, thus achieving decoupling between progress tracking and migration execution control.
  • metadata of the target file system is stored locally on multiple devices.
  • when a device makes changes to the metadata of the target file system, the device notifies the other devices that store the metadata of the target file system of the change, and the other devices change their locally stored metadata of the target file system accordingly based on the notification, thereby achieving synchronization of the target file system metadata across multiple devices.
  • the migration scheduling device, the first computing device, and the second computing device all store metadata of the first file system locally.
  • the migration scheduling device may send a first notification, and the first notification indicates that the metadata of the first file has undergone a first change.
  • the first computing device and the second computing device correspondingly change the locally stored metadata of the target file system based on the first notification, thereby achieving synchronization of the metadata of the target file system on multiple devices.
  • the metadata of the target file system is stored in a global metadata service.
  • the global metadata service can store metadata of the target file system.
  • the global metadata service can support access and update of metadata of the target file system. Specifically, when a certain device changes the metadata of the target file system, the change is provided to the global metadata service, and multiple devices can access the changed metadata of the target file system from the global metadata service. Thus, the metadata of the target file system can be synchronized on multiple devices.
  • the metadata of the file system is managed, accessed and updated through the global metadata service, and multiple devices read or write metadata according to the format of the metadata in the global metadata service. This unifies the way file metadata is represented and shields the differences in metadata management and access control between heterogeneous storage devices, which not only makes data use and management more convenient for users but also improves the scalability and flexibility of the system.
  • when a new storage device needs to share the metadata of the file system, it can join the sharing by using the global metadata service; similarly, when a storage device exits the sharing, it can do so by disconnecting its interaction with the global metadata service.
  • the above implementation makes business expansion and contraction more flexible and easier to implement.
  • the method before obtaining the second modified metadata of the first file, the method further includes:
  • a second notification is received indicating that metadata of the first file has changed.
  • the migration scheduling device, the first computing device, and the second computing device all maintain metadata of the target file system; the obtaining of the second changed metadata of the first file includes:
  • the second notification includes the second changed content, or the second notification includes the second changed metadata of the first file.
  • the second notification contains the content of the second change.
  • the content of the second modification may include the identification of the first file and the attributes (or values of attributes) of the file changed by the second modification.
  • the migration scheduling device may obtain the metadata after the second change of the first file based on the metadata before the second change of the first file and the content of the second change.
  • the second notification includes second changed metadata of the first file.
  • the migration scheduling device can obtain the second modified metadata of the first file according to the second notification.
  • the notification (for example, the first notification, or the second notification, etc.) may be sent in the form of a message queue.
  • the sender writes messages to the message queue, and the receiver receives notifications by reading the message queue, thereby further reducing the coupling between different functional modules.
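The message-queue form of notification can be illustrated with Python's standard `queue` module. This is only a sketch: the `Notification` type, the `publish` and `drain` helpers, and the assumption of one in-process queue per receiver are hypothetical and are not taken from the application, which would use a queue shared between devices.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    """Hypothetical notification carrying the content of a metadata change."""
    file_id: str
    changed: dict  # changed attributes and their new values


def publish(q: queue.Queue, file_id: str, changed: dict) -> None:
    """Sender side: write the change to the queue and return immediately."""
    q.put(Notification(file_id, dict(changed)))


def drain(q: queue.Queue, local_metadata: dict) -> int:
    """Receiver side: apply every pending change to the local metadata copy."""
    applied = 0
    while True:
        try:
            note = q.get_nowait()
        except queue.Empty:
            return applied
        local_metadata.setdefault(note.file_id, {}).update(note.changed)
        applied += 1


notifications = queue.Queue()
local = {"file-1": {"owner": "dev-A", "layout": ["dev-A"]}}
publish(notifications, "file-1", {"owner": "dev-B"})
count = drain(notifications, local)
```

The sender never calls the receiver directly, which is the decoupling the text refers to: either side can be replaced as long as the message format is kept.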
  • the migration scheduling device, the first computing device, and the second computing device all maintain metadata of the target file system
  • the obtaining the second modified metadata of the first file includes:
  • the metadata of the target file system is stored in a global metadata service and synchronized among the multiple devices through the global metadata service;
  • the obtaining the second modified metadata of the first file includes:
  • the global metadata service provides a service interface
  • the device can call the service interface to access and update metadata.
  • the service interface is a communication interface, such as an application programming interface (API), which can be used for data interaction and provision of services between different functional modules.
  • API: application programming interface
  • the caller and the implementer can be decoupled.
  • the device calling the service interface can provide relevant data according to the requirements of the service interface, and the global metadata service can obtain the relevant data through the service interface and implement the corresponding functions. This not only improves the efficiency of accessing and updating metadata, but also improves the scalability and flexibility of the system.
  • the metadata of the target file system is stored in a global metadata service and synchronized among the multiple devices through the global metadata service;
  • the first change to the metadata of the first file includes:
  • the first change is implemented through a service interface provided by the global metadata service.
  • the global metadata service is located on any one of the multiple devices, or on any device outside the multiple devices.
  • the global metadata service is located on a third computing device.
  • the third computing device may be the same computing device as the first computing device or the second computing device, or may be another computing device other than the two.
  • the service interface of the global metadata service may be provided by the third computing device to the migration scheduling device.
  • the third computing device may provide another interface (referred to as a first interface for ease of differentiation) to the migration scheduling apparatus, and by calling the first interface, the function of calling the service interface of the global metadata service can be implemented.
  • the metadata of the target file system is in a table structure and the metadata can be modified.
  • the tabular structure is a data structure containing rows and columns. Each row (or each column) contains multiple values, and each value corresponds to an attribute.
  • Metadata with a tabular structure can add a row (or column) of metadata, delete a row (or column) of metadata, or modify existing attribute values in the metadata. That is, the first change can be implemented by modifying the metadata of the target file system.
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records, and each metadata record includes an identifier of a node and attributes of the node, where the node is a file or directory, and the attributes of the node include the node's ownership information and the node's storage layout information.
  • the streaming structure is a data structure containing multiple pieces of information, and each piece of information is a metadata record.
  • the streaming structure has the following characteristics: read-only, append-only, and ordered. "Read-only" means that the values of the records in the streaming structure can only be read and cannot be modified; "append-only" means that new records can only be appended and existing records cannot be deleted (or modified), although multiple records belonging to the same node can be merged into one record; "ordered" means that the records in the streaming structure have a logical order, and appended records are added to the end of the streaming structure.
  • the first change to the metadata of the first file includes:
  • a first metadata record is appended to the end of the metadata of the target file system.
  • the first metadata record includes the identifier of the first file and the changed ownership information of the first file.
  • the changed ownership information of the first file indicates that the storage device to which the first file belongs is the second storage device.
  • the above embodiment can perform a first change, and the first change is implemented by adding a metadata record in the metadata stream.
  • other devices can learn of changes in the metadata of the target file system by obtaining the changes in the metadata stream (the added records), and can update the local file hierarchy or node attributes accordingly, which facilitates synchronization of file views across multiple devices.
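The streaming structure can be pictured as an append-only log whose records are folded, in order, into a per-node view. The sketch below is an assumption-laden illustration: the `MetadataStream` class, its method names, and the attribute keys `owner` and `layout` are hypothetical.

```python
class MetadataStream:
    """Hypothetical append-only, ordered stream of metadata records."""

    def __init__(self):
        self._records = []  # ordered list of (node_id, attributes)

    def append(self, node_id, **attrs):
        """Append a new record at the end; existing records are never modified."""
        self._records.append((node_id, dict(attrs)))
        return len(self._records) - 1  # offset of the appended record

    def read_from(self, offset=0):
        """Return copies of the records added at or after the given offset."""
        return [(n, dict(a)) for n, a in self._records[offset:]]

    def view(self):
        """Merge the records of each node in order; later records win."""
        nodes = {}
        for node_id, attrs in self._records:
            nodes.setdefault(node_id, {}).update(attrs)
        return nodes


stream = MetadataStream()
stream.append("file-1", owner="dev-A", layout=("dev-A",))
offset = stream.append("file-1", owner="dev-B")  # first change: ownership only
delta = stream.read_from(offset)
current = stream.view()
```

A device that remembers the last offset it processed can call `read_from` to obtain only the added records and update its local file view from them.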
  • the method before determining the migration task for the first file, the method further includes:
  • the external event information includes one or more of the following information: network connection status, device health status, or the personnel transfer status related to the first file.
  • the triggering of data migration is related to external event information.
  • the corresponding migration task will be determined and the data will be migrated to achieve intelligent migration of comprehensive multi-information flows and improve the user experience.
  • the data in location A can be migrated to location B.
  • migration can be actively triggered to migrate the data of the business to a storage device closer to the business trip destination.
  • determining the migration task for the first file includes:
  • the migration task for the first file is determined according to the analysis result of the metadata of the first file, where the analysis result includes one or more of the following information: the hot or cold status of the first file, the security of the first file, or the business related to the first file.
  • the hot or cold status of a file can be indicated by the frequency of access to the file.
  • the metadata of the first file includes an attribute indicating the number of times the file is accessed within a period of time. If the number of accesses is greater than or equal to the first threshold, the data of the first file is migrated to a device with high storage speed (for example, a second storage device), thereby improving the efficiency of accessing the data of the first file and improving the service of the system. quality. Similarly, if the number of accesses of the first file is less than or equal to the second threshold, the data of the file is migrated to a device with high storage capacity to reduce storage costs.
  • the first threshold and the second threshold may be input by an administrator (such as a developer, management department, etc.), a manufacturer, etc., or may be preset.
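The threshold rule can be written as a small decision function. This is a sketch only: the function name `choose_target`, its parameters, and the assumption that the first (hot) threshold is larger than the second (cold) threshold are illustrative and not taken from the application.

```python
from typing import Optional


def choose_target(access_count: int,
                  hot_threshold: int,
                  cold_threshold: int,
                  fast_device: str,
                  capacity_device: str,
                  current_device: str) -> Optional[str]:
    """Return the device the file data should move to, or None to stay put.

    Assumes hot_threshold > cold_threshold (hypothetical constraint).
    """
    if access_count >= hot_threshold:
        target = fast_device        # hot file: favour storage speed
    elif access_count <= cold_threshold:
        target = capacity_device    # cold file: favour capacity and cost
    else:
        return None                 # neither hot nor cold: no migration task
    return None if target == current_device else target


hot = choose_target(120, 100, 10, "fast", "big", current_device="big")
cold = choose_target(5, 100, 10, "fast", "big", current_device="fast")
warm = choose_target(50, 100, 10, "fast", "big", current_device="big")
```

Here `hot` is `"fast"`, `cold` is `"big"`, and `warm` is `None`, so only files that cross a threshold produce a migration task.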
  • the metadata of the first file may include an attribute indicating the security level of the file. For example, if the security level of the first file is high and the security level of the first storage device does not meet the security level requirements of the first file, then the data of the first file is migrated to a device that can meet those requirements, which effectively protects users' file security needs and improves the system's service quality.
  • the metadata of the first file includes attributes representing services related to the first file.
  • the related services of the first file are vehicle services, video services, or file download services.
  • if the first file is used to store vehicle-mounted service data, then when the vehicle-mounted service data needs to be migrated to the second storage device, the data of the first file is migrated accordingly. In this way, users can migrate files based on different businesses, which makes it more convenient for users to manage business data and improves the service quality of the system.
  • the analysis results of metadata can indicate the migration requirements for files (such as access requirements, security requirements, business requirements, etc.). Determining migration tasks based on migration requirements can achieve overall storage optimization. In addition, users can express file migration needs by updating the file's metadata, realizing intelligent management of data and improving user convenience in data use and management.
  • determining the migration task for the first file includes:
  • the migration task for the first file is determined according to the migration instruction for the first file input by the user.
  • the user can migrate a certain file by inputting migration instructions, which can meet the user's personalized needs and improve the user experience.
  • the method further includes:
  • orchestrating tasks can include determining the execution order and execution priority of tasks, such as which file to migrate first.
  • the migration tasks of some files can be prioritized according to the priority of needs, thereby improving the user experience. For example, files whose access frequency surges in a short period of time can be prioritized for migration to increase the file's access rate as quickly as possible and improve user experience.
  • multiple tasks can be merged during the task orchestration process. For example, if task A instructs to migrate a first file from a first storage device to a second storage device, and task B instructs to migrate the first file from the first storage device to a third storage device, then task A and task B can be merged into a new task, which indicates migrating the first file from the first storage device to the third storage device. This reduces the probability of task execution errors and also reduces the computing power consumed by task execution, effectively improving task execution efficiency and the user experience.
  • the above implementation method arranges the tasks before executing them, so that each migration task can be executed according to a reasonable execution order and execution method, thereby improving the user experience.
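Task orchestration with merging and prioritisation can be sketched as below. The merge rule follows the task A / task B example in the text (same file, same source, the later destination wins); the `priority` field, the "larger runs first" ordering, and the function name `orchestrate` are hypothetical additions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationTask:
    file_id: str
    source: str
    destination: str
    priority: int = 0  # hypothetical: larger values are executed first


def orchestrate(tasks):
    """Merge tasks for the same file and source, then order by priority."""
    merged = {}
    for task in tasks:
        key = (task.file_id, task.source)
        prev = merged.get(key)
        if prev is None:
            merged[key] = task
        else:
            # The later task decides the destination; keep the higher priority.
            merged[key] = MigrationTask(task.file_id, task.source,
                                        task.destination,
                                        max(prev.priority, task.priority))
    # sorted() is stable, so equal priorities keep their arrival order.
    return sorted(merged.values(), key=lambda t: -t.priority)


plan = orchestrate([
    MigrationTask("file-1", "dev-1", "dev-2"),              # task A
    MigrationTask("file-1", "dev-1", "dev-3"),              # task B
    MigrationTask("file-2", "dev-1", "dev-2", priority=5),  # urgent file
])
```

In this run the plan contains two tasks: `file-2` first because of its priority, then a single merged task that moves `file-1` directly to `dev-3`.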
  • embodiments of the present application provide a data migration method applied to a first computing device, where the first computing device is located in a first storage device or is connected to the first storage device, and the data of a first file is stored on the first storage device. The method includes:
  • when the storage layout information of the first file indicates that the storage devices storing the first file do not include the second storage device and include the first storage device, migrating the data of the first file from the first storage device to the second storage device.
  • the embodiment of the present application triggers the migration operation for the file by changing the ownership information of the file.
  • the ownership information of the file indicates the second storage device
  • the storage layout information of the file indicates that the storage devices storing the first file do not include the second storage device and include the first storage device
  • the first storage device migrates the first file from the first storage device to the second storage device.
  • the embodiment of the present application triggers data migration based on file status (attribute status such as file ownership, file storage layout, etc.), so that the first computing device can migrate the file after learning of changes in the file's ownership information, which improves the efficiency of data migration and the convenience of data use and management.
  • the state-based data migration process can further decouple the functions of each device and greatly improve the flexibility and scalability of the business system.
  • the storage location of the file is still indicated by the storage layout information. Therefore, the above data migration method does not affect the normal use of the file data by users and improves the stability of the business system.
  • the first file belongs to the target file system
  • the metadata of the first file is included in the metadata of the target file system
  • the metadata of the target file system is synchronized among a plurality of devices including the first computing device.
  • the metadata of the target file system is stored in a global metadata service and synchronized between the multiple devices through the global metadata service;
  • the obtaining of metadata of the first file includes:
  • migrating the data of the first file from the first storage device to the second storage device includes:
  • pushing the data of the first file to a shared storage area, where the shared storage area is connected to the first computing device and a second computing device, and the second computing device is located in the second storage device or is connected to the second storage device;
  • the first change indicates that the shared storage area is added to the storage devices indicated by the storage layout information of the first file; after the first change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the shared storage area, and do not include the second storage device.
  • the data of the first file is temporarily stored in the intermediate shared storage area during the migration process.
  • the source device does not need to establish a data security access control mechanism with the destination device, which further decouples the source device from the destination device and improves the flexibility and scalability of the business system.
  • the storage layout information indicates that the data has been pushed to the shared storage area.
  • This method can indicate the storage location of the data to the second storage device without being aware of the second storage device, further decoupling the first storage device and the second storage device, and improving the flexibility and scalability of the business system.
  • the ownership information of the first file is an identification of the second storage device, and before the first change, the storage layout information of the first file includes the identification of the first storage device and does not include the identification of the second storage device;
  • the first change to the metadata of the first file includes:
  • the method before obtaining the metadata of the first file, the method further includes:
  • the method further includes:
  • a second notification is sent, the second notification indicating that the metadata of the first file has changed.
  • obtaining the metadata of the first file includes:
  • the metadata of the target file system is stored locally on multiple devices.
  • when a device makes changes to the metadata of the target file system, the device notifies the other devices that store the metadata of the target file system of the change, and the other devices change their locally stored metadata of the target file system accordingly based on the notification, thereby achieving synchronization of the target file system metadata across multiple devices.
  • the metadata of the target file system is stored in a global metadata service.
  • the global metadata service can store metadata of the target file system.
  • the global metadata service can support access and update of metadata of the target file system.
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records, and each metadata record includes an identifier of a node and attributes of the node, where the node is a file or directory, and the attributes of the node include the ownership information of the node and the storage layout information of the node;
  • the first change to the metadata of the first file includes:
  • a first metadata record is appended to the end of the metadata of the target file system.
  • the first metadata record includes the identification of the first file and the storage layout information of the first file, where the storage layout information indicates that the storage devices storing the first file include the first storage device and the shared storage area.
  • the method further includes:
  • when the second changed metadata of the first file is obtained, the data of the first file on the first storage device is deleted; the second change is a change to the storage layout information of the first file, and after the second change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the second storage device;
  • the third change indicates that the first storage device is deleted from the storage devices indicated by the storage layout information of the first file; after the third change, the storage layout information of the first file indicates that the storage devices storing the first file do not include the first storage device.
  • the data on the source device can be deleted, which prevents multiple storage devices from redundantly storing the data of the same file, saves storage space, and optimizes the overall storage cost.
  • the method before deleting the first file on the first storage device, the method further includes:
  • For another example, before a file is deleted, the file is marked as deletable. Deletable files cannot be accessed through normal means, and they are deleted together when preset conditions are met.
  • the preset conditions here may be that the time marked as deletable reaches the preset length, the data marked as deletable reaches the preset size, etc. This can facilitate the user to retrieve the data of the first file on the first storage device, reduce data loss caused by misoperation, and improve user experience.
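The mark-then-delete behaviour described above can be sketched as follows. The class `DeferredDeleter`, its method names, and the choice of a time-based preset condition are assumptions for illustration; a size-based condition could be handled the same way.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DeferredDeleter:
    retention_seconds: float                              # assumed preset length of time
    files: Dict[str, bytes] = field(default_factory=dict)
    marked_at: Dict[str, float] = field(default_factory=dict)

    def mark_deletable(self, name: str, now: float) -> None:
        self.marked_at[name] = now

    def read(self, name: str) -> bytes:
        # Files marked as deletable are hidden from normal access.
        if name in self.marked_at:
            raise PermissionError(f"{name} is marked as deletable")
        return self.files[name]

    def restore(self, name: str) -> None:
        # Retrieval path: undo the mark before the purge happens.
        self.marked_at.pop(name, None)

    def purge(self, now: float) -> List[str]:
        # Delete every file whose mark is older than the retention period.
        expired = [n for n, t in self.marked_at.items()
                   if now - t >= self.retention_seconds]
        for n in expired:
            del self.files[n]
            del self.marked_at[n]
        return expired
```

A file restored before `purge` runs becomes readable again, which is what allows recovery from a mistaken deletion.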
  • migrating the data of the first file from the first storage device to the second storage device includes:
  • Pushing data of the first file to a second computing device.
  • the first computing device can directly push the data of the first file to the second computing device without going through an intermediate device, which improves the efficiency of data migration.
  • if the first storage device (or first computing device) and the second storage device (or second computing device) are located in the same data center, the first computing device can push the data of the first file to the second computing device.
  • the method further includes:
  • a pull request for the first file is received from the second computing device.
  • the second computing device sends a pull request, which indicates that the second computing device and/or the second storage device is in an available state; this avoids migrating the data of the first file to an unavailable device and improves data security and availability.
  • the method further includes:
  • the local file view indicating a hierarchical structure of a plurality of files stored on the first storage device, and storage layout information of the plurality of files indicating the first storage device.
  • the first computing device may provide a local view externally based on the storage layout information of the files, so that a user or application can easily obtain the data of the files stored on the first storage device, which satisfies the visualization requirement for the files and improves the user experience.
  • the method further includes:
  • a home file view of the first storage device is provided; the home file view contains information about a plurality of files, and the ownership information of the plurality of files indicates the first storage device.
  • the first computing device can provide a file view belonging to the first storage device to present information about the files belonging to the first storage device, which satisfies the visualization requirement for the files belonging to the first storage device and improves the user experience.
  • the files belonging to the first storage device and the files belonging to the second storage device belong to the global file system.
  • the method also includes:
  • a global file view includes information about files attributed to the first storage device and information about files attributed to the second storage device.
  • the global file view can integrate cross-device file information into one file view, so that data in different devices is no longer isolated data.
  • the underlying federation of the global file system is transparent to upper-layer applications.
  • Upper-layer applications can use the global file system as easily as using a traditional file system, greatly improving users' convenience in data use and management.
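The three views described above (local, home, and global) can be contrasted with a small sketch. The `FileMeta` type and the function names are hypothetical; the sketch assumes that the local view filters on storage layout information, the home view filters on ownership information, and the global view covers files of all devices.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class FileMeta:
    path: str
    owner: str                 # ownership information (device id)
    layout: FrozenSet[str]     # storage layout information (device ids)


def local_view(files: Iterable[FileMeta], device: str) -> List[str]:
    # Files whose data is physically stored on the device.
    return sorted(f.path for f in files if device in f.layout)


def home_view(files: Iterable[FileMeta], device: str) -> List[str]:
    # Files that belong to the device, wherever their data currently sits.
    return sorted(f.path for f in files if f.owner == device)


def global_view(files: Iterable[FileMeta]) -> List[str]:
    # One namespace across all devices.
    return sorted(f.path for f in files)
```

During a migration the local and home views of a device can differ, for example when a file already belongs to the second storage device but its data still resides on the first.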
  • embodiments of the present application provide a data migration method, applied to a second computing device, where the second computing device is located in a second storage device or is connected to the second storage device; the method includes:
  • the ownership information of the first file indicates that the storage device to which the first file belongs is the second storage device, and the storage layout information of the first file indicates that the storage device storing the first file does not include the second storage device.
  • the embodiment of the present application triggers a migration operation for a file by changing the ownership information of the file.
  • when the ownership information of the file indicates the second storage device, and the storage layout information of the file indicates that the storage device storing the first file does not include the second storage device, the second computing device pulls the data of the first file from the first storage device to the second storage device.
  • Embodiments of the present application trigger data migration based on file status (attribute status such as file ownership and file storage layout), so that the second computing device can pull the data of a file belonging to the second storage device after obtaining the change in the file's ownership information. This improves the efficiency of data migration and the convenience of data use and management. Especially for businesses that span multiple storage devices or multiple data centers, the state-based data migration process can further decouple the functions of each device and greatly improve the flexibility and scalability of the business system.
  • the storage location of the file is still indicated by the storage layout information. Therefore, the above data migration method does not affect the normal use of the file data by users and improves the stability of the business system.
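The state-based trigger on the second computing device can be sketched as a pure decision over the file's metadata. The function names, the `SHARED_AREA` identifier, and the preference for the shared storage area over another source are assumptions for illustration; the application only states that the pull happens when the file belongs to the second storage device while its layout does not yet include that device.

```python
from typing import Optional, Set

SHARED_AREA = "shared"  # hypothetical identifier of the shared storage area


def should_pull(owner: str, layout: Set[str], local_device: str) -> bool:
    # Pull when the file belongs to this device but is not stored on it yet.
    return owner == local_device and local_device not in layout


def choose_source(layout: Set[str], shared_area: str = SHARED_AREA) -> Optional[str]:
    # Assumed preference: use the shared storage area if the layout lists it.
    if shared_area in layout:
        return shared_area
    candidates = sorted(layout)
    return candidates[0] if candidates else None


def layout_after_pull(layout: Set[str], local_device: str) -> Set[str]:
    # After the pull, the local device is added to the storage layout.
    return set(layout) | {local_device}
```

Because the decision depends only on metadata state, the second computing device needs no direct command from a scheduler; observing the changed ownership information is enough.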
  • the method further includes:
  • a first change is performed on the metadata of the first file, and the first change indicates that the second storage device is added to the storage device indicated by the storage layout information of the first file.
  • the ownership information of the first file is an identification of the second storage device, and before the first change, the storage layout information of the first file does not contain the identification of the second storage device;
  • the first change to the metadata of the first file includes:
  • the storage layout information of the first file indicates that the storage device that stores the first file includes a first storage device, and before the data of the first file is pulled from the device that stores the data of the first file to the second storage device, the method further includes:
  • the pull request is used to instruct a first computing device to push the first file; the first computing device is located in the first storage device or is connected to the first storage device.
  • the storage layout information of the first file indicates that the storage device that stores the first file includes a shared storage area, and pulling the data of the first file from the device that stores the data of the first file to the second storage device includes:
  • the storage layout information of the first file indicates that the storage device that stores the first file includes a first storage device, and pulling the data of the first file from the device that stores the data of the first file to the second storage device includes:
  • the data of the first file is pulled from the device that stores the data of the first file to the second storage device:
  • the first file belongs to the target file system, the metadata of the first file is included in the metadata of the target file system, and the metadata of the target file system is stored in the global metadata service;
  • the obtaining of metadata of the first file includes:
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records; each metadata record includes an identifier of a node and attributes of the node, where the node is a file or directory, and the attributes of the node include the node's ownership information and the node's storage layout information.
  • the first change to the metadata of the first file includes:
  • a first metadata record is appended to the end of the metadata of the target file system.
  • the first metadata record includes the identification of the first file and the storage layout information of the first file.
  • the storage layout information includes the identification of the second storage device.
  • the method further includes:
  • the local file view indicating a hierarchical structure of a plurality of files stored on the first storage device, and storage layout information of the plurality of files indicating the first storage device.
  • the method further includes:
  • a home file view of the second storage device is provided; the home file view contains information about a plurality of files, and the ownership information of the plurality of files indicates the second storage device.
  • the files belonging to the first storage device and the files belonging to the second storage device belong to the global file system.
  • the method also includes:
  • a global file view includes information about files attributed to the first storage device and information about files attributed to the second storage device.
  • embodiments of the present application provide a data migration method, applied to a first computing device, where the first computing device is located in a first storage device or is connected to the first storage device, and the data of the first file is stored on the first storage device; the method includes:
  • the first change indicates that the shared storage area is added to the storage devices indicated by the storage layout information of the first file; after the first change, the storage layout information of the first file indicates that the storage device storing the first file includes the first storage device and the shared storage area, and does not include the second storage device.
  • the ownership information of the first file is an identification of the second storage device, and before the first change, the storage layout information of the first file includes the identification of the first storage device and does not include the identification of the second storage device;
  • the first change to the metadata of the first file includes:
  • the first file belongs to the target file system
  • the metadata of the first file is included in the metadata of the target file system
  • the metadata of the target file system is synchronized across multiple devices, including the first computing device.
  • the method further includes:
  • a first notification is sent, the first notification indicating that the metadata of the first file has changed.
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records; each metadata record includes an identifier of a node and attributes of the node, where the node is a file or directory, and the attributes of the node include the ownership information of the node and the storage layout information of the node;
  • the first change to the metadata of the first file includes:
  • a first metadata record is appended to the end of the metadata of the target file system.
  • the first metadata record includes the identification of the first file and the storage layout information of the first file.
  • the storage layout information indicates that the storage device storing the first file includes the first storage device and the shared storage area.
  • the method further includes:
  • when the metadata of the first file after a second change is obtained, the data of the first file on the first storage device is deleted; the second change is made to the storage layout information of the first file, and after the second change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the second storage device;
  • the third change indicates deletion of the first storage device from the storage devices indicated by the storage layout information of the first file; after the third change, the storage layout information of the first file indicates that the storage device storing the first file does not include the first storage device.
  • the method before deleting the first file on the first storage device, the method further includes:
  • the method further includes:
  • the local file view indicating a hierarchical structure of a plurality of files stored on the first storage device, and storage layout information of the plurality of files indicating the first storage device.
  • the method further includes:
  • a home file view of the first storage device is provided; the home file view contains information about a plurality of files, and the ownership information of the plurality of files indicates the first storage device.
  • the files belonging to the first storage device and the files belonging to the second storage device belong to the global file system.
  • the method also includes:
  • a global file view includes information about files attributed to the first storage device and information about files attributed to the second storage device.
  • embodiments of the present application provide a migration scheduling device.
  • the migration scheduling device includes a task determination module and a metadata update module.
  • the migration scheduling device is used for any of the methods described in the first aspect.
  • the task determining module is configured to determine a migration task for the first file, where the data of the first file is stored on the first storage device, and the migration task for the first file indicates migrating the data of the first file from the first storage device to the second storage device;
  • the metadata update module is configured to make a first change to the metadata of the first file to trigger the execution of the migration task for the first file, where the first change indicates that the storage device to which the first file belongs is changed from the first storage device to the second storage device.
  • the metadata of the first file includes ownership information of the first file and storage layout information of the first file.
  • the storage device indicated by the ownership information of the first file is the first storage device, and the storage layout information of the first file indicates that the storage device storing the first file includes the first storage device and does not include the second storage device.
  • the ownership information of the first file is the identification of the first storage device, and the storage layout information of the first file includes the identification of the first storage device and does not include the identification of the second storage device.
  • the metadata update module is configured to change the ownership information of the first file from the identification of the first storage device to the identification of the second storage device.
  • the migration task for the first file includes an identifier of the first file, an identifier of the first storage device, and an identifier of the second storage device.
  • the first file belongs to the target file system
  • the metadata of the first file is included in the metadata of the target file system
  • the metadata of the target file system is synchronized across multiple devices.
  • the plurality of devices include a first computing device and a second computing device.
  • the first computing device is located in a first storage device or is connected to the first storage device.
  • the second computing device is located in a second storage device or is connected to the second storage device.
  • the plurality of devices mentioned above may also include a migration scheduling device.
  • the migration scheduling device further includes a communication module, and the communication module is configured to send a first notification indicating that the metadata of the first file has been changed, so that the first computing device or the second computing device obtains, according to the first notification, the metadata of the first file after the first change, and executes the migration task for the first file according to that metadata.
  • the first notification indicates what changes have occurred in the metadata of the first file.
  • the first notification may include the content of the first change, and/or the first notification may include the metadata of the first file after the first change.
  • the first notification indicates that the first change has occurred, but includes neither the specific content of the change nor the metadata of the first file after the first change.
  • the migration scheduling device further includes a task monitoring module, and the task monitoring module is used to:
  • metadata of the target file system is stored locally on multiple devices.
  • the metadata of the target file system is stored in a global metadata service.
  • the global metadata service can store metadata of the target file system.
  • the global metadata service can support access and update of metadata of the target file system.
  • the migration scheduling device further includes a communication module, and the communication module is used for:
  • a second notification is received indicating that metadata of the first file has changed.
  • the task monitoring module is also used to:
  • the second notification includes the second changed content, or the second notification includes the second changed metadata of the first file.
  • the notification (for example, the first notification, or the second notification, etc.) may be sent in the form of a message queue.
  • the sender writes messages to the message queue, and the receiver receives notifications by reading the message queue, thereby further reducing the coupling between different functional modules.
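The message-queue form of notification described above can be sketched with Python's standard `queue` module. The `ChangeNotification` type and the `publish` and `drain` helpers are hypothetical; an in-process queue stands in here for whatever message queue a real deployment would use between devices.

```python
import queue
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ChangeNotification:
    file_id: str
    # The changed content is optional; without it the receiver fetches
    # the metadata itself after seeing the notification.
    change: Optional[dict] = None


def publish(q: "queue.Queue[ChangeNotification]", note: ChangeNotification) -> None:
    # The sender only writes to the queue and does not know who reads it.
    q.put(note)


def drain(q: "queue.Queue[ChangeNotification]") -> List[ChangeNotification]:
    # The receiver reads everything currently queued, in order.
    received: List[ChangeNotification] = []
    while True:
        try:
            received.append(q.get_nowait())
        except queue.Empty:
            return received
```

The sender and receiver share only the queue and the notification format, which is the decoupling between functional modules that the text refers to.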
  • the communication module is further configured to: send, to the first computing device or the second computing device, a request for obtaining the changed metadata of the first file;
  • the task monitoring module is further configured to obtain the second modified metadata of the first file according to the response of the first computing device or the second computing device to the request.
  • the task monitoring module is further configured to obtain the second modified metadata of the first file from the global metadata service.
  • the global metadata service provides a service interface
  • the migration scheduling device can call the service interface to implement access and update of metadata
  • the service interface is a communication interface, such as an application programming interface (API), which can be used to interact with data and provide services between different functional modules.
  • the metadata update module is also used to:
  • the first change is implemented through a service interface provided by the global metadata service.
  • the global metadata service is located on any one of the multiple devices, or on any device outside the multiple devices.
  • the global metadata service is located on the third computing device.
  • the service interface of the global metadata service may be provided by the third computing device to the migration scheduling apparatus.
  • the third computing device may provide another interface (referred to as a first interface for ease of differentiation) to the migration scheduling apparatus, and by calling the first interface, the function of calling the service interface of the global metadata service can be implemented.
  • the metadata of the target file system is in a table structure and the metadata can be modified.
  • the tabular structure is a data structure containing rows and columns. Each row (or each column) contains multiple values, and each value corresponds to a field.
  • Metadata with a tabular structure can add metadata, delete metadata, or modify existing metadata. That is, the first change can be implemented by modifying the metadata of the target file system.
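In contrast to the streaming structure, a table structure lets the first change be made by modifying an existing row. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table name `metadata`, its columns, and the device identifiers are assumptions for illustration and are not taken from the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata (file_id TEXT PRIMARY KEY, owner TEXT, layout TEXT)"
)

# Initial state: the file belongs to dev1 and is stored on dev1.
conn.execute("INSERT INTO metadata VALUES (?, ?, ?)", ("file1", "dev1", "dev1"))

# First change: modify the ownership field of the existing row in place.
conn.execute("UPDATE metadata SET owner = ? WHERE file_id = ?", ("dev2", "file1"))

row = conn.execute(
    "SELECT owner, layout FROM metadata WHERE file_id = ?", ("file1",)
).fetchone()
```

After the update the row records that the file belongs to `dev2` while its data is still laid out on `dev1`, which is the state that triggers the migration.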
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records; each metadata record includes an identifier of a node and attributes of the node, where the node is a file or directory, and the attributes of the node include the node's ownership information and the node's storage layout information.
  • the metadata update module is also used to:
  • a first metadata record is appended to the end of the metadata of the target file system.
  • the first metadata record includes the identification of the first file and the changed ownership information of the first file.
  • the changed ownership information of the file indicates that the storage device to which the first file belongs is the second storage device.
  • the task determination module is also used to:
  • the external event information includes one or more of the following information: network connection status, device health status, or personnel transfer status related to the first file.
  • the task determination module is also used to:
  • the migration task for the first file is determined according to the analysis result of the metadata of the first file; wherein the analysis result includes one or more of the following information: the hot and cold status of the first file, the The security of the first file or the business related to the first file.
  • the task determination module is also used to:
  • the migration task for the first file is determined according to the migration instruction for the first file input by the user.
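One of the triggers above, determining a migration task from the hot and cold status of a file, can be sketched as follows. The `MigrationTask` fields mirror the task contents named earlier (file identifier, first storage device, second storage device); the function name, the idle-time threshold, and the idea of a fast device and a capacity device are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MigrationTask:
    file_id: str
    source: str    # identifier of the first storage device
    target: str    # identifier of the second storage device


def determine_task(file_id: str, current_device: str, days_since_access: float,
                   fast_device: str, capacity_device: str,
                   cold_after_days: float = 30.0) -> Optional[MigrationTask]:
    # Assumed policy: cold files go to the capacity device, hot files to the fast one.
    is_cold = days_since_access >= cold_after_days
    desired = capacity_device if is_cold else fast_device
    if desired == current_device:
        return None                      # already in the right place, no task
    return MigrationTask(file_id, current_device, desired)
```

The same shape of function could take external event information or a user instruction as its input instead of access statistics.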
  • the task determination module is also used to determine the migration task for the second file
  • the migration scheduling device further includes a task arrangement module, which is used to arrange the execution sequence of the migration task for the first file and the migration task for the second file.
  • orchestrating tasks can include determining the execution order, execution priority, etc. of tasks.
  • multiple tasks can be merged during the task orchestration process.
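Task orchestration as described above (ordering, priority, and merging) can be sketched as follows. The `MigrationTask` shape, the rule that tasks with the same source and target are merged, and the convention that a higher priority value runs first are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MigrationTask:
    file_ids: List[str]
    source: str
    target: str
    priority: int = 0   # assumed convention: higher values run first


def orchestrate(tasks: List[MigrationTask]) -> List[MigrationTask]:
    merged: Dict[Tuple[str, str], MigrationTask] = {}
    for task in tasks:
        key = (task.source, task.target)
        if key in merged:
            # Merge tasks that move data between the same pair of devices.
            existing = merged[key]
            existing.file_ids.extend(
                f for f in task.file_ids if f not in existing.file_ids
            )
            existing.priority = max(existing.priority, task.priority)
        else:
            merged[key] = MigrationTask(
                list(task.file_ids), task.source, task.target, task.priority
            )
    # Execution order: highest priority first; ties keep arrival order.
    return sorted(merged.values(), key=lambda t: -t.priority)
```

Merging by device pair lets several files travel in one batch, while the priority sort decides which batch runs first.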
  • embodiments of the present application provide a computing device.
  • the computing device includes a metadata acquisition module and a migration module.
  • the computing device is configured to implement the method described in any one of the second aspects.
  • the computing device is located in the first storage device or connected to the first storage device.
  • the metadata acquisition module is configured to acquire metadata of the first file, where the metadata of the first file includes the attribution information of the first file and the The storage layout information of the first file;
  • the migration module is configured to: when it is determined that the storage device indicated by the ownership information of the first file is a second storage device, and the storage layout information of the first file indicates that the storage device storing the first file includes the first storage device and does not include the second storage device, migrate the data of the first file from the first storage device to the second storage device.
  • the first file belongs to the target file system
  • the metadata of the first file is included in the metadata of the target file system
  • the metadata of the target file system is synchronized across multiple devices, including the computing device or the first computing device on which the computing device resides.
  • the metadata of the target file system is stored in a global metadata service and synchronized among the multiple devices through the global metadata service;
  • the metadata acquisition module is also used to:
  • the migration module is also used to:
  • Push the data of the first file to a shared storage area, where the shared storage area is connected to the first computing device and a second computing device, and the second computing device is located in the second storage device or is connected to the second storage device;
  • the computing device further includes a metadata update module configured to perform a first change on the metadata of the first file to trigger the second computing device to pull the data of the first file from the shared storage area to the second storage device; the first change indicates that the shared storage area is added to the storage devices indicated by the storage layout information of the first file; after the first change, the storage layout information of the first file indicates that the storage device storing the first file includes the first storage device and the shared storage area, and does not include the second storage device.
  • the ownership information of the first file is an identification of the second storage device, and before the first change, the storage layout information of the first file includes the identification of the first storage device and does not include the identification of the second storage device;
  • the metadata update module is also used to:
  • the computing device further includes a communication module, and the communication module is used for:
  • a first notification is received, the first notification indicating that metadata of the first file has changed.
  • the computing device further includes a communication module, and the communication module is used for:
  • a second notification is sent, the second notification indicating that the metadata of the first file has changed.
  • the metadata acquisition module is also used to:
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records; each metadata record includes an identifier of a node and attributes of the node, where the node is a file or directory, and the attributes of the node include the ownership information of the node and the storage layout information of the node;
  • the metadata update module is further configured to: append a first metadata record at the end of the metadata of the target file system, where the first metadata record includes the identifier of the first file and the storage layout information of the first file, and the storage layout information indicates that the storage device that stores the first file includes the first storage device and the shared storage area.
  • the computing device further includes a deletion control module, and the deletion control module is used to:
  • when the metadata of the first file after a second change is obtained, the data of the first file on the first storage device is deleted; the second change is made to the storage layout information of the first file, and after the second change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the second storage device;
  • the metadata update module is further configured to: perform a third change to the metadata of the first file, the third change indicating deletion of the first storage device from the storage devices indicated by the storage layout information of the first file.
  • the deletion control module is also used to:
  • the migration module is also used to:
  • Pushing data of the first file to a second computing device.
  • the computing device further includes a communication module, and the communication module is used to:
  • a pull request for the first file is received from the second computing device.
  • the computing device further includes a view providing module configured to provide a home file view of the first storage device, where the home file view includes information about multiple files, and the ownership information of the multiple files indicates the first storage device.
  • the files belonging to the first storage device and the files belonging to the second storage device belong to the global file system.
  • the computing device also includes a view providing module for providing a global file view that includes information about files belonging to the first storage device and information about files belonging to the second storage device.
  • embodiments of the present application provide a computing device.
  • the computing device includes a metadata acquisition module and a migration module.
  • the computing device is configured to implement the method described in any one of the third aspects.
  • the computing device is located in the second storage device or is connected to the second storage device.
  • the metadata acquisition module is configured to acquire metadata of the first file, where the metadata of the first file includes the ownership information of the first file and the The storage layout information of the first file;
  • the migration module is configured to: when the ownership information of the first file indicates that the storage device to which the first file belongs is the second storage device, and the storage layout information of the first file indicates that the storage device storing the first file does not include the second storage device, pull the data of the first file from the device that stores the data of the first file to the second storage device.
  • the computing device further includes a metadata update module, and the metadata update module is further configured to:
  • a first change is performed on the metadata of the first file, and the first change indicates that the second storage device is added to the storage device indicated by the storage layout information of the first file.
  • the ownership information of the first file is an identification of the second storage device, and before the first change, the storage layout information of the first file does not contain the identification of the second storage device;
  • the metadata update module is also used to:
  • the storage layout information of the first file indicates that the storage device that stores the first file includes the first storage device.
  • the computing device also includes a communication module for:
  • the pull request is used to instruct a first computing device to push the first file; the first computing device is located in the first storage device or is connected to the first storage device.
  • the storage layout information of the first file indicates that the storage device storing the first file includes a shared storage area.
  • the migration module is used for:
  • the storage layout information of the first file indicates that the storage device that stores the first file includes the first storage device.
  • the migration module is also used for:
  • the first file belongs to the target file system, the metadata of the first file is included in the metadata of the target file system, and the metadata of the target file system is stored in the global metadata service;
  • the metadata acquisition module is also used to:
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records; each metadata record includes an identifier of a node and attributes of the node, where the node is a file or directory, and the attributes of the node include the node's ownership information and the node's storage layout information.
  • the first change to the metadata of the first file includes:
  • a first metadata record is appended to the end of the metadata of the target file system.
  • the first metadata record includes the identification of the first file and the storage layout information of the first file.
  • the storage layout information includes the identification of the second storage device.
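  • The streaming metadata structure described above can be illustrated with a minimal sketch. It assumes that the most recently appended record for a node represents the node's current state; the class names, field names, and device labels (`S1`, `S2`) are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataRecord:
    node_id: int        # identifier of a file or directory node
    owner: str          # ownership information: device the node belongs to
    layout: frozenset   # storage layout: devices that hold the node's data


@dataclass
class MetadataStream:
    records: list = field(default_factory=list)

    def append(self, record):
        # A change is not applied in place; it is appended to the tail.
        self.records.append(record)

    def current(self, node_id):
        # Assumption: the last record appended for a node is its current state.
        for record in reversed(self.records):
            if record.node_id == node_id:
                return record
        return None


def add_device_to_layout(stream, node_id, device):
    """Append a record whose storage layout additionally names `device`."""
    latest = stream.current(node_id)
    updated = MetadataRecord(node_id, latest.owner, latest.layout | {device})
    stream.append(updated)
    return updated


stream = MetadataStream()
# File 7 belongs to S2 but its data is still stored only on S1.
stream.append(MetadataRecord(7, owner="S2", layout=frozenset({"S1"})))
# After the data arrives on S2, the change is recorded by appending.
add_device_to_layout(stream, 7, "S2")
```

  • Because earlier records are left untouched, a reader of the stream can see both the state before the change and the state after it.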
  • the computing device further includes a view providing module, the view providing module being configured to provide a local file view of the first storage device, the local file view indicating a hierarchical structure of multiple files stored on the first storage device, where the storage layout information of the multiple files indicates the first storage device.
  • the computing device further includes a view providing module configured to provide a home file view of the first storage device, where the home file view includes information about multiple files, and the ownership information of the multiple files indicates the first storage device.
  • files belonging to the first storage device and files belonging to the second storage device are federated to form a global file system.
  • the computing device also includes a view providing module for providing a global file view that includes information about files belonging to the first storage device and information about files belonging to the second storage device.
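  • The three views described above (local, home, and global) can be sketched as filters over the same metadata. This is only an illustration under assumed names: `FileMeta`, the view functions, the paths, and the device labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    path: str
    owner: str          # ownership information
    layout: frozenset   # storage layout information


def local_view(files, device):
    # Files whose storage layout indicates `device` (data is stored there).
    return sorted(f.path for f in files if device in f.layout)


def home_view(files, device):
    # Files whose ownership information indicates `device`.
    return sorted(f.path for f in files if f.owner == device)


def global_view(files):
    # Files belonging to all devices, federated into one namespace.
    return sorted(f.path for f in files)


files = [
    FileMeta("/a/x", owner="S1", layout=frozenset({"S1"})),
    FileMeta("/a/y", owner="S2", layout=frozenset({"S1"})),  # not yet migrated
    FileMeta("/b/z", owner="S2", layout=frozenset({"S2"})),
]
```

  • In this example `/a/y` appears in the local view of `S1` and in the home view of `S2` at the same time, which is the state in which a migration is still pending.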
  • embodiments of the present application provide a computing device, which includes a communication module, a migration module and a metadata update module; the computing device is configured to implement the method described in any one of the fourth aspects.
  • the computing device is located in the first storage device or connected to the first storage device.
  • the communication module is configured to receive a pull request for the first file from a second computing device, and the second computing device is connected to a second storage device.
  • the migration module is used to push the data of the first file to the shared storage area.
  • the metadata update module is configured to perform a first change on the metadata of the first file to trigger the second computing device to obtain the data of the first file from the shared storage area and store it in the second storage device, where the first change indicates that the shared storage area is added to the storage devices indicated by the storage layout information of the first file; after the first change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the shared storage area, and do not include the second storage device.
  • the ownership information of the first file is an identification of the second storage device, and before the first change, the storage layout information of the first file includes the identification of the first storage device and does not include the identification of the second storage device;
  • the metadata update module is also used to:
  • the first file belongs to the target file system, the metadata of the first file is included in the metadata of the target file system, and the metadata of the target file system is synchronized across multiple devices, including the first computing device.
  • the communication module is also used for:
  • a first notification is sent, the first notification indicating that the metadata of the first file has changed.
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records, and each metadata record includes an identifier of a node and attributes of the node, where the node is a file or directory, and the attributes of the node include the ownership information of the node and the storage layout information of the node;
  • the metadata update module is used for:
  • a first metadata record is appended to the end of the metadata of the target file system.
  • the first metadata record includes the identification of the first file and the storage layout information of the first file.
  • the storage layout information indicates that the storage device storing the first file includes the first storage device and the shared storage area.
  • the computing device includes a deletion control module, the deletion control module configured to delete the second changed metadata of the first file when the second modified metadata of the first file is obtained.
  • the metadata update module is configured to perform a third change to the metadata of the first file, where the third change indicates that the first storage device is deleted from the storage devices indicated by the storage layout information of the first file.
  • the storage layout information of the first file indicates that the storage device storing the first file does not include the first storage device.
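  • The sequence of layout changes in the push-through-shared-storage flow can be sketched as follows. The first and third changes follow the text above. The content of the second change (adding the second storage device to the layout once it holds the data) and the deletion of the source copy after that change are assumptions made for illustration, as are all names (`push_via_shared_area`, `SHARED`, `S1`, `S2`).

```python
def push_via_shared_area(name, meta, stores, src="S1", dst="S2", shared="SHARED"):
    """Simulate the migration with dicts as stores; return the layout after each change."""
    # Source side: push the data to the shared storage area, then make the
    # first change (the shared area is added to the storage layout).
    stores[shared][name] = stores[src][name]
    meta["layout"] = meta["layout"] | {shared}
    history = [set(meta["layout"])]

    # Destination side: triggered by the first change, fetch the data from the
    # shared area. Assumed second change: the destination is added to the layout.
    stores[dst][name] = stores[shared][name]
    meta["layout"] = meta["layout"] | {dst}
    history.append(set(meta["layout"]))

    # Source side: assumed to delete its copy once it sees the second change,
    # then make the third change (the source is removed from the layout).
    del stores[src][name]
    meta["layout"] = meta["layout"] - {src}
    history.append(set(meta["layout"]))
    return history


stores = {"S1": {"f": b"data"}, "S2": {}, "SHARED": {}}
meta = {"owner": "S2", "layout": {"S1"}}
history = push_via_shared_area("f", meta, stores)
```

  • Neither side calls the other directly in this sketch; each step only reads the layout left by the previous step, which is the decoupling the state-based approach aims for.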
  • deletion control module is also used to:
  • embodiments of the present application provide a data migration system.
  • the data migration system includes a first computing device and a second computing device.
  • the first computing device is located in a first storage device or is connected to the first storage device, and the second computing device is located in a second storage device or is connected to the second storage device.
  • the first storage device is used to implement the method described in any one of the second aspect or any one of the fourth aspect
  • the second storage device is used to implement the method described in any one of the third aspect.
  • the first storage device includes the computing device described in any one of the sixth aspect or any one of the eighth aspect
  • the second storage device includes the computing device described in any one of the seventh aspect.
  • the data migration system further includes a migration scheduling device, and the migration scheduling device is configured to implement the method described in any one of the first aspects.
  • the data migration system further includes a migration scheduling device, and the migration scheduling device is the migration scheduling device described in any one of the fifth aspects.
  • embodiments of the present application provide a data migration system.
  • the data migration system includes a first computing device and a second computing device.
  • the first computing device is located in a first storage device or is connected to the first storage device, and the second computing device is located in a second storage device or is connected to the second storage device;
  • the first computing device is used for:
  • the storage layout information of the first file indicates that the storage device storing the first file does not include the second storage device and includes the
  • the second computing device is used for:
  • the data migration system further includes a migration scheduling device, the migration scheduling device is used to implement the method described in any one of the first aspects, or the migration scheduling device is the migration scheduling device of any one of the fifth aspects.
  • embodiments of the present application provide a data migration system.
  • the data migration system includes a first computing device and a second computing device.
  • the first computing device is located in a first storage device or is connected to the first storage device, and the second computing device is located in a second storage device or is connected to the second storage device;
  • the second computing device is used for:
  • the ownership information of the first file indicates that the storage device to which the first file belongs is the second storage device, and the storage layout information of the first file indicates that the storage device storing the first file does not include the third storage device.
  • a pull request is sent to the first computing device, the pull request being used to instruct the first computing device to push the data of the first file;
  • the first computing device is used for:
  • the second computing device is also used for:
  • the data migration system further includes a migration scheduling device, the migration scheduling device is used to implement the method described in any one of the first aspects, or the migration scheduling device is the migration scheduling device of any one of the fifth aspects.
  • embodiments of the present application provide a data migration system.
  • the data migration system includes a first computing device and a second computing device.
  • the first computing device is located in a first storage device or is connected to the first storage device, and the second computing device is located in a second storage device or is connected to the second storage device;
  • the first computing device is used for:
  • the ownership information of the first file indicates that the storage device to which the first file belongs is the second storage device, and the storage layout information of the first file indicates that the storage device storing the first file does not include the third storage device.
  • the second computing device is used for:
  • the data migration system further includes a migration scheduling device, the migration scheduling device is used to implement the method described in any one of the first aspects, or the migration scheduling device is the migration scheduling device of any one of the fifth aspects.
  • embodiments of the present application provide a data migration system.
  • the data migration system includes a first computing device and a second computing device.
  • the first computing device is located in a first storage device or is connected to the first storage device, and the second computing device is located in a second storage device or is connected to the second storage device;
  • the second computing device is used for:
  • the ownership information of the first file indicates that the storage device to which the first file belongs is the second storage device, and the storage layout information of the first file indicates that the storage device storing the first file does not include the third storage device.
  • the first computing device is used for:
  • the second computing device is also used for:
  • embodiments of the present application provide a computing device, which includes a processor and a memory; the processor executes instructions stored in the memory, so that the computing device implements the method described in any one of the foregoing first aspects.
  • the computing device further includes a communication interface, the communication interface is used to receive and/or send data, and/or the communication interface is used to provide input and/or output to the processor.
  • the above embodiments are explained by taking as an example a processor (or a general-purpose processor) that executes the method by calling computer instructions.
  • the processor may also be a dedicated processor, in which case the computer instructions have been preloaded in the processor.
  • the processor may also include both a dedicated processor and a general-purpose processor.
  • the processor and the memory may be integrated into one device.
  • embodiments of the present application further provide a computing device cluster, the computing device cluster includes at least one computing device, and each computing device includes a processor and a memory;
  • the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, so that the cluster of computing devices executes the method described in any one of the first aspects.
  • embodiments of the present application provide a computing device, which includes a processor and a memory; the memory is used to store computer instructions, and the processor is used to execute the computer instructions stored in the memory, so that the The computing device implements the method described in any one of the second aspects, or implements the method described in any one of the fourth aspects.
  • embodiments of the present application provide a storage device, which includes a computing device and a storage disk connected to the computing device.
  • the connection may be through a wired line or a wireless line.
  • the two are connected via a bus.
  • the two are connected through a switch.
  • the computing device may be the computing device described in the sixteenth aspect.
  • embodiments of the present application provide a storage device, which includes a computing device and a storage disk connected to the computing device; the memory is used to store computer instructions, and the processor is used to execute the computer instructions stored in the memory, so that the computing device implements the method described in any one of the third aspects.
  • embodiments of the present application provide a storage device, which includes a computing device and a storage disk connected to the computing device.
  • the connection may be through a wired line or a wireless line.
  • the two are connected via a bus.
  • the two are connected through a switch.
  • the computing device may be the computing device described in the eighteenth aspect.
  • embodiments of the present application provide a computer-readable storage medium. Instructions are stored in the computer-readable storage medium. When the instructions are run on at least one processor, the method described in any one of the foregoing first aspects is implemented; or the method described in any one of the foregoing second aspects is implemented; or the method described in any one of the foregoing third aspects is implemented; or the method described in any one of the foregoing fourth aspects is implemented.
  • this application provides a computer program product.
  • the computer program product includes computer instructions; when the instructions are run on at least one processor, the method described in any one of the first aspects is implemented; or the method described in any one of the second aspects is implemented; or the method described in any one of the third aspects is implemented.
  • the computer program product can be a software installation package or image package. If the foregoing method needs to be used, the computer program product can be downloaded and executed on the computing device.
  • Figure 1 is a schematic architectural diagram of a data migration system provided by an embodiment of the present application.
  • Figure 2 is an architectural schematic diagram of a storage device provided by an embodiment of the present application.
  • Figure 3 is an architectural schematic diagram of another data migration system provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of an operating scenario of a data migration system provided by an embodiment of the present application.
  • Figure 5 is a view of a global file system provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of a metadata flow provided by an embodiment of the present application.
  • Figure 7 is a schematic diagram of changes to a metadata table provided by an embodiment of the present application.
  • Figure 8 is a schematic flowchart of a data migration method provided by an embodiment of the present application.
  • Figure 9 is a schematic diagram of a change of metadata of a first file provided by an embodiment of the present application.
  • Figure 10 is a schematic flow chart of another data migration method provided by an embodiment of the present application.
  • Figure 11 is a schematic diagram of yet another change of metadata of a first file provided by an embodiment of the present application.
  • Figure 12 is a schematic flow chart of another data migration method provided by an embodiment of the present application.
  • Figure 13 is a schematic structural diagram of a migration scheduling device provided by an embodiment of the present application.
  • Figure 14 is a schematic structural diagram of a computing device provided by an embodiment of the present application.
  • Figure 15 is a schematic structural diagram of a computing device provided by an embodiment of the present application.
  • a file system is a method and data structure used to identify files on a storage disk (such as a disk, solid state drive, or partition, etc.), that is, a method of organizing files on a storage disk.
  • the main function of the file system is to allow users to easily read and write files. For example, if the user provides the file system with the identification of a specified file (such as the name of the file, the path of the file, etc.), the file system can access the data of the corresponding file.
  • Files, data, and metadata: a file, or computer file, is a collection of information. A file contains data and metadata. Data is the content of the file; metadata is information describing the file, such as the file name, file size, and file type.
  • file data is generally unstructured data, such as documents, pictures, videos, audios, and other data without a fixed structure.
  • the metadata of a file can contain attribution information, which is used to specify the device to which the file belongs, such as the storage device to which the file belongs.
  • the device to which a file belongs is used to manage the data of the file, including but not limited to one or more of the following: maintaining the latest complete data of the file, publishing data changes when the data of the file changes, or releasing data (such as returning data to the application that requested it).
  • the metadata of the directory can also include ownership information, which is used to specify the ownership device of the directory.
  • a message queue is a data structure that can be understood as a list containing one or more messages. Messages are stored on the message queue before being processed and deleted. The message sender can interact with the message receiver through the message queue service. It should be understood that in this application, for convenience of description, data structures containing multiple messages are collectively called message queues, and it is not intended to limit the implementation of message queues through queues. For example, during specific implementation, message queues can also be implemented through lists, heaps, linked lists, or stacks.
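  • As a minimal sketch of the message queue described above, the following uses a double-ended queue as the underlying container. As the text notes, the container is an implementation choice; `collections.deque` is used here only for illustration, and the class name, method names, and message fields are hypothetical.

```python
from collections import deque


class MessageQueue:
    """Messages stay in the queue until they are processed and deleted."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        # The sender appends a message to the tail of the queue.
        self._messages.append(message)

    def peek(self):
        # The receiver reads the oldest message without removing it.
        return self._messages[0] if self._messages else None

    def delete(self):
        # Once processed, the oldest message is removed from the queue.
        return self._messages.popleft()

    def __len__(self):
        return len(self._messages)


queue = MessageQueue()
queue.send({"file": 7, "event": "layout-changed"})
queue.send({"file": 9, "event": "owner-changed"})

first = queue.peek()                   # message is read but still stored
remaining_before_delete = len(queue)
queue.delete()                         # processing finished, message removed
```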
  • Data migration refers to the process of migrating data from one device (source device) to another device (destination device).
  • Hierarchical storage management (HSM), i.e. data tiering:
  • data migration does not concern itself with how the source device handles the data stored on it, whereas data tiering requires deleting the data stored on the source device to free up storage space.
  • a file system is a system that provides data storage and access services.
  • a system with similar characteristics is not necessarily named a file system.
  • this application takes the migration of data in a file system as an example, and is also applicable to other similar systems.
  • for example, some object systems store and access data in an object format. Data stored in the form of objects also has corresponding metadata, so the embodiments of this application are also applicable to object systems.
  • Embodiments of the present application provide a data migration method and device.
  • the method triggers data migration of a file based on changes to the metadata of the file (such as changes to the file's ownership information or storage layout information), and the migration progress of the file's data can also be reflected by the status of the file's metadata.
  • For the migration scheduling device, there is no need to establish access security control with the source device and the destination device for the migration, which simplifies the security control process of data migration.
  • The source device and the destination device indicate completed operations to each other by updating the file's ownership information and storage layout information, which improves the efficiency of data migration and the convenience of data use and management.
  • the embodiments of the present application can not only improve the efficiency of data migration, but also realize the decoupling of various devices during the migration process, greatly improving the flexibility and scalability of the business system.
  • Figure 1 is a schematic architectural diagram of a data migration system provided by an embodiment of the present application.
  • the data migration system 10 includes a storage device 101 and a storage device 102.
  • the storage device 101 can provide storage space and has the ability to store data.
  • the storage device 101 includes a computing device 1011 and a storage disk 1012, and the computing device 1011 and the storage disk 1012 are connected.
  • the computing device 1011 has computing power;
  • the storage disk 1012 is used to provide storage space, and the storage disk 1012 can store file data.
  • the storage disk 1012 includes but is not limited to a hard disk, a random access memory, a read only memory (ROM), etc.
  • the storage disk can also be virtual, such as a virtual storage pool.
  • the computing device 1011 can complete one or more of the following functions: obtaining metadata of a file, controlling the reading and writing of data in the storage disk 1012, or changing metadata.
  • the storage device 102 includes a computing device 1021 and a storage disk 1022, and the computing device 1021 and the storage disk 1022 are connected.
  • the computing device 1021 has computing capabilities; the storage disk 1022 is used to provide storage space.
  • for the computing device 1021 and the storage disk 1022, refer to the foregoing description of the computing device 1011 and the storage disk 1012.
  • File data often needs to be migrated across devices, for example, file data stored in the storage device 101 is migrated to the storage device 102 .
  • the embodiment of the present application can control data migration based on the status of the file.
  • the metadata of a file includes ownership information and storage layout information of the file.
  • the ownership information indicates the device to which the file belongs, such as the storage device to which the file belongs; and the storage layout information of the file indicates the device in which the file is stored.
  • for example, when the device to which a file belongs is the storage device 101 and the data of the file is stored on the storage device 101, the storage layout information of the file indicates the storage device 101.
  • when the metadata of a file changes, migration of the file's data can be triggered. For example, when the ownership information of a file is changed from indicating the storage device 101 to indicating the storage device 102, the storage device 101 and/or the storage device 102 are triggered to migrate the data of the file from the storage device 101 to the storage device 102.
  • in this way, changing the state of the file is enough to trigger the data migration operation. There is no need to establish, for the purpose of migration, data-access security control between the source device (the device from which the data is migrated) and the destination device (the device to which the data is migrated), which simplifies the security control process of data migration, improves the efficiency of data migration, and makes data easier for users to use and manage.
  • state-based data migration can further decouple the functions of each device, greatly improving the flexibility and scalability of the business system.
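  • The state-based trigger described above can be sketched as a check that each device runs when it observes a metadata change. The sketch assumes that a migration is pending whenever the owning device is not yet named in the storage layout; the function names and the `"pull"` / `"push"` / `"none"` labels are hypothetical.

```python
def needs_migration(meta):
    # Assumption: a migration is pending when the device the file belongs to
    # does not yet appear in the file's storage layout.
    return meta["owner"] not in meta["layout"]


def on_metadata_change(meta, local_device):
    """Decide what `local_device` does after observing a metadata change."""
    if not needs_migration(meta):
        return "none"
    if meta["owner"] == local_device:
        return "pull"   # destination side: obtain the data
    if local_device in meta["layout"]:
        return "push"   # source side: provide the data
    return "none"       # uninvolved device


# Ownership changed from storage device 101 to 102; data still on 101.
meta = {"owner": "102", "layout": {"101"}}
actions = {device: on_metadata_change(meta, device) for device in ("101", "102", "103")}
```

  • Each device derives its role from the metadata alone, so no device needs to instruct another one directly.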
  • the storage device 101 can push the file data to a designated device, so that the storage device 102 obtains the file from the designated device to implement data migration.
  • the designated device may be a customized device that can provide data storage services (also called shared storage area) for the storage device 101 and the storage device 102 .
  • the data storage service can be provided by the global data service, or by a third-party temporary storage device or intermediate device.
  • the computing devices (e.g., the computing device 1011 and the computing device 1021) in the storage devices may be implemented by software and/or by hardware.
  • the computing device may be a controller, a processor, a server, etc.
  • controllers include but are not limited to storage controllers (such as memory controllers, hard disk controllers, integrated drive electronics (IDE) controllers, disk array controllers, etc.), combinational logic controllers, hard-wired controllers, etc.
  • Processors include but are not limited to central processing units, graphics processors, artificial intelligence processors, microprocessors, or field-programmable gate arrays.
  • the controller can also be regarded as a processor because the controller also has computing capabilities and/or can execute instructions.
  • Servers include but are not limited to general computers, storage servers, cloud servers or blade servers. When the functions of the computing device are implemented by servers, the number of servers included may be one or multiple (such as a server cluster).
  • the functions implemented by the computing device can be implemented through software functional units.
  • the computing device can be a virtual machine, a container, a cloud, etc.
  • a virtual machine is a computer system simulated by software that has complete hardware system functions and runs in an isolated environment.
  • a container is an isolated environment that packages applications and application dependencies.
  • the cloud is a software platform that uses application virtualization technology, which allows one or more software and applications to be developed and run in an independent virtualized environment.
  • the cloud can be deployed on public cloud, private cloud, or hybrid cloud.
  • a computing device may include code that runs on a computing instance.
  • the computing instance may include at least one of a physical host (computing device), a virtual machine, and a container.
  • the computing device and the storage disk in the storage device can be integrated.
  • the storage device is a storage system in which the disks and the controller are integrated.
  • the storage device includes a controller (the number may be one or more), and the controller is connected to the storage disk (such as a hard disk) through a bus.
  • the controller can be used to process data access requests from outside the storage device (server or other storage system), and can also be used to process requests generated within the storage device. For example, when the controller receives the write data request sent by the application server, the controller may send the data carried in the write data request to the storage disk for storage.
  • the computing device may be a controller in the storage device.
  • FIG. 2 shows an architectural schematic diagram of a possible storage device provided by an embodiment of the present application.
  • the computing device 1021 in the storage device 102 is an independent device, and the storage disk 1022 is located outside the computing device 1021.
  • the computing device and the storage disk are connected to each other, and the connection method between the two can be a bus or a network.
  • the network is, for example, a wired network, a wireless network, a combination of a wired network and a wireless network, etc.
  • the two can be connected through a network cable or through a switch.
  • the change of the file's ownership information can be performed by the storage device 101, or by the storage device 102, or by other devices.
  • the data migration system further includes a migration scheduling device, which is used to determine a migration task and change the metadata of the file based on the migration task.
  • the data migration system 30 includes a storage device 101, a storage device 102, and a migration scheduling device 301.
  • the migration scheduling device 301 can change the metadata of the file, and the change of metadata can be obtained by the storage device 101 and/or the storage device 102, thereby triggering the migration of the data of the file.
  • the migration scheduling device may also be called a data scheduling engine.
  • the migration scheduling device 301 can determine the migration task for the file based on the input information, and change the metadata based on the migration task.
  • the input information may be one or more of external event information, user input instructions, metadata analysis results, etc.
  • the migration scheduling device 301 may include a migration strategy module.
  • the migration strategy module is used to implement hierarchical migration strategies. These migration strategies may be defined by predefined algorithms (such as AI modules) or preset rules. Furthermore, the migration strategy module can also determine migration tasks based on the input information and the hierarchical migration strategies.
  • the migration scheduling device 301 may determine multiple tasks. These tasks may be for the same file or for different files. Whether the tasks conflict with one another, the order in which they are executed, and similar factors all affect the success rate and efficiency of task execution.
  • the migration scheduling device 301 may include a task orchestration module, which is used to orchestrate multiple tasks. Further, orchestrating tasks can include determining the execution order and execution priority of tasks, such as which file to migrate first, etc. In addition, the migration scheduling device can also merge multiple tasks in the process of arranging tasks.
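  • A minimal sketch of task orchestration, assuming one simple policy: tasks for the same file are merged into a single task that goes from the first source to the last destination, and the merged tasks are then ordered by priority. The policy, the `MigrationTask` fields, and the rule that a larger priority value runs earlier are all assumptions for illustration, not the application's method.

```python
from dataclasses import dataclass


@dataclass
class MigrationTask:
    file_id: int
    source: str
    destination: str
    priority: int = 0   # assumption: larger values are executed earlier


def orchestrate(tasks):
    """Merge tasks that target the same file, then order them by priority."""
    merged = {}
    for task in tasks:
        previous = merged.get(task.file_id)
        if previous is None:
            # Copy the task so the caller's objects are not modified.
            merged[task.file_id] = MigrationTask(
                task.file_id, task.source, task.destination, task.priority
            )
        else:
            # Chain S1->S2 and S2->S3 into a single S1->S3 task.
            previous.destination = task.destination
            previous.priority = max(previous.priority, task.priority)
    return sorted(merged.values(), key=lambda t: -t.priority)


tasks = [
    MigrationTask(1, "S1", "S2", priority=1),
    MigrationTask(2, "S1", "S3", priority=5),
    MigrationTask(1, "S2", "S3", priority=2),
]
plan = orchestrate(tasks)
```

  • Merging avoids moving the data of file 1 twice, and ordering lets the more urgent task for file 2 run first.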
  • the migration scheduling device 301 may include a migration task management module.
  • the migration task management module is used to track the execution progress of the task and facilitate the acquisition of the task execution status.
  • tasks that progress slowly or fail can be processed as soon as possible to improve system stability.
  • the data migration system 30 also includes a metadata analysis device 302.
  • the metadata analysis device is also called a metadata analysis engine.
  • the migration scheduling device 301 can also be implemented by software and/or by hardware; the metadata analysis device 302 can also be implemented by software and/or by hardware.
  • the migration scheduling device 301 and the metadata analysis device 302 can be set up independently or integrated into the same device.
  • the metadata analysis device 302 is used to analyze the metadata of the file and provide the analysis results to the migration scheduling device 301 .
  • the migration scheduling device determines the migration task for the file based on the analysis results.
  • the analysis results include one or more of the following information: the hot or cold status of the file, the security of the file, or the business related to the file, etc.
  • the following takes data migration based on the hot and cold status of files as an example to illustrate a possible operation scenario of the data migration system.
  • Figure 4 shows a schematic diagram of an operating scenario of a data migration system provided by an embodiment of the present application.
  • the data scheduling engine includes one or more of a migration strategy module, a task orchestration module, and a migration task management module.
  • the data scheduling engine may determine the migration task for the file based on one or more of external event information, user input, metadata analysis results, etc.
  • the metadata analysis engine may include a data hot and cold profiling module.
  • the data hot and cold profiling module is used to determine the hot and cold degree of data of files in the file system.
  • the degree of hotness and coldness of data includes three levels: hot data, warm data and cold data. Hot data is data that is accessed more frequently, followed by warm data and cold data.
  • a file's metadata can be included in the global file system's metadata.
  • the global file system refers to a file system obtained by combining files (or file systems) on multiple storage devices. It is also called a joint file system.
  • Figure 4 shows a possible global file system.
  • the directories are represented by a box pattern (for the convenience of distinction, the root directory is a diamond), and the files are represented by a circular pattern.
  • the number in the middle of the shape is the node number corresponding to the directory and file.
  • the name outside the shape is the name of the directory or the name of the file. It should be understood that the numbers of the nodes, the order of the files (or directories), the names of the files (or directories), etc. are only examples and are not intended to limit this application.
  • storage device S1 is a storage device that focuses on high access speed.
  • Storage device S3 is a large-capacity storage device.
  • the storage device S2 is a storage device with medium capacity and access speed. It is not difficult to see that storage device S1 is suitable for storing hot data and is conducive to high-speed file access.
  • Storage device S2 is suitable for storing warm data.
  • Storage device S3 is suitable for storing cold data.
  • the metadata analysis engine can provide the data scheduling engine with the hotness and coldness of the data.
  • the data scheduling engine determines the migration task for the file based on the hotness and coldness of the data, and modifies the metadata of the file based on the migration task.
  • after the storage device obtains the modified metadata of the file, it performs data migration based on the metadata changes.
  • the hot data is migrated to the storage device S1 for storage, the warm data is migrated to the storage device S2 for storage, and the cold data is migrated to the storage device S3 for storage, which achieves optimal data access performance and optimal storage cost.
  • the data scheduling engine changes the ownership device of the file named "001.png" to storage device S2, causing the data of the file to be migrated to storage device S2 for storage.
  • the data scheduling engine changes the ownership device of the file named "002.png" to storage device S3, so that the data of the file is migrated to storage device S3 for storage.
  • the metadata of the file system was mentioned in the introduction to the architecture above.
  • the format of the metadata of the file system is introduced below.
  • the file's metadata is contained in the file system's metadata. Since file attributes will be changed, the file system's metadata needs to support dynamic changes in metadata.
  • the metadata of the file system is in a streaming structure and contains multiple metadata records.
  • Each metadata record contains the identification of the file and the attributes of the file.
  • the attributes of the file include, for example, one or more of the file's ownership information, the file's storage layout information, the file creation time, etc.
  • the streaming structure is a data structure containing multiple pieces of information, and each piece of information is a metadata record.
  • the streaming structure has the following characteristics: read-only, append-only, and ordered. "Read-only" means that the values of the records in the streaming structure can only be read and cannot be modified; "append-only" means that new records can only be appended and existing records cannot be deleted (or modified), although multiple records belonging to the same file (or directory) can be merged into one record; "ordered" means that the records in the streaming structure have a logical order, and appended records are added at the end of the streaming structure.
  • In stream-structured metadata (hereinafter referred to as the metadata stream), when the metadata of a file in the file system changes, a metadata record is appended to the end of the metadata stream. Other devices can obtain changes in file attributes by reading the newly appended metadata record at the end of the metadata stream.
  • Figure 6 is a schematic diagram of a metadata flow provided by an embodiment of the present application.
  • the ownership information of the file named "001.png" contains the identification of storage device S1 (i.e., S1); the storage layout information is {S1:1, S2:0}, indicating that the file data is stored on the storage device S1 and is not stored on the storage device S2. That is, the file belongs to the storage device S1, and the data of the file is stored in the storage device S1.
  • a metadata record 602 is appended to the end of the metadata stream.
  • the ownership information of the file has been changed to the identification of the storage device S2 (ie, S2).
  • the ownership information of the file has been changed, but the file data is still stored in the storage device S1. Therefore, the file data needs to be migrated from the storage device S1 to the storage device S2.
  • a migration operation is performed in response to the change of the file ownership device.
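The metadata stream behaviour described above can be sketched as follows. This is a minimal illustration rather than the format used by the application: the class names `MetadataRecord` and `MetadataStream`, the field names, and the helper `needs_migration` are all hypothetical. The sketch appends a second record for "001.png" that changes the owner to S2 while the layout still points at S1, which is the condition that calls for a migration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MetadataRecord:
    """One record in the stream (field names are illustrative assumptions)."""
    file_id: str
    owner: str               # identification of the device the file belongs to
    layout: Dict[str, int]   # device id -> 1 if the device stores the file data


class MetadataStream:
    """Ordered stream of records: existing records are never modified."""

    def __init__(self) -> None:
        self._records: List[MetadataRecord] = []

    def append(self, record: MetadataRecord) -> None:
        # New records are only ever added at the end of the stream.
        self._records.append(record)

    def latest(self, file_id: str) -> Optional[MetadataRecord]:
        # The most recently appended record for a file reflects its current state.
        for record in reversed(self._records):
            if record.file_id == file_id:
                return record
        return None


def needs_migration(record: MetadataRecord) -> bool:
    """True when the owning device does not yet store the file data."""
    return record.layout.get(record.owner, 0) != 1


stream = MetadataStream()
stream.append(MetadataRecord("001.png", "S1", {"S1": 1, "S2": 0}))
# The ownership changes to S2; the data is still on S1.
stream.append(MetadataRecord("001.png", "S2", {"S1": 1, "S2": 0}))
latest = stream.latest("001.png")
print(latest.owner, needs_migration(latest))
```

Reading the stream from the end keeps the lookup consistent with the "ordered" property: the last record for a file is the authoritative one.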
  • the metadata of the file system is a table structure.
  • the metadata in a tabular structure (hereinafter referred to as the metadata table) includes rows and columns. Each row (or each column) contains the values of multiple attributes, and the value of each attribute corresponds to an attribute.
  • the metadata table can add a row (or column) of metadata, delete a row (or column) of metadata, or modify the values of existing attributes.
  • in the metadata table, when the metadata of a file in the file system changes, the value of an existing field in the metadata table is modified. Other devices learn of changes to file attributes by obtaining the modified values of the file's metadata.
  • the file named "001.png" contains the identification of the storage device S1 (i.e., S1);
  • the storage layout information is {S1:1, S2:0}, indicating that the file data is stored on the storage device S1 and is not stored on the storage device S2. That is, the file belongs to the storage device S1, and the data of the file is stored in the storage device S1.
  • a migration operation is performed in response to the change of the ownership information of the file.
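For contrast with the stream structure, the table structure modifies values in place. The following sketch uses an in-memory SQLite table purely for illustration; the choice of SQLite, the table name `file_metadata`, and the column names are assumptions and are not taken from the application.

```python
import sqlite3

# In-memory table standing in for the metadata table (schema is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_metadata (file_id TEXT PRIMARY KEY, owner TEXT, layout TEXT)"
)
conn.execute(
    "INSERT INTO file_metadata VALUES (?, ?, ?)",
    ("001.png", "S1", "S1:1,S2:0"),
)

# The change modifies the existing ownership value instead of appending a record.
conn.execute(
    "UPDATE file_metadata SET owner = ? WHERE file_id = ?",
    ("S2", "001.png"),
)

owner, layout = conn.execute(
    "SELECT owner, layout FROM file_metadata WHERE file_id = ?",
    ("001.png",),
).fetchone()
print(owner, layout)
```

After the update the owner is S2 while the layout still records the data on S1, which is the mismatch that the storage devices respond to.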
  • the directory ownership information is not shown.
  • the directory may also have ownership information, which is used to indicate the device to which the directory belongs.
  • the directory can be owned by a designated device, that is, the designated device can uniformly maintain the directories in the file system.
  • the above format of the file system's metadata is only an example, and during specific implementation, the file system's metadata may be in other formats.
  • Figure 8 is a schematic flowchart of a data migration method provided by an embodiment of the present application.
  • this method can be applied to the aforementioned data migration system, such as the data migration system shown in Figure 1, Figure 3, or Figure 5.
  • the data migration method shown in Figure 8 may include one or more steps from step S801 to step S806. It should be understood that this application describes the sequence of S801 to S806 for convenience of description, and is not intended to limit execution to the above sequence. The embodiments of the present application do not limit the execution sequence, execution time, number of executions, etc. of one or more of the above steps.
  • the details of steps S801 to S806 are as follows:
  • Step S801 The migration scheduling device determines the migration task for the first file.
  • the migration task indicates migrating the first file from the first storage device to the second storage device.
  • the first file may be one file or multiple files.
  • the migration task instructs to migrate data of multiple files in a certain directory to a second storage device.
  • the migration task includes the identification of the first file, the identification of the first storage device, and the identification of the second storage device.
  • the identifier of the first file is used to indicate the file targeted by the migration task or the directory to which the file belongs, the identifier of the first storage device is the identifier of the source device for migration, and the identifier of the second storage device is the identifier of the target device for migration.
  • the migration scheduling device may determine the migration task for the file based on the input information.
  • the following are examples of designs that determine migration tasks based on input information:
  • Design 1 The migration scheduling device determines the migration task for the first file based on external event information.
  • External events refer to events that occur outside the migration scheduling device and/or business system.
  • External event information includes but is not limited to one or more of network connection status, device health status, or personnel transfer status.
  • the network connection status can be understood as the online status, which is used to describe whether the device can be perceived by other devices. For example, when communication on a certain line is interrupted and communication at location A is predicted to be affected (or the equipment at location A may not be sensed, or the communication rate is affected), the data at location A can be migrated to location B.
  • Device health can describe the current storage capacity of the device, or describe the current fault status of the device.
  • the access speed of a device may decrease with the duration and number of uses.
  • in that case, the data on the storage device can be migrated to other storage devices. As another example, when a storage device fails, the data on the storage device is migrated to other storage devices.
  • the personnel transfer status related to the first file includes a change in location of the data owner or data manager. For example, when the R&D team members of a business travel to a different location, migration can be triggered to migrate the data of the business to a storage device closer to the business trip destination.
  • the migration scheduling device can determine the migration task according to external events by presetting the data migration strategy.
  • Data migration strategies can be algorithms, preset rules or conditions set by users.
  • when external event information meets the conditions for triggering data migration, the corresponding migration task is determined, which achieves intelligent migration that integrates multiple information flows and improves the user experience.
  • the conditions that trigger data migration can be defined by the migration policy.
  • Migration strategies can be implemented through algorithms or rules.
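A rule-based migration strategy of this kind might look like the sketch below. The `Rule` class, the event keys (`health`, `network`), and the target devices chosen for each rule are hypothetical and serve only to show how a condition on external event information can select a migration target.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    """One policy rule: a trigger condition and a target device (illustrative)."""
    name: str
    condition: Callable[[Dict[str, str]], bool]
    target_device: str


def evaluate(event: Dict[str, str], rules: List[Rule]) -> Optional[str]:
    """Return the target of the first rule whose condition the event meets."""
    for rule in rules:
        if rule.condition(event):
            return rule.target_device
    return None  # no rule triggered, so no migration task is determined


# Hypothetical rules for device health and network connection status.
rules = [
    Rule("device-failed", lambda e: e.get("health") == "failed", "S3"),
    Rule("link-down", lambda e: e.get("network") == "offline", "S2"),
]

print(evaluate({"device": "S1", "health": "failed"}, rules))
print(evaluate({"device": "S1", "health": "ok", "network": "online"}, rules))
```

Evaluating rules in list order gives a simple form of priority; an algorithmic strategy could replace `evaluate` without changing the callers.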
  • Design 2 The migration scheduling device determines the migration task for the first file based on the metadata analysis result.
  • the metadata analysis result is the result of analyzing the metadata of the first file (or the metadata of the target file system).
  • the metadata analysis result includes but is not limited to one or more of the hot or cold status of the first file, the security of the first file, or the business related to the first file.
  • the hot or cold status of a file can be indicated by how frequently the file is accessed. For example, the metadata of the first file includes an attribute indicating the number of accesses within a period of time. If the number of accesses is greater than or equal to the first threshold, the data of the first file is migrated to a device with high access speed (for example, the second storage device), which improves the efficiency of accessing the data of the first file and the service quality of the system. Similarly, if the number of accesses of the first file is less than or equal to the second threshold, the data of the file is migrated to a device with large storage capacity to reduce storage costs.
  • the first threshold and the second threshold here may be input by an administrator (such as a developer, management department, etc.), a manufacturer, etc., or may be preset.
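The threshold comparison can be sketched as follows. The threshold values (100 and 5), the function names, and the treatment of anything between the two thresholds as warm data are assumptions made for illustration; the mapping of hot, warm, and cold data to S1, S2, and S3 follows the operating scenario described earlier.

```python
from typing import Dict, Optional

# Tier mapping taken from the example scenario: S1 fast, S2 medium, S3 large.
TIER_DEVICE = {"hot": "S1", "warm": "S2", "cold": "S3"}


def classify(access_count: int, first_threshold: int, second_threshold: int) -> str:
    """Classify a file by its access count within a period of time."""
    if access_count >= first_threshold:
        return "hot"
    if access_count <= second_threshold:
        return "cold"
    return "warm"  # assumption: values between the thresholds are warm


def plan_migration(
    file_id: str,
    current_device: str,
    access_count: int,
    first_threshold: int = 100,  # hypothetical value
    second_threshold: int = 5,   # hypothetical value
) -> Optional[Dict[str, str]]:
    """Return a migration task, or None if the file is already on the right tier."""
    target = TIER_DEVICE[classify(access_count, first_threshold, second_threshold)]
    if target == current_device:
        return None
    return {"file": file_id, "source": current_device, "target": target}


print(plan_migration("001.png", "S2", 250))
print(plan_migration("002.png", "S2", 1))
print(plan_migration("003.png", "S2", 50))
```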
  • the metadata of the first file may include an attribute indicating the security level of the file. For example, if the security level of the first file is high and the security level of the first storage device does not meet the security level requirements of the first file, then the data of the first file is migrated to a device that can meet the security level requirements of the first file, which effectively protects users' file security needs and improves the system's service quality.
  • the metadata of the first file includes attributes representing services related to the first file.
  • for example, the services related to the first file may be in-vehicle services, video services, or file download services.
  • if the first file is used to store in-vehicle service data, then when the in-vehicle service data needs to be migrated to the second storage device, the data of the first file is migrated accordingly. In this way, users can migrate files based on different businesses, which improves the convenience for users to manage business data and improves the service quality of the system.
  • metadata analysis results can indicate file migration requirements (such as access requirements, security requirements, business requirements, etc.), and migration tasks are determined based on migration requirements to achieve overall storage optimization.
  • users can express file migration needs by updating the file's metadata, realizing intelligent management of data and improving user convenience in data use and management.
  • Design 3 The migration scheduling device determines the migration task according to the migration instruction for the first file input by the user. For example, if the instruction information input by the user instructs to migrate the data of the first file to the second storage device, the migration scheduling device can determine the migration task for the first file in response to this instruction information.
  • users can migrate a certain file by entering migration instructions, which can meet the user's personalized needs and improve the user experience.
  • the above three designs are only examples, and the input information may also include other information during the specific implementation process.
  • the above three designs can also be combined without being mutually exclusive, and the combination will not be described in detail here.
  • the migration task for the first file is part of multiple tasks determined by the migration scheduling device.
  • the migration scheduling device can determine multiple tasks and arrange the multiple tasks so that the multiple tasks can be executed reasonably and orderly.
  • the migration scheduling device may determine the migration task for the second file, and arrange the migration task for the first file and the migration task for the second file.
  • arranging tasks may include one or more of the following operations: determining the execution order of tasks, determining execution priority, or merging multiple tasks, etc.
  • the migration scheduling device can determine, based on the priority of needs, to prioritize the migration tasks of some files to improve the user experience. For example, files whose access frequency surges in a short period of time can be prioritized for migration to increase the file's access rate as quickly as possible.
  • for example, task A instructs to migrate the first file from the first storage device to the second storage device, and task B instructs to migrate the first file from the first storage device to the third storage device. Task A and task B can be merged to obtain a new task, and the new task indicates migrating the first file from the first storage device to the third storage device.
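The merging of task A and task B can be sketched as below. The `MigrationTask` class and the rule used here (for the same file, keep the original source and the target of the most recent task) are assumptions chosen to reproduce the example; the application does not prescribe a particular merge rule.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MigrationTask:
    """A migration task: file identifier, source device, target device."""
    file_id: str
    source: str
    target: str


def merge_tasks(tasks: List[MigrationTask]) -> List[MigrationTask]:
    """Merge tasks for the same file so that each file is migrated only once.

    Assumed rule: keep the source of the earliest task and the target of the
    latest task for that file.
    """
    merged: Dict[str, MigrationTask] = {}
    for task in tasks:
        previous = merged.get(task.file_id)
        if previous is None:
            merged[task.file_id] = task
        else:
            merged[task.file_id] = MigrationTask(
                task.file_id, previous.source, task.target
            )
    return list(merged.values())


task_a = MigrationTask("first_file", "S1", "S2")
task_b = MigrationTask("first_file", "S1", "S3")
print(merge_tasks([task_a, task_b]))
```

Merging before execution avoids moving the same data twice when a later task supersedes an earlier one.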
  • Step S802 The migration scheduling device performs a first change on the metadata of the first file.
  • the first change indicates that the storage device to which the first file belongs is changed from the first storage device to the second storage device.
  • the ownership of the first file is indicated by ownership information, and the ownership information includes an identification of the device to which the file belongs.
  • the ownership information of the file is an identification of the first storage device, that is, the device to which the file belongs is the first storage device.
  • the migration scheduling device performs a first change on the metadata of the first file, specifically: the migration scheduling device changes the ownership information of the first file from the identification of the first storage device to the identification of the second storage device.
  • the ownership information is indicated by the value of the field. When the value of the field corresponding to the second storage device in the metadata of the first file is the first value, it means that the second storage device is the home device of the first file.
  • the first value may be predefined or preconfigured.
  • the migration scheduling device may change the metadata of the first file by changing the metadata of the target file system.
  • the target file system here refers to the file system to which the first file belongs. It should be understood that the target file system is a specific file system or a group of specific file systems.
  • the metadata of the target file system is in a streaming structure.
  • the migration scheduling device may perform the first change by adding a metadata record to the metadata stream.
  • the migration scheduling device appends a first metadata record at the end of the metadata of the target file system.
  • the first metadata record includes the identification of the first file and the ownership information of the first file.
  • the ownership information of the first file includes the identification of the second storage device.
  • the metadata of the target file system is in a tabular structure.
  • the migration scheduling device may modify the ownership information of the first file into the identification of the second storage device by modifying the metadata of the first file in the metadata table of the target file system.
  • the target file system's metadata is synchronized across multiple devices, or multiple devices share the metadata of the target file system. Synchronization here means that the metadata of the target file system can be modified by any one of the multiple devices, the modified content can be learned by the multiple devices, and the metadata of the target file system learned by the multiple devices is consistent. Therefore, when the migration scheduling device performs the first change on the metadata of the first file, the devices that share the metadata of the target file system can obtain the first changed metadata of the first file.
  • the following describes possible implementations in which the metadata of the above target file system is synchronized between multiple devices.
  • Implementation method 1 The metadata of the target file system is stored locally on multiple devices.
  • When a certain device makes changes to the metadata of the target file system, the device notifies other devices that store the metadata of the target file system of the change. Other devices change the locally stored metadata of the target file system accordingly based on the notification, thereby achieving synchronization of the metadata of the target file system on multiple devices.
  • the multiple devices here include a migration scheduling device, a first storage device and a second storage device.
  • the migration scheduling device, the first storage device, and the second storage device all store metadata of the target file system locally.
  • the migration scheduling device may send a first notification, and the first notification indicates that the metadata of the first file has undergone a first change.
  • the first storage device and the second storage device correspondingly change the locally stored metadata of the target file system based on the first notification, thereby achieving synchronization of the metadata of the target file system on multiple devices.
  • the first notification indicates what changes have occurred in the metadata of the first file.
  • the first notification may include the content of the first change, such as the identification of the first file and the attributes (or values of attributes) of the file changed by the first change.
  • the first storage device and the second storage device can obtain the metadata of the first file after the first change based on the metadata before the first change and the content of the first change, and perform the migration task based on the metadata of the first file after the first change.
  • the first notification may include the first changed metadata of the first file.
  • the first storage device and the second storage device may perform the migration task according to the first changed metadata of the first file.
  • the first notification indicates that the first change has occurred, but the first notification does not include the specific content of the first change and/or the metadata of the first file after the first change.
  • the first storage device and the second storage device request the specific content of the first change and/or the metadata after the first change of the first file from the migration scheduling device, and based on the information provided by the migration scheduling device. Change the locally stored metadata of the target file system accordingly.
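The three forms of the first notification described above (carrying the content of the change, carrying the changed metadata, or carrying neither) can be handled on the receiving device roughly as follows. The message keys `file_id`, `changes`, and `metadata`, and the callback `fetch_from_scheduler`, are hypothetical names used only for this sketch.

```python
from typing import Callable, Dict


def apply_notification(
    local_metadata: Dict[str, dict],
    notification: dict,
    fetch_from_scheduler: Callable[[str], dict],
) -> dict:
    """Update the locally stored metadata according to the first notification."""
    file_id = notification["file_id"]
    if "metadata" in notification:
        # The notification carries the complete changed metadata of the file.
        local_metadata[file_id] = dict(notification["metadata"])
    elif "changes" in notification:
        # The notification carries only the changed attributes; combine them
        # with the metadata held before the change.
        local_metadata.setdefault(file_id, {}).update(notification["changes"])
    else:
        # The notification only signals that a change occurred; request the
        # details from the migration scheduling device.
        local_metadata[file_id] = dict(fetch_from_scheduler(file_id))
    return local_metadata[file_id]


local = {"001.png": {"owner": "S1", "layout": {"S1": 1, "S2": 0}}}
result = apply_notification(
    local,
    {"file_id": "001.png", "changes": {"owner": "S2"}},
    fetch_from_scheduler=lambda file_id: {},
)
print(result)
```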
  • Implementation method 2 The metadata of the target file system is stored in the designated device, and the designated device provides access to and updates of the metadata of the target file system.
  • When a device changes the metadata of the target file system, the change is provided to the designated device.
  • Multiple devices can obtain the changed metadata of the target file system from the designated device, thereby achieving synchronization of the metadata of the target file system across multiple devices. It should be understood that the multiple devices here include a first storage device and a second storage device, and optionally include a migration scheduling device.
  • Methods for triggering multiple devices to obtain the metadata of the target file system include but are not limited to: the multiple devices actively (periodically or aperiodically) reading from the designated device; the designated device notifying the multiple devices to read (for example, through a message queue); the designated device actively publishing changes; the device that changes the metadata of the target file system notifying the multiple devices to read from the designated device; or the device that changes the metadata of the target file system notifying the devices related to the change to read.
  • the device that manages the metadata of the target file system can provide the first storage device and the second storage device with the modified metadata of the first file when the metadata of the first file is changed, so that they learn of the change in the ownership information of the first file, which triggers the first storage device and the second storage device to perform the migration task.
  • the source device can periodically read the metadata of the target file system to learn of the change in the ownership information of the first file, which triggers the push operation for the data of the first file.
  • the destination device can respond to the data scheduling notification, thereby reading the metadata of the file system, learning about changes in the ownership information of the first file, and triggering a pull operation for the data of the first file.
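Once a storage device has read the current metadata, its decision to push or pull can be sketched as a comparison between the ownership information and the storage layout information. The function name `decide_action` and the dictionary layout of the metadata are hypothetical; how the device came to read the metadata (periodic polling or a notification) is outside this sketch.

```python
from typing import Dict


def decide_action(device_id: str, metadata: Dict) -> str:
    """Decide what a storage device does after reading a file's metadata.

    Returns "push" for a device that stores data it no longer owns, "pull"
    for the new owner that does not yet store the data, and "none" otherwise.
    """
    holds_data = metadata["layout"].get(device_id, 0) == 1
    is_owner = metadata["owner"] == device_id
    if holds_data and not is_owner:
        return "push"
    if is_owner and not holds_data:
        return "pull"
    return "none"


# Metadata of the first file after the first change: owner S2, data still on S1.
changed = {"owner": "S2", "layout": {"S1": 1, "S2": 0}}
for device in ("S1", "S2", "S3"):
    print(device, decide_action(device, changed))
```

Because both sides derive their role from the same metadata, the source and the destination do not need to be told separately which of them should act.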
  • the migration scheduling device performs a first change on the metadata of the target file system.
  • the first change is provided to the designated device.
  • the designated device updates the metadata of the target file system based on the first change.
  • the first storage device and the second storage device may read the metadata of the target file system that has undergone the first change from the designated device.
  • the aforementioned designated device may be a global metadata service, or may be a first storage device, a second storage device, a migration scheduling device, etc.
  • the target file system's metadata is stored in a global metadata service.
  • the metadata of the file system is managed through the global metadata service, so that multiple devices can read or write metadata according to the format of the metadata in the global metadata service. This unifies the representation of file metadata and shields the differences in metadata management and access control between heterogeneous storage devices, which not only improves the convenience of data use and management for users, but also improves the scalability and flexibility of the system.
  • any one of the multiple devices can dynamically obtain the changed metadata of the target file system (that is, it can obtain the current metadata of the target file system).
  • the global metadata service can provide a service interface, and the device can call the service interface to access and update metadata.
  • the service interface is a communication interface, such as an application programming interface (API), which can be used to interact with data and provide services between different functional modules.
  • the caller and the implementer can be decoupled.
  • the device calling the service interface can provide relevant data according to the requirements of the service interface, and the global metadata service can obtain the relevant data through the service interface and implement the corresponding functions, which not only improves the efficiency of accessing and updating metadata, but also improves the scalability and flexibility of the system.
  • the migration scheduling device calls a service interface provided by the global metadata service to make the first change to the metadata of the first file.
  • the service interface may be provided by the global metadata service to the migration scheduling device.
  • the migration scheduling device may directly call the service interface.
  • the service interface may be provided by a global metadata service to devices that share the metadata stream of the target file system.
  • the migration scheduling device can call the service interface on any device that shares the metadata stream of the target file system.
  • the global metadata service provides a service interface to the first storage device, and the migration scheduling device can call the service interface on the first storage device to implement the first change.
  • the file's ownership information can be included in the file's metadata as a basic attribute of the file.
  • the migration scheduling device can change the ownership information in the metadata of the file by executing a customized instruction or a preset instruction for modifying attributes.
  • the ownership information of the file can be included in an extended attribute field in the metadata of the file as an extended attribute of the file, such as the xattr field, the tags field, etc.
  • the migration scheduling device can change the extended attribute field in the metadata of the file by executing a customized instruction.
  • the migration scheduling device can change the file's ownership information through a preset instruction to modify xattr.
  • the migration scheduling device may send a notification (hereinafter referred to as the first notification for ease of distinction) to indicate that the metadata of the first file has undergone a first change.
  • the recipient of the first notification may be the device that manages the metadata of the target file system, may be all devices that share the metadata of the target file system, or may be devices involved in the migration task (the first storage device and/or the second storage device).
  • the migration scheduling device may send the first notification to the global metadata service, and the global metadata service issues the first change.
  • the global metadata service may notify the source device (for example, the first storage device) and the destination device (for example, the second storage device) to read the first changed metadata of the first file, or send the first changed metadata of the first file to the source device and the destination device.
  • in this case, the first changed metadata of the first file is sent by the global metadata service.
  • the migration scheduling device may notify the source device (for example, the first storage device) and the destination device (for example, the second storage device) to read the first changed metadata of the first file, or send the first changed metadata of the first file to the source device and the destination device.
  • the source device and the destination device may receive a notification from the migration scheduling device, or receive the first changed metadata of the first file sent by the migration scheduling device.
  • the method by which the migration scheduling device sends the notification may be a direct sending method or an indirect sending method.
  • in direct sending, the sender sends a message directly to the receiver.
  • messages can be copied multiple times and sent to multiple recipients.
  • indirect sending includes, for example, sending through message queues or forwarding through intermediate devices.
  • messages in the message queue can be read by one or more devices; the sender writes messages to the message queue, and the receiver (the number of receivers can be one or more) reads messages from the message queue, thereby sending and receiving messages.
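Indirect sending through message queues can be sketched with Python's standard `queue` module. The `NotificationBroker` class and the choice of one queue per receiver (so that every receiver sees every message) are assumptions made for this illustration and are not the mechanism specified by the application.

```python
import queue
from typing import Dict


class NotificationBroker:
    """Intermediate component: the sender publishes once, each receiver reads
    from its own queue (one queue per receiver is an assumption)."""

    def __init__(self) -> None:
        self._queues: Dict[str, queue.Queue] = {}

    def subscribe(self, receiver_id: str) -> queue.Queue:
        # Register a receiver and hand back the queue it will read from.
        receiver_queue: queue.Queue = queue.Queue()
        self._queues[receiver_id] = receiver_queue
        return receiver_queue

    def publish(self, message: dict) -> None:
        # Copy the message into every receiver's queue.
        for receiver_queue in self._queues.values():
            receiver_queue.put(dict(message))


broker = NotificationBroker()
source_queue = broker.subscribe("S1")
destination_queue = broker.subscribe("S2")

# The migration scheduling device writes the first notification once.
broker.publish({"file_id": "001.png", "event": "first_change"})

print(source_queue.get_nowait())
print(destination_queue.get_nowait())
```

The sender does not need to know how many receivers exist, which is the main difference from direct sending.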
  • step S803 specifically as follows:
  • Step S803 The first storage device obtains metadata of the first file.
  • the metadata of the first file includes ownership information of the first file and storage layout information of the first file.
  • the ownership information of the first file indicates the device to which the first file belongs
  • the storage layout information of the first file indicates the device where the first file is stored.
  • the storage layout information is also used to indicate a device that stores the data segments of the first file.
  • the ownership information includes the identification of the device to which the file belongs.
  • the ownership information of the first file is the identification of the first storage device, which means that the device to which the file belongs is the first storage device.
  • the storage layout information includes an identification of the device that stores the data of the file.
  • the storage layout information of the first file includes an identification of the first storage device, it means that the data of the file is stored on the first storage device.
  • the file ownership information and/or the file storage layout information can also be reflected in the form of field values.
  • the storage layout information includes multiple fields, and the multiple fields respectively correspond to multiple storage devices.
  • When the value of the field corresponding to a certain storage device is the first value, it means that the data of the first file is stored on that storage device.
  • For example, in "Device S1:1; Device S2:0", the value of field "Device S1" is 1, indicating that the data of the first file is stored on device S1, and the value of field "Device S2" is 0, indicating that device S2 does not store the data of the first file.
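  • A minimal sketch of reading such field values is given below. The textual encoding "Device S1:1; Device S2:0" is taken from the example above; the function name `parse_layout_fields` and the choice of "1" as the first value are assumptions for illustration only.

```python
def parse_layout_fields(text, stored_value="1"):
    """Parse 'Device S1:1; Device S2:0' into {device: stores_data}."""
    layout = {}
    for field in text.split(";"):
        field = field.strip()
        if not field:
            continue
        # Split on the last colon so device names may contain spaces.
        name, _, value = field.rpartition(":")
        layout[name.strip()] = value.strip() == stored_value
    return layout


layout = parse_layout_fields("Device S1:1; Device S2:0")
devices_with_data = [name for name, stored in layout.items() if stored]
```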
  • file data can be stored on multiple storage devices in the form of data segments.
  • Storage layout information is also used to indicate which data segments are stored by each of the multiple storage devices.
  • the storage layout information may include a bitmap of the file to indicate the storage layout on multiple devices.
  • For example, if the data of a file contains 8 data segments, the data segments stored on a certain storage device can be indicated by an 8-bit bitmap.
  • the storage layout information can include: "Device S1: 0x1010 0000; Device S3: 0x1111 1111", where the eight digits after "0x" are read as individual bits.
  • For device S1, the first bit is 1, which means that the first data segment is stored on storage device S1, and the second bit is 0, indicating that the second data segment is not stored on storage device S1, and so on for the remaining bits.
  • For device S3, the first bit is 1, indicating that the first data segment is stored on storage device S3, and the second bit is 1, indicating that the second data segment is stored on storage device S3. The remaining bits are interpreted in the same way.
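  • The bitmap example can be worked through in code. The sketch below assumes that the eight digits after "0x" are individual bits, with the first digit describing the first data segment; the helper names are hypothetical.

```python
def parse_bitmap(text):
    """Turn '0x1010 0000' into a list of booleans, first segment first."""
    if text.startswith("0x"):
        text = text[2:]                       # drop the prefix used in the example
    return [ch == "1" for ch in text if ch in "01"]


def stored_segments(bitmap):
    """1-based numbers of the data segments a device stores."""
    return [i + 1 for i, bit in enumerate(bitmap) if bit]


def missing_segments(source_bitmap, dest_bitmap):
    """Segments the source holds that the destination does not."""
    return [
        i + 1
        for i, (src, dst) in enumerate(zip(source_bitmap, dest_bitmap))
        if src and not dst
    ]


s1 = parse_bitmap("0x1010 0000")
s3 = parse_bitmap("0x1111 1111")
```

  • With these values, device S1 stores segments 1 and 3, and `missing_segments(s3, s1)` lists the segments that S3 could still supply to S1.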
  • the metadata obtained by the first storage device may be the first modified metadata of the first file.
  • the first change may be performed by the migration scheduling device, or may be performed by other devices, such as the first storage device, the second storage device, or other storage devices.
  • step S804 specifically as follows:
  • Step S804 The second storage device obtains metadata of the first file.
  • step S805 specifically as follows:
  • Step S805 The second storage device sends a pull request.
  • the pull request may be sent by the second storage device to the first storage device, instructing the first storage device to push the data of the first file.
  • the pull request may carry the ownership information of the first file and the storage layout information of the first file.
  • the pull request may also be sent to multiple devices in a broadcast or multicast manner, instructing the device that stores the data of the first file to push the data of the first file.
  • the pull request can be sent directly to the recipient, or it can be sent indirectly, such as through a message queue.
  • For details, please refer to the description above of how the migration scheduling device sends the notification.
  • Step S806 The first storage device migrates the data of the first file from the first storage device to the second storage device.
  • when the first storage device determines that the storage device indicated by the ownership information of the first file is the second storage device, and the storage layout information of the first file indicates that the storage devices storing the first file do not include the second storage device and include the first storage device, the data of the first file is migrated from the first storage device to the second storage device.
  • The case where the storage layout information of the first file indicates that the storage devices storing the first file do not include the second storage device and include the first storage device covers the following situation:
  • the storage layout information of the first file indicates that data segments (some or all of the data segments) of the first file are stored on the first storage device, and the second storage device does not store all of the data segments of the first file. That is, when the device to which the first file belongs does not hold the complete data of the file, the device that stores data segments of the first file is triggered to push those data segments to the device to which the first file belongs.
  • the data segments of the first file pushed by the source device include data segments that are not stored on the destination device.
  • the ownership information and storage layout information of the first file may be determined by the first storage device through the metadata of the first file, or may be determined by the first storage device through a pull request.
  • For example, the first storage device obtains the metadata of the first file; if the ownership information of the first file is the identification of the second storage device, and the storage layout information of the first file does not include the identification of the second storage device and includes the identification of the first storage device, the first storage device migrates the data of the first file from the first storage device to the second storage device.
  • the first storage device migrates the data of the first file from the first storage device to the second storage device in response to the pull request.
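  • The trigger condition checked by the source device can be sketched as a small predicate. The `FileMetadata` type, its field names, and the device identifiers "S1" and "S2" are assumptions for illustration; only the condition itself comes from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class FileMetadata:
    file_id: str
    owner: str                                  # ownership information
    layout: set = field(default_factory=set)    # devices that store the data


def should_migrate(meta, local_device):
    """True when local_device holds data that the owning device lacks."""
    return (
        meta.owner != local_device
        and local_device in meta.layout
        and meta.owner not in meta.layout
    )


meta = FileMetadata("inode-60", owner="S2", layout={"S1"})
```

  • Here `should_migrate(meta, "S1")` holds because the file belongs to S2 while its data is only on S1; once S2 appears in the layout, the predicate no longer holds and no further push is triggered.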
  • migrating the data of the first file from the first storage device to the second storage device may include the following situations:
  • Case 1 The first storage device pushes the data of the first file to the shared storage area.
  • This shared storage area is connected to the first storage device and the second storage device, and the second storage device can obtain the data of the first file from the shared storage area.
  • the data segments pulled by the second storage device from the shared storage area are data segments that are not stored on the second storage device. Specifically, when the second storage device pulls a certain data segment of the first file, it first checks whether it already stores the data segment, and if the data segment is not stored on the second storage device, it pulls the data segment from the shared storage area.
  • the first storage device can notify other devices that the data has been pushed to the shared storage area. Notification methods can be implemented in the following ways:
  • Implementation Mode 1 The first storage device makes a second change to the metadata of the first file to trigger the second storage device to obtain the data of the first file from the shared storage area and store it in the second storage device.
  • the second change indicates adding the shared storage area to the storage devices indicated by the storage layout information of the first file.
  • the storage layout information of the first file indicates that the storage device storing the first file includes the first storage device and the shared storage area, and does not include the second storage device.
  • the second storage device obtains the second modified metadata of the file, pulls the data of the first file from the shared storage area, and stores the data in the second storage device.
  • For example, before the second change, the ownership information of the first file is the identification of the second storage device, and the storage layout information of the first file includes the identification of the first storage device and does not include the identification of the second storage device.
  • the first storage device may add the identification of the shared storage area to the storage layout information of the first file.
  • After the second change, the storage layout information of the first file includes the identification of the first storage device and the identification of the shared storage area, and does not include the identification of the second storage device.
  • the second changed metadata of the first file can be obtained by other devices in the following manner: the first storage device applies the second change to the synchronized metadata of the target file system, and other devices synchronize the metadata of the target file system to obtain the second changed metadata of the first file.
  • For example, the first storage device adds a second metadata record at the end of the metadata stream.
  • the second metadata record contains the identifier of the first file and the storage layout information of the first file, and this storage layout information includes the identification of the shared storage area (optionally including the identification of the second storage device).
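  • Appending records to a metadata stream can be sketched as an append-only log that other devices replay. The `MetadataStream` class, the record fields, and the identifier "shared-p1" are hypothetical; this application does not prescribe a record format.

```python
class MetadataStream:
    """Append-only log of metadata records; later records override earlier ones."""

    def __init__(self):
        self.records = []

    def append(self, file_id, **fields):
        self.records.append({"file_id": file_id, **fields})
        return len(self.records) - 1          # position of the new record

    def read_from(self, position):
        """Records a synchronizing device has not applied yet."""
        return self.records[position:]

    def current(self, file_id):
        """Fold all records for one file into its current metadata."""
        state = {}
        for record in self.records:
            if record["file_id"] == file_id:
                state.update(record)
        return state


stream = MetadataStream()
stream.append("inode-60", owner="S2", layout=["S1"])             # first change
second = stream.append("inode-60", layout=["S1", "shared-p1"])   # second change
```

  • A device that last synchronized before the second change would call `stream.read_from(second)` and see only the new layout record.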
  • Implementation Mode 2 The first storage device sends a push notification to the second storage device to notify it that the data has been pushed to the shared storage area.
  • Case 2 The first storage device pushes the data of the first file to the second storage device, and accordingly, the second storage device pulls the data of the first file from the first storage device.
  • the first storage device can directly push the data of the first file to the second storage device without going through an intermediate device, which improves the efficiency of data migration.
  • the first storage device can push the data of the first file to the second storage device.
  • the data of the first file is stored in the form of multiple data segments.
  • the data segments pulled by the second storage device from the first storage device are data segments that are not stored on the second storage device.
  • Data can also be pushed segment by segment.
  • the third change is performed on the metadata of the first file.
  • the third change indicates adding the second storage device to the storage devices indicated by the storage layout information of the first file.
  • For example, each time the second storage device pulls one data segment, the second storage device can change the bitmap of the file corresponding to the second storage device to indicate that the data segment has been stored on the second storage device.
  • the second storage device after storing the data of the first file, the second storage device sends a notification indicating that the second storage device has acquired the data of the first file.
  • the notification may be sent to the first storage device and/or the migration scheduling device, or may be sent by broadcast.
  • the first storage device can delete the data of the first file locally.
  • when the ownership information of the first file is the second storage device and the storage layout information of the first file includes the identification of the first storage device and the identification of the second storage device, the first storage device deletes the data of the first file on the first storage device.
  • Deleting the data of the first file on the first storage device avoids multiple storage devices repeatedly storing the data of a certain file, which frees up storage space and optimizes overall storage costs.
  • the first storage device deletes the local data segments of the first file. This can avoid the problem that the file data is damaged due to errors during the data migration process, making the file data no longer complete.
  • the first storage device may perform a fourth change operation on the metadata of the first file, and the fourth change operation indicates deleting the identification of the first storage device from the storage layout information of the first file.
  • the data of the first file may be marked as deletable, so that the deletion operation is performed once the data of the first file reaches a state in which it can be deleted. For example, when the data of the first file is in use and it is inconvenient to perform a deletion operation immediately, the data of the first file can first be marked and then deleted after the file is no longer in use.
  • For another example, before deleting a file, the file is marked as deletable. Files marked as deletable cannot be accessed through normal means, and they are deleted together when the preset conditions are met.
  • the preset conditions here may be that the time marked as deletable reaches the preset length, the data marked as deletable reaches the preset size, etc. This can facilitate the user to retrieve the data of the first file on the first storage device, reduce data loss caused by misoperation, and improve user experience.
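  • The mark-then-delete behavior can be sketched as follows, using elapsed time as the preset condition. The class `DeferredDeleter`, its method names, and the use of caller-supplied timestamps are assumptions made for the example.

```python
class DeferredDeleter:
    """Marks local data as deletable and reclaims it once a grace period ends."""

    def __init__(self, grace_seconds):
        self.grace_seconds = grace_seconds
        self.data = {}          # file_id -> bytes held locally
        self.marked_at = {}     # file_id -> time the file was marked deletable

    def mark_deletable(self, file_id, now):
        if file_id in self.data:
            self.marked_at[file_id] = now

    def read(self, file_id):
        """Normal access path: files marked as deletable are hidden."""
        if file_id in self.marked_at:
            raise KeyError(file_id)
        return self.data[file_id]

    def restore(self, file_id):
        """Undo a mistaken mark before the data is reclaimed."""
        self.marked_at.pop(file_id, None)

    def sweep(self, now):
        """Delete every marked file whose grace period has elapsed."""
        expired = [f for f, t in self.marked_at.items()
                   if now - t >= self.grace_seconds]
        for file_id in expired:
            del self.data[file_id]
            del self.marked_at[file_id]
        return expired
```

  • Until `sweep` runs after the grace period, `restore` can bring the data back, which corresponds to retrieving data that was marked by mistake.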
  • the migration scheduling device can determine the execution progress of the task based on the file's ownership information and storage layout information.
  • the process of triggering migration, data migration and deletion of local data may change the metadata of the file.
  • By monitoring changes to the metadata of the file, especially changes to the file's ownership information and storage layout information, the execution progress of the task can be determined, which helps the user understand the execution status of the task.
  • tasks that progress slowly or fail can be processed as soon as possible to improve system stability.
  • Figure 9 is a schematic diagram of a change of metadata of a first file provided by an embodiment of the present application.
  • metadata changes can include several stages:
  • Stage (1) metadata before data migration is triggered. Metadata 901 is the metadata of the first file before data migration is triggered, including the identification of the first file (inode is 60), the ownership information of the first file (i.e., ownership metadata) and the storage layout information of the first file (i.e., layout metadata). It can be seen that the device to which the first file belongs is device S1, and the data of the first file exists in device S1.
  • the metadata 901 also includes the inode (represented as pinode) of the parent node of the first file.
  • Stage (2) metadata after triggering migration and before data migration.
  • Metadata 901 can be modified, and the modified metadata of the first file is metadata 902. It can be seen that the home device of the first file is changed to device S2. Since the data of the first file does not exist on device S2, migration from device S1 to device S2 is triggered.
  • Stage (3) metadata after data migration and before space is released. After the data of the first file is migrated from device S1 to device S2, the layout metadata of the first file needs to be updated accordingly. It can be seen from the metadata 903 that the data of the first file already exists on the device S2.
  • Stage (4) metadata before and after releasing storage space.
  • the data of the first file stored on the device S1 can be deleted.
  • the storage layout information of the first file also needs to be updated accordingly.
  • the file's layout metadata indicates that no data of the first file exists on device S1.
  • data migration can include more or fewer metadata change stages, or the attribution metadata and layout metadata in the metadata can also have other designs.
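  • The four stages can be written as successive metadata values. This is a sketch under stated assumptions: the dictionary keys and the function names are hypothetical, and the name `m904` for the stage (4) metadata is chosen only for the example.

```python
def trigger_migration(meta, destination):
    """Stage (2): change only the ownership information."""
    return {**meta, "owner": destination}


def record_data_arrival(meta, device):
    """Stage (3): the layout now also lists the destination device."""
    return {**meta, "layout": meta["layout"] | {device}}


def release_space(meta, device):
    """Stage (4): the layout no longer lists the source device."""
    return {**meta, "layout": meta["layout"] - {device}}


m901 = {"inode": 60, "owner": "S1", "layout": frozenset({"S1"})}
m902 = trigger_migration(m901, "S2")
m903 = record_data_arrival(m902, "S2")
m904 = release_space(m903, "S1")
```

  • Each function returns a new value instead of mutating its input, so every intermediate stage remains available for inspection, for example when checking progress.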
  • the first storage device and/or the second storage device may generate a local file view.
  • a local file is a file whose ownership information indicates the device itself and/or whose storage layout information indicates the device itself.
  • the second storage device provides a local file view of the second storage device, and the local file view indicates a hierarchical structure of multiple files stored on the second storage device.
  • the storage layout information of the plurality of files indicates the second storage device, and/or the ownership information of the plurality of files indicates the second storage device.
  • storage device S1 can provide a local file view.
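  • One way to build such a local file view is sketched below, grouping local files under their parent inode. The record fields (`inode`, `pinode`, `owner`, `layout`) follow the metadata described for Figure 9, but the sample values and the function name are assumptions.

```python
def local_file_view(files, device):
    """Return {parent_inode: [child_inode, ...]} for files local to device.

    A file counts as local when it belongs to the device and/or the
    device appears in its storage layout information.
    """
    view = {}
    for f in files:
        if f["owner"] == device or device in f["layout"]:
            view.setdefault(f["pinode"], []).append(f["inode"])
    return view


files = [
    {"inode": 60, "pinode": 1, "owner": "S2", "layout": {"S1"}},
    {"inode": 61, "pinode": 1, "owner": "S1", "layout": {"S1"}},
    {"inode": 62, "pinode": 1, "owner": "S3", "layout": {"S3"}},
]
```

  • In this sample, the file with inode 60 appears in the views of both S1 (which stores its data) and S2 (to which it belongs).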
  • data migration of the file is triggered by changing the metadata of the file.
  • For example, by changing the ownership of the first file from the first storage device to the second storage device, the data of the first file is migrated from the first storage device to the second storage device.
  • the migration scheduling device, source device and destination device can trigger the migration of the file based on changes to the metadata of the file (such as changes to the file's ownership, storage layout and other metadata), and the progress of the file migration can also be reflected by the status of the file's metadata.
  • This method can improve the efficiency of data migration, improve the convenience of data use and management, and also achieve the decoupling of various devices during the migration process, greatly improving the flexibility and scalability of the business system.
  • the storage location of the file data is still indicated by the storage layout information. Therefore, the above data migration method does not affect the normal use of the file data by users and improves the stability of the business system.
  • Figure 10 is a schematic flowchart of yet another data migration method provided by an embodiment of the present application. Optionally, this method can be applied to the aforementioned data migration system, such as the data migration system shown in Figure 1, Figure 3 or Figure 5.
  • the data migration method shown in Figure 10 may include one or more steps from step S1001 to step S1008. It should be understood that this application describes the sequence of S1001 to S1008 for convenience of description, and is not intended to limit execution to the above sequence. The embodiments of the present application do not limit the execution sequence, execution time, number of executions, etc. of one or more of the above steps. The details of steps S1001 to S1008 are as follows:
  • Step S1001 The migration scheduling device determines the migration task for the first file.
  • Step S1002 The migration scheduling device performs the first change on the metadata of the first file.
  • the first change indicates that the storage device to which the first file belongs is changed from the first storage device to the second storage device.
  • the first change instruction is to change the ownership information of the first file from the identification of the first storage device to the identification of the second storage device.
  • Figure 11 is a schematic diagram of yet another change of metadata of a first file provided by an embodiment of the present application. Metadata 1101 of the first file undergoes the first modification to obtain metadata 1102.
  • the ownership information of the first file includes the identification of the second storage device.
  • the first storage device may obtain the change of the ownership information of the file.
  • the first storage device may obtain the first modified metadata of the first file.
  • the migration scheduling device appends metadata 1102 to the metadata stream, and accordingly, the first storage device can obtain the record appended to the metadata stream.
  • the data migration method shown in Figure 10 also includes step S1003, as follows:
  • Step S1003 The migration scheduling device determines the migration progress based on the ownership information and storage layout information of the first file.
  • For example, when the storage device indicated by the ownership information of the first file is the second storage device, but the storage layout information of the first file indicates that the storage devices storing the first file do not include the second storage device but include the first storage device, the migration progress is: not started.
  • when the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the shared storage area but do not include the second storage device, the migration progress is: the source device has pushed data.
  • when the storage layout information of the first file indicates that the storage devices storing the first file include the second storage device, the migration progress is: the target device has pulled data.
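  • Determining progress from ownership and layout information can be sketched as a lookup over the layout. The labels "not triggered", "source data deleted" and "unknown" are additions made for completeness of the example and are not named in this application; the function name and parameters are likewise hypothetical.

```python
def migration_progress(owner, layout, source, destination, shared_area):
    """Map ownership and layout information to a coarse progress label."""
    if owner != destination:
        return "not triggered"          # ownership has not been changed yet
    if destination in layout:
        if source in layout:
            return "target device has pulled data"
        return "source data deleted"    # source no longer listed in the layout
    if shared_area in layout:
        return "source device has pushed data"
    if source in layout:
        return "not started"
    return "unknown"
```

  • For example, `migration_progress("S2", {"S1", "p1"}, "S1", "S2", "p1")` falls in the "source device has pushed data" branch because the shared storage area is listed while the destination is not.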
  • Figure 11 is a schematic diagram of a change of metadata of a first file provided by an embodiment of the present application.
  • Metadata 1101 is the metadata of the first file before the migration is triggered
  • metadata 1102 is the metadata of the first file after the migration is triggered
  • metadata 1103 is the metadata after the source device pushes the data of the first file
  • metadata 1104 is the metadata after the target device pulls the data of the first file
  • the metadata 1105 is the metadata after the source device deletes the local data of the first file.
  • Step S1004 The first storage device pushes the data of the first file to the shared storage area.
  • the shared storage area is an intermediate device that provides storage space.
  • the shared storage area is connected to the first storage device, and the first storage device can push data to the shared storage area.
  • the shared storage area can also be connected to a second storage device, and the second storage device pulls data from the shared storage area.
  • shared storage can be provided by the global data service.
  • the shared storage area is provided by a third-party temporary storage device or intermediate device.
  • the ownership change of the first file is synchronized to the device where the data of the first file is located (i.e., the first storage device).
  • if the first storage device detects that the first file does not belong to this device, and the layout metadata shows that the data is local but not on the home device (that is, the second storage device), the source device pushes (or publishes) the data of the first file to the shared storage area.
  • the local data of the first file may not be deleted from the source device to reduce the risk of data loss of the first file due to failure in subsequent steps and improve system stability.
  • the data of the first file may already be stored in the shared storage area. To avoid repeated pushing of data, when the first storage device detects that the first file does not belong to this device and the layout metadata shows that the data is local but is neither on the home device (i.e., the second storage device) nor in the shared storage area, it pushes (or publishes) the data of the first file.
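  • The push-once check, together with the layout update that follows it, can be sketched as below. Stores are modeled as plain dictionaries and the metadata as a dictionary with a mutable layout set; these representations and the function name `maybe_push` are assumptions for illustration.

```python
def maybe_push(meta, local_device, shared_area, local_store, shared_store):
    """Push once: only if neither the owner nor the shared area has the data."""
    layout = meta["layout"]
    needs_push = (
        meta["owner"] != local_device
        and local_device in layout
        and meta["owner"] not in layout
        and shared_area not in layout
    )
    if not needs_push:
        return False
    # Copy the data; the local copy is kept until a later deletion step.
    shared_store[meta["file_id"]] = local_store[meta["file_id"]]
    # Record the shared storage area in the layout (the second change).
    layout.add(shared_area)
    return True
```

  • Calling `maybe_push` a second time with the same metadata returns `False`, because the shared storage area is already listed in the layout.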
  • Step S1005 The first storage device performs a second change on the metadata of the first file.
  • the second change indicates adding a shared storage area to the storage devices indicated by the storage layout information of the first file.
  • the storage layout information of the first file indicates that the storage device storing the first file includes the first storage device and the shared storage area, and does not include the second storage device.
  • the first storage device adds the identification of the shared storage area to the storage layout information of the first file.
  • the metadata 1103 after the second change includes the identification of the shared storage area (ie: shared storage area p1).
  • the second storage device may obtain the second modified metadata of the first file.
  • Step S1006 The second storage device pulls the data of the first file from the shared storage area.
  • Step S1007 The second storage device performs a third change on the metadata of the first file.
  • the third change indicates adding the second storage device to the storage devices indicated by the storage layout information of the first file.
  • the second storage device adds the identification of the second storage device to the storage layout information of the first file.
  • the metadata 1104 after the third change includes the identification of device S2 (ie: device S2).
  • the third change may also indicate deleting the shared storage area in the storage device indicated by the storage layout information of the first file.
  • the second storage device may delete the identification of the shared storage area in the storage layout information of the first file.
  • the device that provides the shared storage area can also change the metadata of the first file to delete the shared storage area in the storage device indicated by the storage layout information of the first file.
  • the third modified metadata of the first file may be synchronized to the first storage device and/or the migration scheduling device. For example, synchronization is performed through the metadata of the target file system, or the second storage device sends the third modified metadata of the first file to the first storage device. For related description, reference may be made to the manner of synchronizing the first change and the second change in step S802 and step S806.
  • Step S1008 The first storage device deletes the data of the first file stored on the first storage device.
  • the first storage device can delete the local data of the first file to free up storage space and reduce storage costs.
  • when the ownership information of the first file is the second storage device and the storage layout information of the first file includes the identification of the first storage device, the first storage device deletes the data of the first file on the first storage device.
  • the first storage device may synchronize the third modified metadata of the first file, thereby determining that the ownership information of the first file is the second storage device and the storage layout information of the first file includes the identifier of the first storage device.
  • Step S1009 The first storage device performs a fourth change on the metadata of the first file.
  • the first storage device deletes the identification of the first storage device from the storage layout information of the first file.
  • the metadata 1105 after the fourth modification does not include the identification of the device S1.
  • the migration scheduling device, the source device, and the destination device can implement data migration by changing the metadata of the file, and the progress of the data migration can also be reflected by the metadata of the file.
  • the efficiency of data migration can be improved, and the convenience of data use and management can be improved.
  • the embodiments of the present application can not only improve the efficiency of data migration, but also realize the decoupling of various devices during the migration process, greatly improving the flexibility and scalability of the business system.
  • Figure 12 is a schematic flow chart of yet another data migration method provided by an embodiment of the present application.
  • this method can be applied to the aforementioned data migration system, such as the data migration system shown in Figure 1, Figure 3 or Figure 5.
  • the data migration method shown in Figure 12 may include one or more steps from step S1201 to step S1208. It should be understood that this application describes these steps in sequence for convenience of description, and is not intended to limit execution to the above sequence. The embodiments of the present application do not limit the execution sequence, execution time, number of executions, etc. of one or more of the above steps. The details of the steps are as follows:
  • Step S1201 The migration scheduling device determines the migration task for the first file.
  • Step S1202 The migration scheduling device performs the first change on the metadata of the first file.
  • the data migration method shown in Figure 12 also includes step S1203, as follows:
  • Step S1203 The migration scheduling device determines the migration progress based on the ownership information and storage layout information of the first file.
  • Step S1204 The second storage device obtains metadata of the first file.
  • the metadata of the first file obtained by the second storage device is the metadata after the first change.
  • Step S1205 The second storage device sends a pull request.
  • after the ownership change of the first file is synchronized to the device to which the first file belongs (i.e., the second storage device), the second storage device detects that the first file belongs to this device and that the layout metadata shows the data is not local; the second storage device then sends a pull request, causing the device that stores the data of the first file to push the data of the first file.
  • the pull request includes the identifier of the second storage device and the identifier of the first file.
  • it also includes ownership information of the first file and/or storage layout information of the first file.
  • pull request can be sent directly or indirectly.
  • For example, the second storage device writes the pull request to a broadcast message queue, and other devices (such as the first storage device) can read the pull request from the message queue.
  • For another example, the second storage device may send the pull request to the first storage device so that the first storage device pushes the data of the first file.
  • when the second storage device detects that the first file belongs to this device and the layout metadata shows that the data is neither local nor in the shared storage area, it sends a pull request.
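  • The pull request exchange can be sketched as two functions, one on the requesting side and one on the receiving side. The message fields, the function names, and the identifiers "S1", "S2" and "p1" are assumptions; only the conditions (the requester owns the file, and the data is neither local nor in the shared storage area) come from the description above.

```python
def build_pull_request(meta, local_device, shared_area):
    """Return a pull request if this device owns the file but lacks its data."""
    layout = meta["layout"]
    if meta["owner"] != local_device:
        return None
    if local_device in layout or shared_area in layout:
        return None          # data is already local, or has already been pushed
    return {
        "type": "pull_request",
        "requester": local_device,
        "file_id": meta["file_id"],
        "owner": meta["owner"],          # optional ownership information
        "layout": sorted(layout),        # optional storage layout information
    }


def handle_pull_request(request, local_device):
    """A receiving device decides whether it should push the data."""
    return (
        local_device in request["layout"]
        and request["requester"] not in request["layout"]
    )
```

  • When the request is broadcast, every device can run `handle_pull_request`, and only devices listed in the layout respond by pushing data.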
  • Step S1206 The first storage device pushes the data of the first file to the shared storage area.
  • the first storage device receives the pull request.
  • the first storage device pushes the data of the first file to the shared storage area.
  • the pull request includes ownership information of the first file and storage layout information of the first file.
  • if the first storage device determines that the first file does not belong to this device, and the layout metadata shows that the data is local but not on the home device (i.e., the second storage device), the first storage device pushes (or publishes) the data of the first file to the shared storage area.
  • Step S1207 The first storage device performs a second change on the metadata of the first file.
  • Step S1208 The second storage device pulls the data of the first file from the shared storage area.
  • Step S1209 The second storage device performs a third change on the metadata of the first file.
  • Step S1210 The first storage device deletes the data of the first file stored on the first storage device.
  • step S1008 For related description, please refer to step S1008.
  • Step S1211 The first storage device performs a fourth change on the metadata of the first file.
  • the destination device for data migration actively sends a pull request, which can solve the problem of data migration failure caused by the destination device being offline or failing, and improve the success rate of data migration.
  • the multiple devices provided by the embodiments of the present application include corresponding hardware structures, software units, or a combination of hardware structures and software structures, etc.
  • the devices and modules in the devices can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving the hardware depends on the specific application and design constraints of the technical solution. Professionals and technicians can use different device implementations to implement the foregoing method embodiments in different usage scenarios, and different device implementations should not be considered to be beyond the scope of the embodiments of this application.
  • FIG 13 is a schematic structural diagram of a migration scheduling device 130 provided by an embodiment of the present application.
  • the migration scheduling device 130 may include a task determination module 1301 and a metadata update module 1302.
  • the migration scheduling device 130 is used to implement the aforementioned data migration method, such as the data migration method in the embodiment shown in Figure 8, Figure 10 or Figure 12.
  • the task determination module 1301 is used to determine a migration task for the first file, where the data of the first file is stored on the first storage device, and the migration task for the first file instructs to migrate the data of the first file from the first storage device to the second storage device;
  • the metadata update module 1302 is configured to make a first change to the metadata of the first file to trigger execution of the migration task for the first file; wherein the first change indicates that the home storage device of the first file is changed from the first storage device to the second storage device.
  • the metadata of the first file includes ownership information of the first file and storage layout information of the first file.
  • the storage device indicated by the ownership information of the first file is the first storage device, and the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and do not include the second storage device.
  • the ownership information of the first file is the identification of the first storage device
  • the storage layout information of the first file includes the identification of the first storage device and does not include the identification of the second storage device.
  • the metadata update module 1302 is configured to change the ownership information of the first file from the identification of the first storage device to the identification of the second storage device.
  • the migration task for the first file includes an identifier of the first file, an identifier of the first storage device, and an identifier of the second storage device.
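  • The first change described above can be illustrated with a minimal sketch. The names `FileMetadata`, `MigrationTask`, and `apply_first_change`, and the device identifiers `dev-A` and `dev-B`, are hypothetical and do not come from the application; the sketch only assumes that the ownership information is a single device identifier and that the storage layout information is a set of device identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class FileMetadata:
    file_id: str
    owner: str                                 # ownership information (home storage device)
    layout: set = field(default_factory=set)   # storage layout information (devices holding data)


@dataclass(frozen=True)
class MigrationTask:
    file_id: str   # identifier of the first file
    source: str    # identifier of the first storage device
    target: str    # identifier of the second storage device


def apply_first_change(meta: FileMetadata, task: MigrationTask) -> None:
    """Change ownership from source to target; the layout is left untouched."""
    if meta.file_id != task.file_id:
        raise ValueError("task does not refer to this file")
    if meta.owner != task.source:
        raise ValueError("file is not owned by the source device")
    meta.owner = task.target


meta = FileMetadata("file-1", owner="dev-A", layout={"dev-A"})
task = MigrationTask("file-1", source="dev-A", target="dev-B")
apply_first_change(meta, task)
print(meta)
```

  • In this sketch the mismatch between the new owner and the unchanged layout is the signal that a migration is pending; no data is moved by the scheduler itself.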
  • the first file belongs to the target file system
  • the metadata of the first file is included in the metadata of the target file system
  • the metadata of the target file system is synchronized among multiple devices.
  • the plurality of devices include a first computing device and a second computing device.
  • the first computing device is located in a first storage device or is connected to the first storage device.
  • the second computing device is located in a second storage device or is connected to the second storage device.
  • the plurality of devices mentioned above may also include a migration scheduling device.
  • the migration scheduling device 130 further includes a communication module 1303, which is configured to send a first notification indicating that the metadata of the first file has changed.
  • the first computing device or the second computing device obtains the metadata of the first file after the first change according to the first notification, and performs the migration task for the first file according to the metadata after the first change.
  • the first notification indicates what changes have occurred in the metadata of the first file.
  • the first notification may include the content of the first change, and/or the first notification may include the metadata of the first file after the first change.
  • the first notification indicates that the first change has occurred, but does not include the specific content of the change and the metadata after the first change of the first file.
  • the migration scheduling device further includes a task monitoring module 1304, which is used to:
  • the metadata of the target file system is stored locally on multiple devices.
  • the metadata of the target file system is stored in a global metadata service.
  • the global metadata service can store metadata of the target file system.
  • the global metadata service can support access and update of metadata of the target file system.
  • the migration scheduling device 130 further includes a communication module 1303, which is used to:
  • a second notification is received indicating that metadata of the first file has changed.
  • the task monitoring module 1304 is also used to:
  • the second notification includes the second changed content, or the second notification includes the second changed metadata of the first file.
  • the notification (such as the first notification, or the second notification, etc.) may be sent in the form of a message queue.
  • the sender writes messages to the message queue, and the receiver receives notifications by reading the message queue, thereby further reducing the coupling between different functional modules.
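  • The two notification styles (a notification that carries the changed metadata, and a bare notification after which the receiver fetches the metadata itself) can be sketched with an in-process queue. This is an illustration only: `publish_change`, `consume_change`, and the `store` dictionary are hypothetical, and a real deployment would presumably use a networked message queue rather than Python's `queue.Queue`.

```python
import queue

notifications = queue.Queue()


def publish_change(file_id, changed_metadata=None):
    """Sender side: write a notification to the message queue."""
    note = {"file_id": file_id}
    if changed_metadata is not None:
        note["metadata"] = dict(changed_metadata)  # notification carries the changed metadata
    notifications.put(note)


def consume_change(fetch_metadata):
    """Receiver side: read one notification and resolve the changed metadata."""
    note = notifications.get_nowait()
    if "metadata" in note:
        return note["metadata"]
    # Bare notification: ask a metadata source for the current state.
    return fetch_metadata(note["file_id"])


# Hypothetical metadata source standing in for a peer device or a global metadata service.
store = {"file-1": {"owner": "dev-B", "layout": ["dev-A"]}}

publish_change("file-1", store["file-1"])  # notification with content
publish_change("file-1")                    # notification without content
first = consume_change(store.__getitem__)
second = consume_change(store.__getitem__)
print(first, second)
```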
  • the communication module 1303 is further configured to: send, to the first computing device or the second computing device, a request for obtaining the changed metadata of the first file;
  • the task monitoring module 1304 is also configured to obtain the second changed metadata of the first file according to the response of the first computing device or the second computing device to the request.
  • the task monitoring module 1304 is also configured to obtain the second changed metadata of the first file from the global metadata service.
  • the global metadata service provides a service interface
  • the migration scheduling device can call the service interface to implement access and update of metadata
  • the service interface is a communication interface, such as an application programming interface (API), which can be used to interact with data and provide services between different functional modules.
  • the metadata update module 1302 is also used to:
  • the first change is implemented through a service interface provided by the global metadata service.
  • the global metadata service is located on any one of the multiple devices, or on any device outside the multiple devices.
  • the global metadata service is located on a third computing device.
  • the third computing device may be the same computing device as the first computing device or the second computing device, or may be another computing device other than the two.
  • the service interface of the global metadata service may be provided by the third computing device to the migration scheduling device.
  • the third computing device may provide another interface (referred to as a first interface for ease of differentiation) to the migration scheduling apparatus, and by calling the first interface, the function of calling the service interface of the global metadata service can be implemented.
  • the metadata of the target file system is in a table structure and the metadata can be modified.
  • the tabular structure is a data structure containing rows and columns. Each row (or each column) contains multiple values, and each value corresponds to a field.
  • Metadata with a tabular structure supports adding metadata, deleting metadata, and modifying existing metadata. That is, the first change can be implemented by modifying the metadata of the target file system in place.
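  • For tabular metadata, the first change amounts to an in-place update of one row. The sketch below uses Python's built-in `sqlite3` module purely for illustration; the table name `file_meta`, its columns, and the device identifiers are assumptions and are not specified by the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_meta (file_id TEXT PRIMARY KEY, owner TEXT, layout TEXT)"
)
conn.execute("INSERT INTO file_meta VALUES (?, ?, ?)", ("file-1", "dev-A", "dev-A"))

# First change: rewrite the ownership field of the existing row.
# The extra "owner = ?" condition makes the update a no-op if ownership already moved.
conn.execute(
    "UPDATE file_meta SET owner = ? WHERE file_id = ? AND owner = ?",
    ("dev-B", "file-1", "dev-A"),
)
conn.commit()

row = conn.execute(
    "SELECT owner, layout FROM file_meta WHERE file_id = ?", ("file-1",)
).fetchone()
print(row)
```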
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records, and each metadata record includes an identifier of a node and an attribute of the node, where a node is a file or directory, and the attributes of the node include the node's ownership information and the node's storage layout information.
  • the metadata update module 1302 is also used to:
  • a first metadata record is appended to the end of the metadata of the target file system.
  • the first metadata record includes the identification of the first file and the changed ownership information of the first file.
  • the changed ownership information of the file indicates that the storage device to which the first file belongs is the second storage device.
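  • For streaming metadata, a change is expressed by appending a record instead of rewriting an existing one. The sketch below assumes, as an illustration only, that a later record overrides the attributes of earlier records for the same node; the helpers `append_record` and `current_attributes` are hypothetical.

```python
def append_record(log, node_id, **attrs):
    """Append one metadata record (node identifier plus attributes) to the end of the log."""
    log.append({"id": node_id, **attrs})


def current_attributes(log, node_id):
    """Replay the log in order; later records override earlier ones (an assumption)."""
    merged = {}
    for record in log:
        if record["id"] == node_id:
            merged.update({k: v for k, v in record.items() if k != "id"})
    return merged


log = []
append_record(log, "file-1", owner="dev-A", layout=["dev-A"])
# First change: a new record carrying the changed ownership information.
append_record(log, "file-1", owner="dev-B")

state = current_attributes(log, "file-1")
print(state)
```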
  • the task determination module 1301 is also used to:
  • the external event information includes one or more of the following information: network connection status, device health status, or the personnel transfer status related to the first file.
  • the task determination module 1301 is also used to:
  • the migration task for the first file is determined according to the analysis result of the metadata of the first file; wherein the analysis result includes one or more of the following information: the hot and cold status of the first file, the security of the first file, or the business related to the first file.
  • the task determination module 1301 is also used to:
  • the migration task for the first file is determined according to the migration instruction for the first file input by the user.
  • the task determination module 1301 is also used to determine the migration task for the second file
  • the migration scheduling device 130 also includes a task orchestration module 1305, which is configured to orchestrate the execution sequence of the migration task for the first file and the migration task for the second file.
  • orchestrating tasks can include determining the execution order, execution priority, etc. of tasks.
  • multiple tasks can be merged during the task orchestration process.
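  • One possible reading of task orchestration is sketched below: tasks that describe the same migration are merged, and the remaining tasks are ordered by priority. The `priority` field, the merge rule (keep the highest priority), and the function `orchestrate` are assumptions made for the example; the application does not prescribe a particular policy.

```python
def orchestrate(tasks):
    """Merge duplicate migration tasks and order the result by descending priority."""
    merged = {}
    for task in tasks:
        key = (task["file_id"], task["source"], task["target"])
        # Merge rule (assumed): identical migrations collapse into the highest-priority one.
        if key not in merged or task["priority"] > merged[key]["priority"]:
            merged[key] = task
    return sorted(merged.values(), key=lambda t: -t["priority"])


tasks = [
    {"file_id": "file-1", "source": "dev-A", "target": "dev-B", "priority": 1},
    {"file_id": "file-2", "source": "dev-A", "target": "dev-B", "priority": 5},
    {"file_id": "file-1", "source": "dev-A", "target": "dev-B", "priority": 3},
]
plan = orchestrate(tasks)
print([(t["file_id"], t["priority"]) for t in plan])
```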
  • Figure 14 is a schematic structural diagram of a computing device 140 provided by an embodiment of the present application.
  • the computing device 140 is used to implement the aforementioned data migration method, such as the data migration method in the embodiment shown in FIG. 8, FIG. 10, or FIG. 12.
  • the computing device 140 includes the storage device, computing device, etc. in the previous embodiments, such as the first storage device or the second storage device in the embodiment shown in FIG. 1, FIG. 2, FIG. 3, FIG. 8, FIG. 10 or FIG. 12.
  • Another example is the storage device S1, the storage device S2, and/or the storage device S3 in FIG. 4.
  • the computing device 140 is an independent device that can be connected to the aforementioned storage device or computing device.
  • the computing device 140 may include a metadata acquisition module 1401 and a migration module 1402.
  • the computing device 140 is used to implement the method on the first storage device side in the embodiment shown in FIG. 8 or FIG. 10 .
  • the metadata acquisition module 1401 is used to acquire the metadata of the first file.
  • the metadata of the first file includes the ownership information of the first file and the storage layout information of the first file;
  • the migration module 1402 is configured to: when it is determined that the storage device indicated by the ownership information of the first file is the second storage device, and the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and do not include the second storage device, migrate the data of the first file from the first storage device to the second storage device.
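  • The condition evaluated by the migration module on the first storage device side can be written as a small predicate. The function name `should_push` and the dictionary layout of the metadata are hypothetical; only the condition itself (owner is another device, local device holds the data, owner does not yet hold it) is taken from the description above.

```python
def should_push(meta: dict, local_device: str) -> bool:
    """True when the local device holds data that now belongs to another device."""
    owner = meta["owner"]
    layout = set(meta["layout"])
    return owner != local_device and local_device in layout and owner not in layout


# Before the first change: the file is owned and stored locally, nothing to do.
before = should_push({"owner": "dev-A", "layout": ["dev-A"]}, "dev-A")
# After the first change: ownership moved to dev-B, data is still only on dev-A.
pending = should_push({"owner": "dev-B", "layout": ["dev-A"]}, "dev-A")
# After the data reached dev-B: no further push is needed.
done = should_push({"owner": "dev-B", "layout": ["dev-A", "dev-B"]}, "dev-A")
print(before, pending, done)
```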
  • the first file belongs to the target file system
  • the metadata of the first file is included in the metadata of the target file system
  • the metadata of the target file system is synchronized among multiple devices, the plurality of devices including the computing device or the first computing device where the computing device is located.
  • the metadata of the target file system is stored in a global metadata service and synchronized among the multiple devices through the global metadata service;
  • the metadata acquisition module 1401 is also used to:
  • the migration module 1402 is also used to:
  • the second computing device is connected to the second storage device, and the second computing device is located in the second storage device or connected to the second storage device;
  • the computing device 140 also includes a metadata update module 1403.
  • the metadata update module 1403 is configured to perform a first change on the metadata of the first file to trigger the second computing device to obtain the data of the first file from the shared storage area and store it in the second storage device; the first change indicates that the shared storage area is added to the storage devices indicated by the storage layout information of the first file; after the first change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the shared storage area, and do not include the second storage device.
  • the ownership information of the first file is an identification of the second storage device, and before the first change, the storage layout information of the first file includes the identification of the first storage device and does not include the identification of the second storage device;
  • the metadata update module 1403 is also used to:
  • the computing device 140 further includes a communication module 1404, and the communication module 1404 is used to:
  • a first notification is received, the first notification indicating that metadata of the first file has changed.
  • the computing device 140 further includes a communication module 1404, and the communication module 1404 is used to:
  • a second notification is sent, the second notification indicating that the metadata of the first file has changed.
  • the metadata acquisition module 1401 is also used to:
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records, and each metadata record includes an identifier of a node and an attribute of the node, where the node is a file or directory, and the attributes of the node include the ownership information of the node and the storage layout information of the node;
  • the metadata update module 1403 is also configured to append a first metadata record at the end of the metadata of the target file system, where the first metadata record includes the identifier of the first file and the storage layout information of the first file, and the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the shared storage area.
  • the computing device 140 further includes a deletion control module 1405, and the deletion control module 1405 is used to:
  • When the metadata of the first file after a second change is obtained, the data of the first file on the first storage device is deleted; the second change modifies the storage layout information of the first file, and after the second change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the second storage device;
  • the metadata update module 1403 is further configured to perform a third change on the metadata of the first file, where the third change indicates that the first storage device is deleted from the storage devices indicated by the storage layout information of the first file.
  • deletion control module 1405 is also used to:
  • the migration module 1402 is also used to:
  • push the data of the first file to a second computing device.
  • the computing device 140 further includes a communication module 1404, and the communication module 1404 is used to:
  • a pull request for the first file is received from the second computing device.
  • the computing device 140 further includes a view providing module 1406, which is configured to provide a home file view of the first storage device, where the home file view includes information about multiple files, and the ownership information of the multiple files indicates the first storage device.
  • the files belonging to the first storage device and the files belonging to the second storage device belong to the global file system.
  • the computing device 140 also includes a view providing module 1406 for providing a global file view that includes information about files belonging to the first storage device and information about files belonging to the second storage device.
  • the computing device 140 may include a metadata acquisition module 1401 and a migration module 1402.
  • the computing device 140 is used to implement the method on the second storage device side in the embodiment shown in FIG. 8, FIG. 10 or FIG. 12.
  • the metadata acquisition module 1401 is configured to acquire metadata of a first file, where the metadata of the first file includes ownership information of the first file and storage layout information of the first file;
  • the migration module 1402 is configured to: when the ownership information of the first file indicates that the storage device to which the first file belongs is the second storage device, and the storage layout information of the first file indicates that the storage devices storing the first file do not include the second storage device, pull the data of the first file from the device that stores the data of the first file to the second storage device.
  • the computing device 140 further includes a metadata update module 1403, and the metadata update module 1403 is also used to:
  • a first change is performed on the metadata of the first file, and the first change indicates that the second storage device is added to the storage device indicated by the storage layout information of the first file.
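  • The pull path on the second storage device side can be sketched as follows. The helper `pull_if_owned`, the `read_remote` callback, and the choice of the first layout entry as the source are all assumptions for illustration; the sketch only mirrors the condition and the subsequent layout change described above.

```python
def pull_if_owned(meta, local_device, read_remote, local_store):
    """Pull the file when it belongs to the local device but is not yet stored on it."""
    layout = meta["layout"]
    if meta["owner"] != local_device or local_device in layout:
        return False
    source = layout[0]  # assumed: any device listed in the layout can serve the data
    local_store[meta["file_id"]] = read_remote(source, meta["file_id"])
    # Change to the metadata: add the local device to the storage layout information.
    layout.append(local_device)
    return True


meta = {"file_id": "file-1", "owner": "dev-B", "layout": ["dev-A"]}
remote = {("dev-A", "file-1"): b"payload"}  # hypothetical remote data source
local_store = {}


def read_remote(device, file_id):
    return remote[(device, file_id)]


pulled = pull_if_owned(meta, "dev-B", read_remote, local_store)
again = pull_if_owned(meta, "dev-B", read_remote, local_store)
print(pulled, again, meta["layout"])
```

  • Because the layout is updated only after the data has been stored locally, repeating the call is harmless in this sketch.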
  • the ownership information of the first file is an identification of the second storage device, and before the first change, the storage layout information of the first file does not include the identification of the second storage device;
  • the metadata update module 1403 is also used to:
  • the storage layout information of the first file indicates that the storage device that stores the first file includes the first storage device.
  • the computing device 140 also includes a communication module 1404 for:
  • the pull request is used to instruct a first computing device to push the first file; the first computing device is located in the first storage device or is connected to the first storage device.
  • the storage layout information of the first file indicates that the storage device that stores the first file includes a shared storage area.
  • the migration module 1402 is used for:
  • the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device.
  • the migration module 1402 is also used to:
  • the first file belongs to the target file system, the metadata of the first file is included in the metadata of the target file system, and the metadata of the target file system is stored in the global metadata service;
  • the metadata acquisition module 1401 is also used to:
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records, and each metadata record includes an identifier of a node and an attribute of the node, where a node is a file or directory, and the attributes of the node include the node's ownership information and the node's storage layout information.
  • the first change to the metadata of the first file includes:
  • a first metadata record is appended to the end of the metadata of the target file system.
  • the first metadata record includes the identification of the first file and the storage layout information of the first file.
  • the storage layout information includes the identification of the second storage device.
  • the computing device 140 further includes a view providing module 1406, which is configured to provide a local file view of the first storage device, where the local file view indicates a hierarchical structure of multiple files stored on the first storage device, and the storage layout information of the multiple files indicates the first storage device.
  • the computing device 140 further includes a view providing module 1406, which is configured to provide a home file view of the first storage device, where the home file view includes information about multiple files, and the ownership information of the multiple files indicates the first storage device.
  • files belonging to the first storage device and files belonging to the second storage device are federated to form a global file system.
  • the computing device also includes a view providing module 1406 for providing a global file view that includes information about files belonging to the first storage device and information about files belonging to the second storage device.
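  • The three views differ only in how files are filtered. The sketch below assumes a flat list of file entries with hypothetical `path`, `owner`, and `layout` fields: the local file view filters on the storage layout information, the home file view filters on the ownership information, and the global file view applies no filter.

```python
def local_view(files, device):
    """Files whose storage layout information includes the device."""
    return sorted(f["path"] for f in files if device in f["layout"])


def home_view(files, device):
    """Files whose ownership information indicates the device."""
    return sorted(f["path"] for f in files if f["owner"] == device)


def global_view(files):
    """All files of the global file system, regardless of owner or layout."""
    return sorted(f["path"] for f in files)


files = [
    {"path": "/a.txt", "owner": "dev-A", "layout": ["dev-A"]},
    {"path": "/b.txt", "owner": "dev-B", "layout": ["dev-A"]},  # migration still pending
    {"path": "/c.txt", "owner": "dev-B", "layout": ["dev-B"]},
]

print(local_view(files, "dev-A"))
print(home_view(files, "dev-A"))
print(home_view(files, "dev-B"))
print(global_view(files))
```

  • The entry `/b.txt` shows why the views can differ: during a pending migration a file appears in the home file view of one device and in the local file view of another.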
  • the computing device 140 may include a communication module 1404, a migration module 1402, and a metadata update module 1403.
  • the computing device 140 is used to implement the method on the first storage device side in the embodiment shown in FIG. 12 .
  • the communication module 1404 is configured to receive a pull request for the first file from a second computing device, and the second computing device is connected to the second storage device;
  • the migration module 1402 is used to push the data of the first file to the shared storage area
  • the metadata update module 1403 is configured to perform a first change on the metadata of the first file to trigger the second computing device to obtain the data of the first file from the shared storage area and store it in the second storage device; the first change indicates that the shared storage area is added to the storage devices indicated by the storage layout information of the first file; after the first change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the shared storage area, and do not include the second storage device.
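  • The hand-off through the shared storage area can be sketched with two dictionaries standing in for the device stores and one for the shared area. The identifier `shared-area` and the functions `push_via_shared` and `pull_from_shared` are hypothetical; the sketch only assumes that the shared storage area is listed in the storage layout information like any other storage location.

```python
SHARED = "shared-area"  # hypothetical identifier of the shared storage area


def push_via_shared(meta, local_store, shared_area):
    """First device: copy the data to the shared area, then record it in the layout."""
    shared_area[meta["file_id"]] = local_store[meta["file_id"]]
    if SHARED not in meta["layout"]:
        meta["layout"].append(SHARED)


def pull_from_shared(meta, local_device, local_store, shared_area):
    """Second device: fetch the data once the layout lists the shared area."""
    if meta["owner"] != local_device or SHARED not in meta["layout"]:
        return False
    local_store[meta["file_id"]] = shared_area[meta["file_id"]]
    meta["layout"].append(local_device)
    return True


meta = {"file_id": "file-1", "owner": "dev-B", "layout": ["dev-A"]}
store_a, store_b, shared = {"file-1": b"payload"}, {}, {}

push_via_shared(meta, store_a, shared)
layout_after_push = list(meta["layout"])
pulled = pull_from_shared(meta, "dev-B", store_b, shared)
print(layout_after_push, pulled, meta["layout"])
```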
  • the ownership information of the first file is an identification of the second storage device, and before the first change, the storage layout information of the first file includes the identification of the first storage device and does not include the identification of the second storage device;
  • the metadata update module 1403 is also used to:
  • the first file belongs to the target file system
  • the metadata of the first file is included in the metadata of the target file system
  • the metadata of the target file system is synchronized among multiple devices, the plurality of devices including the first computing device.
  • the communication module 1404 is also used to:
  • a first notification is sent, the first notification indicating that the metadata of the first file has changed.
  • the metadata of the target file system is in a streaming structure and includes multiple metadata records, and each metadata record includes an identifier of a node and an attribute of the node, where the node is a file or directory, and the attributes of the node include the ownership information of the node and the storage layout information of the node;
  • the metadata update module 1403 is used for:
  • a first metadata record is appended to the end of the metadata of the target file system.
  • the first metadata record includes the identification of the first file and the storage layout information of the first file.
  • the storage layout information indicates that the storage device storing the first file includes the first storage device and the shared storage area.
  • the computing device 140 includes a metadata acquisition module 1401, which acquires the second modified metadata of the first file.
  • the computing device 140 includes a deletion control module 1405, and the deletion control module 1405 is configured to delete the data of the first file on the first storage device when the second modified metadata of the first file is obtained.
  • the metadata update module 1403 is configured to perform a third change on the metadata of the first file, where the third change indicates that the first storage device is deleted from the storage devices indicated by the storage layout information of the first file; after the third change, the storage layout information of the first file indicates that the storage devices storing the first file do not include the first storage device.
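  • The clean-up on the first storage device side can be sketched as one function that deletes the local copy and then removes the local device from the storage layout information. The name `finish_on_source` and the guard conditions are assumptions for illustration; in particular, the sketch deletes only when the owner is already listed in the layout.

```python
def finish_on_source(meta, local_device, local_store):
    """Delete the local copy and drop the local device from the layout, if it is safe."""
    layout = meta["layout"]
    owner = meta["owner"]
    # Guard (assumed): never delete unless the owning device already holds the data.
    if owner == local_device or owner not in layout or local_device not in layout:
        return False
    local_store.pop(meta["file_id"], None)  # delete the data of the file on this device
    layout.remove(local_device)             # third change to the storage layout information
    return True


meta = {"file_id": "file-1", "owner": "dev-B", "layout": ["dev-A", "dev-B"]}
store_a = {"file-1": b"payload"}

cleaned = finish_on_source(meta, "dev-A", store_a)
repeated = finish_on_source(meta, "dev-A", store_a)
print(cleaned, repeated, meta["layout"], store_a)
```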
  • deletion control module 1405 is also used to:
  • the computing device 140 further includes a view providing module 1406, which is configured to provide a home file view of the first storage device, where the home file view includes information about multiple files, and the ownership information of the multiple files indicates the first storage device.
  • the files belonging to the first storage device and the files belonging to the second storage device belong to the global file system.
  • the computing device 140 also includes a view providing module 1406 for providing a global file view that includes information about files belonging to the first storage device and information about files belonging to the second storage device.
  • FIG. 15 shows a schematic structural diagram of a computing device 150 provided by an embodiment of the present application.
  • Computing device 150 is a device with computing capabilities.
  • the device here may be a physical device, such as a controller, a processor, a server (such as a rack server), a host, etc., or it may be a virtual device, such as a virtual machine or a container, etc.
  • the computing device 150 includes: a processor 1502 and a memory 1501, optionally including a bus 1504 and a communication interface 1503.
  • the processor 1502, the memory 1501, etc. communicate through the bus 1504. It should be understood that this application does not limit the number of processors and memories in the computing device 150.
  • the memory 1501 is used to provide storage space, and the storage space can optionally store application data, user data, operating systems, computer programs, etc.
  • Memory 1501 may include volatile memory, such as random access memory (RAM).
  • the memory 1501 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD), etc.
  • the processor 1502 is a module that performs calculations and may include any one or more of a controller (such as a storage controller), a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), a digital signal processor (DSP), a co-processor (which assists the central processor in completing corresponding processing and applications), an application specific integrated circuit (ASIC), a microcontroller unit (MCU), a virtual machine, a container, etc.
  • the communication interface 1503 is used to provide information input or output to the at least one processor. And/or, the communication interface 1503 may be used to receive data sent from the outside and/or send data to the outside.
  • the communication interface 1503 may be a wired link interface such as an Ethernet cable, or a wireless link (Wi-Fi, Bluetooth, universal wireless transmission, other wireless communication technologies, etc.) interface.
  • the communication interface 1503 may also include a transmitter (such as a radio frequency transmitter, an antenna, etc.) or a receiver coupled with the interface.
  • the bus 1504 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one line is used in Figure 15, but it does not mean that there is only one bus or one type of bus.
  • Bus 1504 may include a path that carries information between various components of computing device 150 (e.g., memory 1501, processor 1502, communication interface 1503).
  • the memory 1501 stores executable instructions
  • the processor 1502 executes the executable instructions to implement the aforementioned data migration method, such as the data migration method in the embodiments of Figure 8, Figure 10, or Figure 12. That is, the memory 1501 stores instructions for executing the data migration method.
  • Embodiments of the present application also provide a computing device cluster, which includes at least one computing device 150, and each computing device 150 includes a processor 1502 and a memory 1501;
  • the processor 1502 of at least one computing device 150 is configured to execute instructions stored in the memory 1501 of the at least one computing device 150, so that the computing device cluster implements the aforementioned data migration method, such as the data migration method in the embodiments shown in Figure 8, Figure 10 or Figure 12.
  • instructions for executing the data migration method are stored in the memory.
  • An embodiment of the present application also provides a storage device.
  • the storage device includes a storage disk, and a computing device as shown in Figure 14 or a computing device as shown in Figure 15.
  • the storage disk is used to provide space for storing file data
  • the computing device is used to implement the aforementioned data migration method, such as the method on the first storage device side and/or the method on the second storage device side in the embodiments shown in Figure 8, Figure 10, or Figure 12.
  • the relevant description of the storage device may also refer to the description of the first computing device and the second computing device in the embodiments such as FIG. 1, FIG. 2, FIG. 3, FIG. 4, etc.
  • the storage device can be a storage product provided by a storage manufacturer.
  • the storage device may include storage products Dorado or Pacific provided by Huawei.
  • Embodiments of the present application provide a computer-readable storage medium. Instructions are stored in the computer-readable storage medium. When the instructions are run on at least one processor, the aforementioned data migration method is implemented, for example, the data migration method in the embodiments shown in Figure 8, Figure 10 or Figure 12.
  • the computer-readable storage medium may be any available medium that a computing device can store, or a data storage device such as a data center containing one or more available media.
  • the computer-readable storage medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state drive), etc.
  • the present application provides a computer program product.
  • the computer program product includes computer instructions.
  • when the instructions are run on at least one processor, the aforementioned data migration method is implemented, for example, the data migration method in the embodiments of Figure 8, Figure 10, or Figure 12.
  • the computer program product can be a software installation package or image package. If the foregoing method needs to be used, the computer program product can be downloaded and executed on the computing device.
  • “At least one” mentioned in the embodiments of this application means one or more, and “multiple” means two or more. “At least one of the following” or similar expressions refer to any combination of the listed items, including a single item or any combination of a plurality of items.
  • at least one of a, b, or c can represent: a, b, c, (a and b), (a and c), (b and c), or (a and b and c), where a, b, c can be single or multiple.
  • “And/or” describes the relationship between related objects, indicating that there can be three relationships. For example, A and/or B can mean: A alone exists, A and B exist simultaneously, and B exists alone, where A and B can be singular or plural. The character "/" generally indicates that the related objects are in an "or” relationship.
  • the use of “first” and “second” in the embodiments of this application is to distinguish multiple objects and is not used to limit the order, timing, priority, or importance of the objects.
  • first storage device and the second storage device are just for convenience of description and do not indicate the differences in device structure, deployment order, importance, etc. between the first storage device and the second storage device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

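A data migration method and related apparatus. A migration scheduling apparatus determines a migration task for migrating the data of a first file from a source device to a target device, and performs a first change on the metadata of the first file, so as to trigger the source device or the target device to complete the migration task based on the first change to the metadata of the first file. The method triggers migration of a file's data based on changes to the file's metadata (for example, changes to metadata such as the file's ownership or storage layout), and the progress of the migration can also be reflected by the state of the file's metadata. The method can improve data migration efficiency and make data easier to use and manage; it also decouples the devices involved in migration, greatly improving the flexibility and scalability of the service system.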

Description

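Data migration method and related apparatus
Technical Field
This application relates to the fields of information technology (IT) and storage technology, and in particular to a data migration method and related apparatus.
Background
As the scale of user services grows, a single storage device can no longer meet service requirements, and the data of a user's services may be stored across multiple storage devices. Some users' services are broad in scope and may involve storage devices in different regions or in different data centers. In such cases, data migration is frequently required while the services run. For example, storage device A has a lower capacity but higher storage performance, and storage device B has a higher capacity but lower storage performance. When the access performance that a service requires for certain data stored on device A (for example, DATA1) drops, DATA1 can be moved to device B through a migration operation, optimizing the overall storage cost.
Current data tiering methods mainly include tiering based on an intermediate device and migration based on replication. The first method introduces an external migration controller, which reads data from the source device and writes it to the destination device. The second method establishes a bidirectional channel between the source device and the destination device, and the source device controls and performs the writing of data to the destination device. In both methods, the devices must become aware of one another before migration: in the first, the migration controller must establish connections and access security control with both the source device and the destination device; in the second, the source device and the destination device must establish a connection and access security control with each other. This mutual awareness and security control complicates the migration procedure, lowers migration efficiency, and can easily interfere with users' normal use of the data.
Summary
The embodiments of this application provide a data migration method and apparatus that enable state-based data migration, improving data migration efficiency and making it more convenient for users to use and manage data.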
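In a first aspect, an embodiment of this application provides a data migration method. The method includes:
determining a migration task for a first file, where the data of the first file is stored on a first storage device, and the migration task for the first file indicates that the data of the first file is to be migrated from the first storage device to a second storage device;
performing a first change on the metadata of the first file to trigger execution of the migration task for the first file, where the first change indicates that the storage device that owns the first file changes from the first storage device to the second storage device.
Optionally, the method is implemented by a migration scheduling apparatus.
In the embodiments of this application, a migration operation on a file is triggered by changing the file's metadata: when the file is owned by the second storage device while its data is still stored on the first storage device, the data of the first file is migrated from the first storage device to the second storage device. Data migration is thus triggered by file state (attribute states such as the file's ownership and storage layout). Any device able to change the file's metadata can change the file state and thereby trigger the migration, without establishing data-access security control with the first storage device and the second storage device for the purpose of migration. This simplifies the security control procedure of data migration, improves migration efficiency, and makes it more convenient for users to use and manage data. In particular, for services that span multiple storage devices or multiple data centers, state-based data migration further decouples the functions of the devices and greatly improves the flexibility and scalability of the service system.
In addition, during migration with this method the storage location of the file's data is still indicated by the storage layout information, so the method need not affect users' normal use of the file's data, which improves the stability of the service system.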
在第一方面的一种可能的实施方式中,第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息。进行第一变更以前,所述第一文件的归属信息指示的存储设备为第一存储设备,所述第一文件的存储布局信息指示存储第一文件的存储设备包含第一存储设备且不包含第二存储设备。
可选的,第一文件的归属信息为第一存储设备的标识,第一文件的存储布局信息包含第一存储设备的标识且不包含第二存储设备的标识。
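In yet another possible implementation of the first aspect, performing the first change on the metadata of the first file includes:
changing the ownership information of the first file from the identifier of the first storage device to the identifier of the second storage device.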
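The trigger described above can be illustrated with a minimal sketch. The names `FileMetadata`, `MigrationTask`, `apply_first_change`, and `migration_pending` are hypothetical and do not come from the application; the sketch only assumes that ownership is a single device identifier and that the storage layout is a set of device identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class FileMetadata:
    """Hypothetical per-file metadata: an owner plus a storage layout."""
    file_id: str
    owner: str                                 # ownership information (device ID)
    layout: set = field(default_factory=set)   # IDs of devices holding the data


@dataclass(frozen=True)
class MigrationTask:
    """A migration task: file ID, source device ID, destination device ID."""
    file_id: str
    source: str
    destination: str


def apply_first_change(meta: FileMetadata, task: MigrationTask) -> FileMetadata:
    """Change only the ownership information; the data itself is not touched."""
    if meta.file_id != task.file_id:
        raise ValueError("task does not target this file")
    if meta.owner != task.source or task.source not in meta.layout:
        raise ValueError("file is not owned by and stored on the source device")
    meta.owner = task.destination
    return meta


def migration_pending(meta: FileMetadata) -> bool:
    """True while the owning device does not yet appear in the storage layout."""
    return meta.owner not in meta.layout


meta = FileMetadata("file-1", owner="dev-A", layout={"dev-A"})
apply_first_change(meta, MigrationTask("file-1", "dev-A", "dev-B"))
print(meta.owner, sorted(meta.layout), migration_pending(meta))
```

Note that the scheduler in this sketch never contacts either storage device; the mismatch between owner and layout is the only signal that a migration is outstanding.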
在第一方面的又一种可能的实施方式中,针对第一文件的迁移任务包括所述第一文件的标识、所述第一存储设备的标识和所述第二存储设备的标识。
在第一方面的又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据在多个设备之间同步。
其中,在多个设备之间同步,是指:可以被多个设备中的任意一个设备改动,改动后的内容能够被该多个设备获知到,且多个设备获知的目标系统的元数据是一致的。
上述多个设备包含第一计算设备和第二计算设备,第一计算设备位于第一存储设备中或与第一存储设备相连,第二计算设备位于第二存储设备中或与第二存储设备相连。
在这种实施方式中,由于目标文件系统的元数据在多个设备之间同步,当对第一文件的元数据进行变更时,多个设备都能够读取文件系统的元数据的变更,使得第一计算设备和第二计算设备能够基于元数据的变更触发迁移操作。
示例性的,源设备可以同步文件系统的元数据,了解第一文件的归属信息的变更,从而触发对第一文件的数据的推送操作。
示例性的,目的设备可以同步文件系统的元数据,了解第一文件的归属信息的变更,从而触发对第一文件的数据的拉取操作。
另外,目标文件系统的元数据在多个设备之间同步,使得多个设备可以使用一致的元数据表示目标文件系统的层次结构和文件(和/或目录)的信息,能够方便地实现某个文件系统的联合、互通,有利于对文件系统进行管理。
进一步的,上述多个设备还可以包含迁移调度装置。
在第一方面的又一种可能的实施方式中,在所述对所述第一文件的元数据进行第一变更之后,所述方法还包括:
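sending a first notification, where the first notification indicates that the metadata of the first file has changed, so that the first computing device or the second computing device obtains, according to the first notification, the metadata of the first file after the first change and executes the migration task for the first file according to that metadata.
Optionally, the first notification indicates which changes were made to the metadata of the first file. For example, the first notification may contain the content of the first change, such as the identifier of the first file and the attributes (or attribute values) of the file altered by the first change. In this case, the first computing device and the second computing device can derive the metadata of the first file after the first change from the metadata before the first change and the content of the first change, and execute the migration task according to the result. As another example, the first notification may contain the metadata of the first file after the first change, in which case the first computing device and the second computing device can execute the migration task directly according to that metadata.
Optionally, the first notification indicates that the first change occurred but contains neither the specific content of the change nor the metadata of the first file after the first change. In this case, the first computing device and/or the second computing device may, in response to the first notification, request the metadata of the first file after the first change from the migration scheduling apparatus, and execute the migration task according to the metadata that the migration scheduling apparatus provides.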
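The three forms of notification described above (changed attributes only, full changed metadata, or a bare notice) can be handled by one receiver-side routine. The following is a sketch under assumptions: notifications and metadata are plain dictionaries, and the key names `delta`, `metadata`, and `file_id` as well as the `fetch` callback are hypothetical.

```python
from typing import Callable, Optional


def metadata_after_change(
    local: dict,
    notification: dict,
    fetch: Optional[Callable[[str], dict]] = None,
) -> dict:
    """Return the post-change metadata of the file named in the notification."""
    if "metadata" in notification:
        # The notification carries the full metadata after the change.
        return dict(notification["metadata"])
    if "delta" in notification:
        # The notification carries only the changed attributes; merge them
        # into a copy of the locally held pre-change metadata.
        merged = dict(local)
        merged.update(notification["delta"])
        return merged
    # Bare notice: the receiver has to ask another party for the metadata.
    if fetch is None:
        raise ValueError("bare notification needs a fetch callback")
    return fetch(notification["file_id"])
```

In the third case the `fetch` callback stands in for a request to whichever party holds the changed metadata, such as the migration scheduling apparatus.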
在第一方面的又一种可能的实施方式中,在触发执行针对第一文件的迁移任务之后,所述方法还包括:
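obtaining the metadata of the first file after a second change, where the second change is performed by the first computing device or the second computing device and indicates a change in the storage layout information of the first file;
determining the migration progress of the first file according to the metadata of the first file after the second change.
By monitoring the storage layout information, it can be determined whether the data of the first file already resides on the owning device, so that the migration progress of the file is known, which improves the user experience. In addition, by tracking task progress, tasks that are progressing slowly or have failed can be handled promptly, improving system stability.
Furthermore, because progress monitoring relies on changes to the file's metadata, it requires no data exchange with the device performing the migration, which decouples progress tracking from migration execution control.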
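A minimal sketch of metadata-only progress tracking follows. The state names `pending`, `copied`, and `done` are illustrative assumptions rather than terms from the application; the function reads nothing but the ownership and storage layout information.

```python
def migration_progress(owner: str, layout: set, source: str) -> str:
    """Classify migration progress from ownership and storage layout alone."""
    if owner not in layout:
        return "pending"   # data has not reached the owning device yet
    if source in layout and source != owner:
        return "copied"    # owner holds the data; source copy is still listed
    return "done"          # the source no longer appears in the layout


print(migration_progress("dev-B", {"dev-A"}, "dev-A"))
print(migration_progress("dev-B", {"dev-A", "dev-B"}, "dev-A"))
print(migration_progress("dev-B", {"dev-B"}, "dev-A"))
```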
在第一方面的又一种可能的实施方式中,目标文件系统的元数据在多个设备的本地进行存储。当某一设备对目标文件系统进行了变更时,该设备向存储了目标文件系统的元数据的其他设备通知该变更,其他设备基于通知来相应变更本地存储的目标文件系统的元数据,从而实现目标文件系统的元数据在多个设备上的同步。
示例性的,迁移调度装置、第一计算设备和第二计算设备都在本地存储了第一文件系统的元数据。当迁移调度装置对目标文件系统的元数据中的第一文件的元数据进行了第一变更时,迁移调度装置可以发送第一通知,第一通知指示第一文件的元数据产生了第一变更,第一计算设备和第二计算设备基于第一通知来相应变更本地存储的目标文件系统的元数据,从而实现目标文件系统的元数据在多个设备上的同步。
在第一方面的又一种可能的实施方式中,所述目标文件系统的元数据存储在全局元数据服务。其中,全局元数据服务能够存储目标文件系统的元数据。
进一步的,全局元数据服务能够支持对目标文件系统的元数据的访问和更新。具体的,当某一设备对目标文件系统的元数据进行了变更时,该变更被提供给全局元数据服务,多个设备可以从全局元数据服务处访问经过变更的目标文件系统的元数据,从而实现目标文件系统的元数据在多个设备上的同步。
通过全局元数据服务来对文件系统的元数据进行管理、提供访问和更新,则多个设备都按照全局元数据服务中的元数据的格式来读取或者写入元数据,统一了文件的元数据的表示方式,屏蔽了异构的存储设备之间的元数据管理和访问控制的差异,不仅提升用户在数据使 用和管理上的便捷性,还能够提升系统的可扩展性和灵活性。
例如,当有新的存储设备需要共享文件系统的元数据时,通过使用全局元数据服务能够加入共享;类似的,当存储设备退出共享时,可以通过断开与全局元数据服务之间的功能交互来实现退出共享。总之,上述实施方式使得业务的扩展和收缩更灵活且易于实现。
在第一方面的又一种可能的实施方式中,在所述获取所述第一文件的第二变更后的元数据之前,所述方法还包括:
接收第二通知,所述第二通知指示所述第一文件的元数据发生了变更。
在第一方面的又一种可能的实施方式中,所述迁移调度装置、所述第一计算设备和所述第二计算设备均维护有所述目标文件系统的元数据;所述获取所述第一文件的第二变更后的元数据,包括:
根据所述第二通知获取所述第一文件的第二变更后的元数据。
可选的,第二通知包含第二变更的内容,或者,第二通知包含所述第一文件的第二变更后的元数据。
作为一种可能的方案,第二通知包含所述第二变更的内容。例如,第二变更的内容可以包含第一文件的标识和第二变更所改变文件的属性(或属性的值)。此时,迁移调度装置可以根据第一文件的第二变更前的元数据和第二变更的内容得到第一文件的第二变更后的元数据。
作为又一种可能的方案,所述第二通知包含所述第一文件的第二变更后的元数据。此时,迁移调度装置可以根据第二通知得到第一文件的第二变更后的元数据。
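In yet another possible implementation of the first aspect, notifications (for example, the first notification or the second notification) may be sent by means of a message queue. The sender writes the message to the message queue, and the receiver receives the notification by reading the message queue, which further reduces the coupling between functional modules.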
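As a rough in-process illustration of queue-based notification, the sketch below uses Python's standard `queue.Queue`. A real deployment would presumably use a message queue service shared between devices; the message fields `file_id` and `change` are assumptions made for the example.

```python
import queue


def send_notification(q: queue.Queue, file_id: str, change=None) -> None:
    """Sender side: write a metadata-change notice to the queue."""
    q.put({"file_id": file_id, "change": change})


def drain(q: queue.Queue) -> list:
    """Receiver side: read every notice currently waiting, in arrival order."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


notifications = queue.Queue()
send_notification(notifications, "file-1", {"owner": "dev-B"})
send_notification(notifications, "file-2")
for notice in drain(notifications):
    print(notice["file_id"], notice["change"])
```

The sender and receiver share only the queue, so neither needs a reference to the other.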
在第一方面的又一种可能的实施方式中,所述迁移调度装置、所述第一计算设备和所述第二计算设备均维护有所述目标文件系统的元数据;
所述获取所述第一文件的第二变更后的元数据,包括:
向所述第一计算设备或所述第二计算设备发送用于获取所述第一文件的变更后的元数据的请求;
根据所述第一计算设备或所述第二计算设备对所述请求的响应获取所述第一文件的第二变更后的元数据。
在第一方面的又一种可能的实施方式中,所述目标文件系统的元数据存储在全局元数据服务中且通过所述全局元数据服务在所述多个设备之间同步;
所述获取所述第一文件的第二变更后的元数据,包括:
从所述全局元数据服务获取所述第一文件的所述第二变更后的元数据。
在第一方面的又一种可能的实施方式中,全局元数据服务提供服务接口,设备可以调用服务接口来实现对元数据的访问和更新。
其中,服务接口是一种通信接口,例如应用程序接口(application programming interface,API),能够用于不同的功能模块之间数据交互并提供服务。通过抽象的服务接口,可以将调用者和实现者解耦和,例如调用服务接口的设备可以按照服务接口的要求提供相关的数据,而全局元数据服务可以通过服务接口获取相关的数据并实现相对应的功能,不仅提 升了访问、更新元数据的效率,也提高了系统的可扩展性和灵活性。
在第一方面的又一种可能的实施方式中,所述目标文件系统的元数据存储在全局元数据服务中且通过所述全局元数据服务在所述多个设备之间同步;
所述对所述第一文件的元数据进行第一变更,包括:
通过所述全局元数据服务提供的服务接口来实现所述第一变更。
在第一方面的又一种可能的实施方式中,所述全局元数据服务位于所述多个设备中的任意一个设备上,或者位于所述多个设备之外的任意一个设备上。
示例性的,所述全局元数据服务位于第三计算设备,第三计算设备可以与第一计算设备或第二计算设备相同的计算设备,也可以是二者之外的另一计算设备。
可选的,全局元数据服务的服务接口可以是由第三计算设备向迁移调度装置提供的。或者,第三计算设备可以向迁移调度装置提供另一个接口(便于区分称为第一接口),通过调用该第一接口可以实现调用全局元数据服务的服务接口的功能。
在第一方面的又一种可能的实施方式中,所述目标文件系统的元数据为表式结构且元数据可以被修改。其中,表式结构是一种包含行和列的数据结构,每一行(或者每一列)包含多个值,每个值对应了一个属性。
表式结构的元数据可以增加一行(或一列)元数据、删除一行(或一列)元数据,也可以修改元数据中已有的属性值。也即是,第一变更可以通过修改目标文件系统的元数据的方式实现。
在第一方面的又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息。
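A stream structure is a data structure containing multiple pieces of information, each of which is one metadata record. A stream structure is read-only, append-only, and ordered. "Read-only" means that the values of records in the stream can only be read, not modified. "Append-only" means that new records can only be appended to the stream and existing records cannot be deleted (or modified), although multiple records belonging to the same node may be merged into one record. "Ordered" means that the records in the stream have a logical order and appended records are added at the tail of the stream.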
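The stream structure can be sketched as an append-only log in which the current attributes of a node are obtained by merging its records in order. The class `MetadataStream` and its method names are hypothetical; the sketch assumes that a later record for the same node overrides earlier values attribute by attribute.

```python
class MetadataStream:
    """Append-only, ordered log of (node_id, attributes) metadata records."""

    def __init__(self):
        self._records = []

    def append(self, node_id: str, **attrs) -> int:
        """Add a record at the tail and return its position in the stream."""
        self._records.append((node_id, dict(attrs)))
        return len(self._records) - 1

    def since(self, offset: int) -> list:
        """Return copies of the records from the given position onward."""
        return [(nid, dict(attrs)) for nid, attrs in self._records[offset:]]

    def current(self, node_id: str) -> dict:
        """Merge all records of one node; later records win per attribute."""
        merged = {}
        for nid, attrs in self._records:
            if nid == node_id:
                merged.update(attrs)
        return merged


stream = MetadataStream()
stream.append("file-1", owner="dev-A", layout=("dev-A",))
offset = stream.append("file-1", owner="dev-B")   # ownership change as a new record
print(stream.current("file-1"))
print(stream.since(offset))
```

A device that remembers the last position it read can call `since` to learn only what changed, which is one way the synchronization described in this section could be realized.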
在第一方面的又一种可能的实施方式中,所述对所述第一文件的元数据进行第一变更,包括:
在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的变更后的归属信息,所述第一文件的变更后的归属信息指示所述第一文件归属的存储设备为所述第二存储设备。
上述实施方式可以进行第一变更,并且,第一变更通过在元数据流中添加元数据记录的方式来实现。此时,其他设备通过获取元数据流的变化(追加的记录),即可获知第一文件系统中的元数据的变更,相应的可以更新第一文件系统在本地的文件的层次结构或节点的属性,有利于实现文件视图在多个设备上的同步。
在第一方面的又一种可能的实施方式中,所述确定针对第一文件的迁移任务之前,所述方法还包括:
根据外部事件信息,确定所述针对第一文件的迁移任务;
所述外部事件信息包含以下信息中的一项或者多项:网络连接情况、设备健康情况或所 述第一文件相关的人员调动状况。
上述实施方式中,数据迁移的触发,与外部事件信息相关。当这些外部事件信息达到触发迁移的条件时,则会确定对应的迁移任务,对数据进行迁移,实现综合多信息流的智能迁移,提升用户的使用体验。
以网络连接情况为例,当某个线路通信中断,预测A地的通信受影响时,可以将A地的数据迁移到B地。
以设备健康情况为例,当某个存储设备的存储访问储能力达到预置下限,则将该存储设备上的数据迁移到其他存储设备上。
以第一文件相关的人员调动状况为例,当某个业务的研发团队人员出差异地,可以主动触发迁移,将该业务的数据迁移到距离出差目的地更近的存储设备。
在第一方面的又一种可能的实施方式中,所述确定针对第一文件的迁移任务,包括:
根据所述第一文件的元数据的分析结果,确定所述针对第一文件的迁移任务;其中,所述分析结果包含以下一项或者多项信息:所述第一文件的冷热状态、所述第一文件的安全性或所述第一文件相关的业务。
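As an example, the hot/cold state of the file may be indicated by the file's access frequency. For example, the metadata of the first file contains an attribute representing the number of times the file was accessed within a period of time. If the access count is greater than (or greater than or equal to) a first threshold, the data of the first file is migrated to a device with a high storage speed (for example, the second storage device), improving the efficiency of accessing the data of the first file and the quality of service of the system. Similarly, if the access count of the first file is less than (or less than or equal to) a second threshold, the data of the file is migrated to a device with a high storage capacity, reducing storage cost. Optionally, the first threshold and the second threshold may be entered by an administrator (for example, a developer or a management department) or a vendor, or may be preset.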
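The threshold rule above can be written as a small decision function. This is a sketch only: the tier labels `fast` and `capacity` are made up for the example, and the inclusive comparisons are one of the two variants the text allows.

```python
from typing import Optional


def tiering_decision(access_count: int, hot_threshold: int,
                     cold_threshold: int) -> Optional[str]:
    """Pick a target tier from the access count, or None to leave the file."""
    if cold_threshold > hot_threshold:
        raise ValueError("cold threshold must not exceed hot threshold")
    if access_count >= hot_threshold:
        return "fast"       # hot file: move to a high-speed device
    if access_count <= cold_threshold:
        return "capacity"   # cold file: move to a high-capacity device
    return None             # neither hot nor cold: no migration task


print(tiering_decision(120, hot_threshold=100, cold_threshold=10))
print(tiering_decision(5, hot_threshold=100, cold_threshold=10))
print(tiering_decision(50, hot_threshold=100, cold_threshold=10))
```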
作为一种示例,第一文件的元数据中可以包含表示文件的安全等级的属性。例如,若第一文件的安全等级为高级,而第一存储设备安全等级不满足第一文件的安全等级需求,则将第一文件的数据迁移到能够满足第一文件的安全等级的需求的设备上,有效保障用户对文件的安全性需求,提升系统的服务质量。
作为一种示例,第一文件的元数据中包含表示第一文件相关的业务的属性。例如,第一文件的相关业务为车载业务、视频业务、或文件下载业务等。示例性的,若第一文件用于存储车载服务的数据,当车载服务的数据需要迁移到第二存储设备时,第一文件的数据也对应的迁移。如此,用户可以基于不同的业务来对文件进行迁移,提升了用户管理业务数据的便捷性用户,提升系统的服务质量。
可以看出,元数据的分析结果能够指示对文件的迁移需求(例如访问需求、安全需求、业务需求等),基于迁移需求来确定迁移任务,能够实现整体的存储优化。另外,用户通过更新文件的元数据就可以表达对文件的迁移需求,实现了对数据的智能化管理,提升用户在数据使用和管理上的便捷性。
在第一方面的又一种可能的实施方式中,所述确定针对第一文件的迁移任务,包括:
根据用户输入的针对所述第一文件的迁移指示,确定所述针对第一文件的迁移任务。
这种实施方式中,用户可以通过输入迁移指示来实现针对某一文件的迁移,这样可以满足用户的个性化需求,提升用户体验。
在第一方面的又一种可能的实施方式中,所述方法还包括:
确定针对第二文件的迁移任务;
编排所述针对第一文件的迁移任务和所述针对第二文件的迁移任务的执行顺序。
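As one possible design, orchestrating tasks may include determining the execution order and execution priority of the tasks, for example, which file is migrated first. In this way, the migration tasks of some files can be executed first according to the urgency of the requirements, improving the user experience. For example, a file whose access frequency surges within a short time can be migrated with priority, raising its access rate as quickly as possible.
As another possible design, multiple tasks may be merged during orchestration. For example, if task A indicates that the first file is to be migrated from the first storage device to the second storage device, and task B indicates that the first file is to be migrated from the first storage device to a third storage device, task A and task B may be merged into a new task that indicates that the first file is to be migrated from the first storage device to the third storage device. This reduces both the probability of errors during task execution and the computing power that task execution consumes, effectively improving execution efficiency.
In short, in the above implementations tasks are orchestrated before they are executed, so that the migration tasks run in a reasonable order and manner, improving the user experience.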
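The task-merging design described above can be sketched as follows. The rule used here, that a later task for the same file and source replaces the earlier one, is an assumption chosen to reproduce the A/B example; the application does not state a general merge rule.

```python
def merge_tasks(tasks):
    """Collapse tasks sharing (file, source); the later destination wins."""
    merged = {}
    for file_id, source, destination in tasks:
        # Re-assigning an existing key keeps its original position in the dict.
        merged[(file_id, source)] = (file_id, source, destination)
    return list(merged.values())


tasks = [
    ("file-1", "dev-1", "dev-2"),   # task A
    ("file-2", "dev-1", "dev-2"),
    ("file-1", "dev-1", "dev-3"),   # task B supersedes task A
]
print(merge_tasks(tasks))
```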
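In a second aspect, an embodiment of this application provides a data migration method applied to a first computing device, where the first computing device is located in a first storage device or is connected to the first storage device, and the data of a first file is stored on the first storage device. The method includes:
obtaining the metadata of the first file, where the metadata of the first file contains ownership information of the first file and storage layout information of the first file;
when it is determined that the storage device indicated by the ownership information of the first file is a second storage device, and that the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and do not include the second storage device, migrating the data of the first file from the first storage device to the second storage device.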
本申请实施例通过变更文件的归属信息来触发针对文件的迁移操作,当文件的归属信息指示第二存储设备、而文件的存储布局信息包含指示存储所述第一文件的存储设备不包含所述第二存储设备且包含所述第一存储设备时,则第一存储设备将第一文件从所述第一存储设备迁移到第二存储设备。本申请实施例实现了基于文件状态(文件的归属、文件的存储布局等属性状态)来触发数据迁移,使得第一计算设备在获取到文件的归属信息产生变更后即可对文件进行迁移,提高了数据迁移的效率,提升在数据使用和管理上的便捷性。尤其对于包含多个存储设备或多个数据中心的业务,基于状态的数据迁移过程,可以进一步解耦各个设备的功能,极大地提高业务系统的灵活性和可扩展性。
另外,使用该方法进行迁移时,文件的存储位置仍然通过存储布局信息来指示,因此上述的数据迁移方法可以不影响用户对于文件的数据的正常使用,提升了业务系统的稳定性。
第二方面的一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据在多个设备之间同步,所述多个设备包含所述第一计算设备。
第二方面的又一种可能的实施方式中,所述目标文件系统的元数据存储在全局元数据服务中且通过所述全局元数据服务在所述多个设备之间同步;
所述获取第一文件的元数据,包括:
从所述全局元数据服务获取所述第一文件的当前的元数据。
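In yet another possible implementation of the second aspect, migrating the data of the first file from the first storage device to the second storage device includes:
pushing the data of the first file to a shared storage area, where the shared storage area is connected to the first computing device and a second computing device, and the second computing device is located in the second storage device or is connected to the second storage device;
performing a first change on the metadata of the first file to trigger the second computing device to obtain the data of the first file from the shared storage area and store it in the second storage device, where the first change indicates that the shared storage area is added to the storage devices indicated by the storage layout information of the first file. After the first change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the shared storage area, and do not include the second storage device.
In this implementation, the data of the first file is staged in an intermediate shared storage area during migration, so the source device does not need to establish a secure data access control mechanism with the destination device. This further decouples the source device from the destination device and improves the flexibility and scalability of the service system.
In addition, the storage layout information indicates that the data has been pushed to the shared storage area. In this way, the first storage device can indicate the storage location of the data to the second storage device without being aware of the second storage device, which further decouples the two storage devices and improves the flexibility and scalability of the service system.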
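The source-side behaviour can be sketched as a predicate on the metadata plus a push step. Everything concrete here is assumed for illustration: metadata is a dictionary, stores are dictionaries keyed by file identifier, and `SHARED_AREA` is a made-up identifier for the shared storage area.

```python
SHARED_AREA = "shared-area"   # hypothetical identifier of the shared storage area


def source_should_push(meta: dict, local_id: str) -> bool:
    """True when another device owns the file but only we hold its data."""
    owner, layout = meta["owner"], meta["layout"]
    return owner != local_id and local_id in layout and owner not in layout


def push_via_shared_area(meta: dict, local_id: str,
                         local_store: dict, shared_store: dict) -> bool:
    """Copy the data to the shared area, then record that in the layout."""
    if not source_should_push(meta, local_id) or SHARED_AREA in meta["layout"]:
        return False
    shared_store[meta["file_id"]] = local_store[meta["file_id"]]
    # Layout change: the layout now lists the source device and the shared area.
    meta["layout"] = set(meta["layout"]) | {SHARED_AREA}
    return True


meta = {"file_id": "file-1", "owner": "dev-B", "layout": {"dev-A"}}
local_store, shared_store = {"file-1": b"payload"}, {}
print(push_via_shared_area(meta, "dev-A", local_store, shared_store))
print(sorted(meta["layout"]))
```

The source never addresses the destination directly; the updated layout is the only message it leaves behind.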
在第二方面的又一种可能的实施方式中,所述第一文件的归属信息为所述第二存储设备的标识,在所述第一变更之前,所述第一文件的存储布局信息包含所述第一存储设备的标识且不包含所述第二存储设备的标识;
所述对所述第一文件的元数据进行第一变更,包括:
在所述第一文件的存储布局信息中增加所述共享存储区的标识。
在第二方面的又一种可能的实施方式中,在所述获取第一文件的元数据之前,所述方法还包括:
接收第一通知,所述第一通知指示所述第一文件的元数据发生了变更;
在对所述第一文件的元数据执行第一变更后,所述方法还包括:
sending a second notification, where the second notification indicates that the metadata of the first file has changed.
在第二方面的又一种可能的实施方式中,所述获取第一文件的元数据,包括:
同步目标文件系统的元数据,所述目标文件系统的元数据包含所述第一文件的元数据。
作为一种可能的实施方式,目标文件系统的元数据在多个设备的本地进行存储。当某一设备对目标文件系统进行了变更时,该设备向存储了目标文件系统的元数据的其他设备通知该变更,其他设备基于通知来相应变更本地存储的目标文件系统的元数据,从而实现目标文件系统的元数据在多个设备上的同步。
作为一种可能的实施方式,所述目标文件系统的元数据存储在全局元数据服务。其中,全局元数据服务能够存储目标文件系统的元数据。
进一步的,全局元数据服务能够支持对目标文件系统的元数据的访问和更新。
在第二方面的又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息;
所述对所述第一文件的元数据执行第一变更,包含:
在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的存储布局信息,所述第一文件的存储布局信息指示存储 所述第一文件的存储设备包含所述第一存储设备和所述共享存储区。
在第二方面的又一种可能的实施方式中,所述方法还包括:
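when the metadata of the first file after a second change is obtained, deleting the data of the first file from the first storage device, where the second change indicates a change in the storage layout information of the first file, and after the second change the storage layout information of the first file indicates that the storage devices storing the first file include both the first storage device and the second storage device;
performing a third change on the metadata of the first file, where the third change indicates that the first storage device is removed from the storage devices indicated by the storage layout information of the first file. After the third change, the storage layout information of the first file indicates that the storage devices storing the first file do not include the first storage device.
In this implementation, once the data of the first file is stored in the second storage device, the data on the source device (the first storage device) can be deleted. This avoids storing the data of one file redundantly on multiple storage devices, saves storage space, and optimizes the overall storage cost.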
在第二方面的又一种可能的实施方式中,在所述删除所述第一存储设备上的所述第一文件之前,所述方法还包括:
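marking the data of the first file as deletable, so that the operation of deleting the first file is performed while the data of the first file is in the deletable state.
For example, when the data of the first file is in use and cannot conveniently be deleted immediately, the data can be marked first and deleted after the file is no longer in use.
As another example, before a file is deleted it is first marked as deletable. A deletable file cannot be accessed in the normal way, and marked files are deleted together once a preset condition is met. The preset condition may be that the time for which a file has been marked as deletable reaches a preset duration, or that the amount of data marked as deletable reaches a preset size. This makes it easier for users to recover the data of the first file on the first storage device and reduces data loss caused by misoperation.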
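The clean-up on the source device can be sketched as a single step that first marks the data as deletable and deletes it only when it is not in use. The function name, the `deletable` set, and the returned status strings are hypothetical; the preset conditions mentioned above (duration or size) are reduced here to a simple `in_use` flag.

```python
def finish_on_source(meta: dict, local_id: str, local_store: dict,
                     deletable: set, in_use: bool) -> str:
    """Delete the local copy once the owning device also lists the data."""
    file_id, owner = meta["file_id"], meta["owner"]
    layout = set(meta["layout"])
    if owner == local_id or owner not in layout or local_id not in layout:
        return "keep"                      # nothing to clean up on this device
    deletable.add(file_id)                 # mark first, delete later
    if in_use:
        return "marked"
    local_store.pop(file_id, None)         # actual deletion of the local data
    deletable.discard(file_id)
    meta["layout"] = layout - {local_id}   # layout no longer lists this device
    return "deleted"


meta = {"file_id": "file-1", "owner": "dev-B", "layout": {"dev-A", "dev-B"}}
store, marked = {"file-1": b"payload"}, set()
print(finish_on_source(meta, "dev-A", store, marked, in_use=True))
print(finish_on_source(meta, "dev-A", store, marked, in_use=False))
print(sorted(meta["layout"]))
```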
在第二方面的又一种可能的实施方式中,所述将所述第一文件的数据从所述第一存储设备迁移到所述第二存储设备,包括:
向第二计算设备推送所述第一文件的数据。
例如,第一计算设备和第二计算设备已经建立过连接或者确认对端设备为可信设备,则第一计算设备可以直接向第二计算设备推送第一文件的数据,而不用经过中间设备,提升数据迁移的效率。
再如,第一存储设备(或第一计算设备)和第二存储设备(或第二计算设备)是位于同一数据中心的设备,则第一计算设备可以向第二计算设备推送第一文件的数据。
在第二方面的又一种可能的实施方式中,所述方法还包括:
接收来自所述第二计算设备的针对所述第一文件的拉取请求。
在上述实施方式中,第二计算设备发送了拉取请求,说明第二计算设备和/或第二存储设备应当处于可用状态,这样可以避免将第一文件的数据迁移到不可用的设备上,提升数据的安全性和可用性。
在第二方面的又一种可能的实施方式中,所述方法还包括:
提供所述第一存储设备的本地文件视图,所述本地文件视图指示存储在所述第一存储设备上的多个文件的层次结构,所述多个文件的存储布局信息指示所述第一存储设备。
上述实施方式中,第一计算设备可以基于文件的存储布局信息对外提供本地视图,使得 用户或应用可以方便地获取到存储在第一存储设备上的文件的数据,满足了对文件的可视化需求,提升了用户体验。
在第二方面的又一种可能的实施方式中,所述方法还包括:
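providing an ownership file view of the first storage device, where the ownership file view contains information about multiple files whose ownership information indicates the first storage device.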
上述实施方式中,第一计算设备可以提供归属文件视图,以呈现归属于第一存储设备的文件的信息,满足了对归属于第一存储设备的文件的可视化需求,提升了用户体验。
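In yet another possible implementation of the second aspect, the files owned by the first storage device and the files owned by the second storage device belong to a global file system. The method further includes:
providing a global file view, where the global file view contains information about the files owned by the first storage device and information about the files owned by the second storage device.
In this implementation, the global file view merges information about files across devices into a single file view, so that data on different devices is no longer isolated. The underlying federation of the global file system is transparent to upper-layer applications, which use the global file system as easily as a single conventional file system, greatly improving the convenience of using and managing data.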
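The three views (local, ownership, and global) can all be derived from the same metadata by filtering on different attributes. The sketch below assumes a dictionary that maps file paths to hypothetical `owner` and `layout` fields; the paths and device identifiers are invented for the example.

```python
def local_view(files: dict, device: str) -> list:
    """Files whose storage layout lists the device."""
    return sorted(name for name, meta in files.items() if device in meta["layout"])


def ownership_view(files: dict, device: str) -> list:
    """Files whose ownership information names the device."""
    return sorted(name for name, meta in files.items() if meta["owner"] == device)


def global_view(files: dict) -> list:
    """All files of the global file system, regardless of device."""
    return sorted(files)


files = {
    "/a/report.txt": {"owner": "dev-A", "layout": {"dev-A"}},
    "/a/video.mp4": {"owner": "dev-B", "layout": {"dev-A"}},   # migration pending
    "/b/log.bin": {"owner": "dev-B", "layout": {"dev-B"}},
}
print(local_view(files, "dev-A"))
print(ownership_view(files, "dev-B"))
print(global_view(files))
```

A file in mid-migration appears in the local view of one device and the ownership view of another, which is exactly the state that triggers the migration.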
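In a third aspect, an embodiment of this application provides a data migration method applied to a second computing device, where the second computing device is located in a second storage device or is connected to the second storage device. The method includes:
obtaining the metadata of a first file, where the metadata of the first file contains ownership information of the first file and storage layout information of the first file;
when the ownership information of the first file indicates that the storage device that owns the first file is the second storage device, and the storage layout information of the first file indicates that the storage devices storing the first file do not include the second storage device, pulling the data of the first file to the second storage device from a device that stores the data of the first file.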
本申请实施例通过变更文件的归属信息来触发针对文件的迁移操作,当文件的归属信息指示第二存储设备、而文件的存储布局信息包含指示存储所述第一文件的存储设备不包含所述第二存储设备且包含所述第一存储设备时,则第二计算设备将第一文件的数据从所述第一存储设备拉取到第二存储设备。本申请实施例实现了基于文件状态(文件的归属、文件的存储布局等属性状态)来触发数据迁移,使得第二计算设备在获取到文件的归属信息产生变更后即可拉取归属于第二存储设备的文件的数据,提高了数据迁移的效率,提升在数据使用和管理上的便捷性。尤其对于包含多个存储设备或多个数据中心的业务,基于状态的数据迁移过程,可以进一步解耦各个设备的功能,极大地提高业务系统的灵活性和可扩展性。
另外,使用该方法进行迁移时,文件的存储位置仍然通过存储布局信息来指示,因此上述的数据迁移方法可以不影响用户对于文件的数据的正常使用,提升了业务系统的稳定性。
在第三方面的一种可能的实施方式中,所述方法还包括:
对所述第一文件的元数据执行第一变更,所述第一变更指示在所述第一文件的存储布局信息指示的存储设备中增加所述第二存储设备。
在第三方面的又一种可能的实施方式中,所述第一文件的归属信息为所述第二存储设备的标识,在所述第一变更之前,所述第一文件的存储布局信息不包含所述第二存储设备的标 识;
所述对所述第一文件的元数据执行第一变更,包括:
在所述第一文件的存储布局信息中增加所述第二存储设备的标识。
在第三方面的又一种可能的实施方式中,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含第一存储设备,在所述从存储所述第一文件的数据的设备拉取所述第一文件的数据到所述第二存储设备之前,所述方法还包括:
发送针对所述第一文件的拉取请求,所述拉取请求用于指示第一计算设备推送所述第一文件;所述第一计算设备位于第一存储设备中或与所述第一存储设备相连。
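In yet another possible implementation of the third aspect, the storage layout information of the first file indicates that the storage devices storing the first file include a shared storage area, and pulling the data of the first file to the second storage device from a device that stores the data of the first file includes:
pulling the data of the first file from the shared storage area to the second storage device.
In yet another possible implementation of the third aspect, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device, and pulling the data of the first file to the second storage device from a device that stores the data of the first file includes:
pulling the data of the first file from the first storage device to the second storage device.
In yet another possible implementation of the third aspect, pulling the data of the first file to the second storage device from a device that stores the data of the first file includes:
receiving the data of the first file pushed by the first computing device.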
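The destination-side decision can be sketched in the same style as the source side. The names `choose_pull_source` and `record_pull` and the `shared-area` identifier are hypothetical. Preferring the shared storage area when it appears in the layout, and otherwise picking the lexicographically smallest listed device, are assumptions made so that the example is deterministic.

```python
from typing import Optional


def choose_pull_source(meta: dict, local_id: str,
                       shared_id: str = "shared-area") -> Optional[str]:
    """Return where to pull from, or None if no pull is needed."""
    owner, layout = meta["owner"], set(meta["layout"])
    if owner != local_id or local_id in layout:
        return None                 # not ours, or the data is already here
    if shared_id in layout:
        return shared_id            # staged copy is available
    return min(layout) if layout else None


def record_pull(meta: dict, local_id: str) -> None:
    """After the pull, add this device to the storage layout."""
    meta["layout"] = set(meta["layout"]) | {local_id}


meta = {"owner": "dev-B", "layout": {"dev-A", "shared-area"}}
print(choose_pull_source(meta, "dev-B"))
record_pull(meta, "dev-B")
print(choose_pull_source(meta, "dev-B"))
```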
在第三方面的又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据存储在全局元数据服务中;
所述获取第一文件的元数据,包括:
从所述全局元数据服务获取所述第一文件的当前的元数据。
在第三方面的又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息。所述对所述第一文件的元数据执行第一变更,包括:
在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的存储布局信息,所述第一文件的存储布局信息包含所述第二存储设备的标识。
在第三方面的又一种可能的实施方式中,所述方法还包括:
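providing a local file view of the second storage device, where the local file view indicates the hierarchy of multiple files stored on the second storage device, and the storage layout information of the multiple files indicates the second storage device.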
在第三方面的又一种可能的实施方式中,所述方法还包括:
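providing an ownership file view of the second storage device, where the ownership file view contains information about multiple files whose ownership information indicates the second storage device.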
在第三方面的又一种可能的实施方式中,归属于第一存储设备的文件和归属于第二存储设备的文件属于全局文件系统。所述方法还包括:
提供全局文件视图,所述全局文件视图包含归属于第一存储设备的文件的信息和归属于第二存储设备的文件的信息。
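In a fourth aspect, an embodiment of this application provides a data migration method applied to a first computing device, where the first computing device is located in a first storage device or is connected to the first storage device, and the data of a first file is stored on the first storage device. The method includes: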
接收来自第二计算设备的针对所述第一文件的拉取请求,所述第二计算设备与第二存储设备相连;
向共享存储区推送所述第一文件的数据;
对所述第一文件的元数据执行第一变更,以触发所述第二计算设备从所述共享存储区获取所述第一文件的数据并存储到所述第二存储设备中,所述第一变更指示在所述第一文件的存储布局信息指示的存储设备中增加所述共享存储区;在所述第一变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备和共享存储区,且不包含所述第二存储设备。
在第四方面的又一种可能的实施方式中,所述第一文件的归属信息为所述第二存储设备的标识,在所述第一变更之前,所述第一文件的存储布局信息包含所述第一存储设备的标识且不包含所述第二存储设备的标识;
所述对所述第一文件的元数据进行第一变更,包括:
在所述第一文件的存储布局信息中增加所述共享存储区的标识。
在第四方面的又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据在多个设备之间同步,所述多个设备包含所述第一计算设备。
在第四方面的又一种可能的实施方式中,在对所述第一文件的元数据执行第一变更后,所述方法还包括:
发送第一通知,所述第一通知指示所述第一文件的元数据发生了变更。
在第四方面的又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息;
所述对所述第一文件的元数据执行第一变更,包含:
在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的存储布局信息,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备和所述共享存储区。
在第四方面的又一种可能的实施方式中,所述方法还包括:
当获取到所述第一文件的第二变更后的元数据时,删除所述第一存储设备上的所述第一文件的数据;所述第二变更指示所述第一文件的存储布局信息的变化,在所述第二变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备且包含所述第二存储设备;
对所述第一文件的元数据执行第三变更,所述第三变更指示在所述第一文件的存储布局信息指示的存储设备中删除所述第一存储设备;在所述第三变更后,所述第一文件的存储布 局信息指示存储所述第一文件的存储设备不包含所述第一存储设备。
在第四方面的又一种可能的实施方式中,在所述删除所述第一存储设备上的所述第一文件之前,所述方法还包括:
将所述第一文件的数据标记为可删除,以使得在所述第一文件的数据处于可删除状态时执行删除所述第一文件的操作。
在第四方面的又一种可能的实施方式中,所述方法还包括:
提供所述第一存储设备的本地文件视图,所述本地文件视图指示存储在所述第一存储设备上的多个文件的层次结构,所述多个文件的存储布局信息指示所述第一存储设备。
在第四方面的又一种可能的实施方式中,所述方法还包括:
提供所述第一存储设备的归属文件视图,所述归属本地文件视图包含多个文件的信息,该多个文件的归属信息指示所述第一存储设备。
在第四方面的又一种可能的实施方式中,归属于第一存储设备的文件和归属于第二存储设备的文件属于全局文件系统。所述方法还包括:
提供全局文件视图,所述全局文件视图包含归属于第一存储设备的文件的信息和归属于第二存储设备的文件的信息。
第五方面,本申请实施例提供一种迁移调度装置,所述迁移调度装置包含任务确定模块和元数据更新模块,所述迁移调度装置用于第一方面任一项所述的方法。
在第五方面的一种可能的实施方式中,所述任务确定模块,用于确定针对第一文件的迁移任务,所述第一文件的数据存储在第一存储设备上,所述针对第一文件的迁移任务指示将所述第一文件的数据从所述第一存储设备迁移到第二存储设备;
所述元数据更新模块,用于对所述第一文件的元数据进行第一变更,以触发执行所述针对第一文件的迁移任务;其中,所述第一变更指示所述第一文件归属的存储设备从所述第一存储设备变更为所述第二存储设备。
在第五方面的又一种可能的实施方式中,第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息。进行第一变更以前,所述第一文件的归属信息指示的存储设备为第一存储设备,所述第一文件的存储布局信息指示存储第一文件的存储设备包含第一存储设备且不包含第二存储设备。可选的,第一文件的归属信息为第一存储设备的标识,第一文件的存储布局信息包含第一存储设备的标识且不包含第二存储设备的标识。
在第五方面的又一种可能的实施方式中,所述元数据更新模块,用于将所述第一文件的归属信息从所述第一存储设备的标识变更为所述第二存储设备的标识。
在第五方面的又一种可能的实施方式中,针对第一文件的迁移任务包括所述第一文件的标识、所述第一存储设备的标识和所述第二存储设备的标识。
在第五方面的又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据在多个设备之间同步。上述多个设备包含第一计算设备和第二计算设备,第一计算设备位于第一存储设备中或与第一存储设备相连,第二计算设备位于第二存储设备中或与第二存储设备相连。进一步的,上述多个设备还可以包含迁移调度装置。
在第五方面的又一种可能的实施方式中,所述迁移调度装置还包含通信模块,所述通信 模块,用于发送第一通知,所述第一通知指示所述第一文件的元数据发生了变更,以使得所述第一计算设备或所述第二计算设备根据所述第一通知获取所述第一文件的所述第一变更后的元数据,并根据所述第一文件的所述第一变更后的元数据执行针对第一文件的迁移任务。
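Optionally, the first notification indicates which changes were made to the metadata of the first file; for example, the first notification may contain the content of the first change and/or the metadata of the first file after the first change. Alternatively, the first notification indicates that the first change occurred but contains neither the specific content of the change nor the metadata of the first file after the first change.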
在第五方面的又一种可能的实施方式中,所述迁移调度装置还包含任务监控模块,所述任务监控模块,用于:
获取所述第一文件的第二变更后的元数据,所述第二变更由所述第一计算设备或所述第二计算设备执行,所述第二变更指示所述第一文件的存储布局信息的变化;
根据所述第一文件的所述第二变更后的元数据,确定所述第一文件的迁移进度。
在第五方面的又一种可能的实施方式中,目标文件系统的元数据在多个设备的本地进行存储。
在第五方面的又一种可能的实施方式中,所述目标文件系统的元数据存储在全局元数据服务。其中,全局元数据服务能够存储目标文件系统的元数据。进一步的,全局元数据服务能够支持对目标文件系统的元数据的访问和更新。
在第五方面的又一种可能的实施方式中,迁移调度装置还包含通信模块,所述通信模块,用于:
接收第二通知,所述第二通知指示所述第一文件的元数据发生了变更。
在第五方面的又一种可能的实施方式中,所述任务监控模块,还用于:
根据所述第二通知获取所述第一文件的第二变更后的元数据。可选的,第二通知包含第二变更的内容,或者,第二通知包含所述第一文件的第二变更后的元数据。
在第五方面的又一种可能的实施方式中,通知(例如第一通知、或第二通知等)可以通过消息队列的形式发送。由发送方将消息写入消息队列,接收方通过读取消息队列来接收通知,从而进一步减少不同功能模块之间的耦合度。
在第五方面的又一种可能的实施方式中,所述通信模块,还用于:向所述第一计算设备或所述第二计算设备发送用于获取所述第一文件的变更后的元数据的请求;
所述任务监控模块,还用于:根据所述第一计算设备或所述第二计算设备对所述请求的响应获取所述第一文件的第二变更后的元数据。
在第五方面的又一种可能的实施方式中,所述任务监控模块,还用于从所述全局元数据服务获取所述第一文件的所述第二变更后的元数据。
在第五方面的又一种可能的实施方式中,全局元数据服务提供服务接口,所述迁移调度装置可以调用服务接口来实现对元数据的访问和更新。
其中,服务接口是一种通信接口,例如应用程序接口(application programming interface,API),能够用于不同的功能模块之间数据交互并提供服务。
在第五方面的又一种可能的实施方式中,所述元数据更新模块,还用于:
通过所述全局元数据服务提供的服务接口来实现所述第一变更。
在第五方面的又一种可能的实施方式中,所述全局元数据服务位于所述多个设备中的任 意一个设备上,或者位于所述多个设备之外的任意一个设备上。
示例性的,所述全局元数据服务位于第三计算设备。全局元数据服务的服务接口可以是由第三计算设备向迁移调度装置提供的。或者,第三计算设备可以向迁移调度装置提供另一个接口(便于区分称为第一接口),通过调用该第一接口可以实现调用全局元数据服务的服务接口的功能。
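In yet another possible implementation of the fifth aspect, the metadata of the target file system has a table structure and can be modified. A table structure is a data structure containing rows and columns; each row (or each column) contains multiple values, and each value corresponds to one field.
With table-structured metadata, metadata can be added, deleted, or modified in place. That is, the first change can be implemented by modifying the metadata of the target file system.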
在第五方面的又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息。
在第五方面的又一种可能的实施方式中,所述元数据更新模块,还用于:
在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的变更后的归属信息,所述第一文件的变更后的归属信息指示所述第一文件归属的存储设备为所述第二存储设备。
在第五方面的又一种可能的实施方式中,所述任务确定模块,还用于:
根据外部事件信息,确定所述针对第一文件的迁移任务;
所述外部事件信息包含以下信息中的一项或者多项:网络连接情况、设备健康情况或所述第一文件相关的人员调动状况。
在第五方面的又一种可能的实施方式中,所述任务确定模块,还用于:
根据所述第一文件的元数据的分析结果,确定所述针对第一文件的迁移任务;其中,所述分析结果包含以下一项或者多项信息:所述第一文件的冷热状态、所述第一文件的安全性或所述第一文件相关的业务。
在第五方面的又一种可能的实施方式中,所述任务确定模块,还用于:
根据用户输入的针对所述第一文件的迁移指示,确定所述针对第一文件的迁移任务。
在第五方面的又一种可能的实施方式中,所述任务确定模块,还用于确定针对第二文件的迁移任务;
所述迁移调度装置还包含,任务编排模块,所述任务编排模块用于编排所述针对第一文件的迁移任务和所述针对第二文件的迁移任务的执行顺序。
作为一种可能的设计,编排任务可以包含确定任务的执行顺序、执行优先级等。
作为又一种可能的设计,编排任务的过程中可以合并多个任务。
第六方面,本申请实施例提供一种计算装置,所述计算装置包含元数据获取模块和迁移模块,所述计算装置用于实现第二方面任一项所描述的方法。
可选的,所述计算装置位于第一存储设备内或者第一存储设备相连。
在第六方面的一种可能的实施方式中,所述元数据获取模块,用于获取第一文件的元数据,所述第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息;
所述迁移模块,用于在确定所述第一文件的归属信息指示的存储设备为第二存储设备,且第一文件的存储布局信息指示存储所述第一文件的存储设备不包含所述第二存储设备且包含所述第一存储设备时,将所述第一文件的数据从所述第一存储设备迁移到所述第二存储设备。
在第六方面的又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据在多个设备之间同步,所述多个设备包含所述计算装置或者所述计算装置所在的第一计算设备。
在第六方面的又一种可能的实施方式中,所述目标文件系统的元数据存储在全局元数据服务中且通过所述全局元数据服务在所述多个设备之间同步;
所述元数据获取模块,还用于:
从所述全局元数据服务获取所述第一文件的当前的元数据。
在第六方面的又一种可能的实施方式中,所述迁移模块,还用于:
向共享存储区推送所述第一文件的数据;所述共享存储区与所述第一计算设备和第二计算设备相连,所述第二计算设备位于所述第二存储设备中或与所述第二存储设备相连;
所述计算装置还包含元数据更新模块,所述元数据更新模块,用于对所述第一文件的元数据执行第一变更,以触发所述第二计算设备从所述共享存储区获取所述第一文件的数据并存储到所述第二存储设备中,所述第一变更指示在所述第一文件的存储布局信息指示的存储设备中增加所述共享存储区;在所述第一变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备和共享存储区,且不包含所述第二存储设备。
在第六方面的又一种可能的实施方式中,所述第一文件的归属信息为所述第二存储设备的标识,在所述第一变更之前,所述第一文件的存储布局信息包含所述第一存储设备的标识且不包含所述第二存储设备的标识;
所述元数据更新模块,还用于:
在所述第一文件的存储布局信息中增加所述共享存储区的标识。
在第六方面的又一种可能的实施方式中,所述计算装置还包含通信模块,所述通信模块,用于:
接收第一通知,所述第一通知指示所述第一文件的元数据发生了变更。
在第六方面的又一种可能的实施方式中,所述计算装置还包含通信模块,所述通信模块,用于:
sending a second notification, where the second notification indicates that the metadata of the first file has changed.
在第六方面的又一种可能的实施方式中,所述元数据获取模块,还用于:
同步目标文件系统的元数据,所述目标文件系统的元数据包含所述第一文件的元数据。
在第六方面的又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息;
所述元数据更新模块,还用于:在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的存储布局信息,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备和所述共享存储区。
在第六方面的又一种可能的实施方式中,所述计算装置还包含删除控制模块,所述删除控制模块,用于:
当获取到所述第一文件的第二变更后的元数据时,删除所述第一存储设备上的所述第一文件的数据;所述第二变更指示所述第一文件的存储布局信息的变化,在所述第二变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备且包含所述第二存储设备;
所述元数据更新模块,还用于:对所述第一文件的元数据执行第三变更,所述第三变更指示在所述第一文件的存储布局信息指示的存储设备中删除所述第一存储设备;在所述第三变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备不包含所述第一存储设备。
在第六方面的一种可能的实施方式中,所述删除控制模块,还用于:
将所述第一文件的数据标记为可删除,以使得在所述第一文件的数据处于可删除状态时执行删除所述第一文件的操作。
在第六方面的一种可能的实施方式中,所述迁移模块,还用于:
向第二计算设备推送所述第一文件的数据。
在第六方面的一种可能的实施方式中,所述计算装置还包含通信模块,所述通信模块,用于:
接收来自所述第二计算设备的针对所述第一文件的拉取请求。
在第六方面的又一种可能的实施方式中,所述计算装置还包含视图提供模块,视图提供模块用于提供所述第一存储设备的归属文件视图,所述归属本地文件视图包含多个文件的信息,该多个文件的归属信息指示所述第一存储设备。
在第六方面的又一种可能的实施方式中,归属于第一存储设备的文件和归属于第二存储设备的文件属于全局文件系统。所述计算装置还包含视图提供模块,视图提供模块用于提供全局文件视图,所述全局文件视图包含归属于第一存储设备的文件的信息和归属于第二存储设备的文件的信息。
第七方面,本申请实施例提供一种计算装置,所述计算装置包含元数据获取模块和迁移模块,所述计算装置用于实现第三方面任一项所描述的方法。
可选的,所述计算装置位于第二存储设备内或者第二存储设备相连。
在第七方面的又一种可能的实施方式中,所述元数据获取模块,用于获取第一文件的元数据,所述第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息;
所述迁移模块,用于在所述第一文件的归属信息指示所述第一文件归属的存储设备为所述第二存储设备,且第一文件的存储布局信息指示存储所述第一文件的存储设备不包含所述第二存储设备时,从存储所述第一文件的数据的设备拉取所述第一文件的数据到所述第二存储设备。
在第七方面的又一种可能的实施方式中,所述计算装置还包含元数据更新模块,所述元数据更新模块,还用于:
对所述第一文件的元数据执行第一变更,所述第一变更指示在所述第一文件的存储布局信息指示的存储设备中增加所述第二存储设备。
在第七方面的又一种可能的实施方式中,所述第一文件的归属信息为所述第二存储设备的标识,在所述第一变更之前,所述第一文件的存储布局信息不包含所述第二存储设备的标识;
所述元数据更新模块,还用于:
在所述第一文件的存储布局信息中增加所述第二存储设备的标识。
在第七方面的又一种可能的实施方式中,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含第一存储设备。所述计算装置还包含通信模块,所述通信模块用于:
发送针对所述第一文件的拉取请求,所述拉取请求用于指示第一计算设备推送所述第一文件;所述第一计算设备位于第一存储设备中或与所述第一存储设备相连。
在第七方面的又一种可能的实施方式中,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含共享存储区。所述迁移模块,用于:
从所述共享存储区拉取所述第一文件的数据到所述第二存储设备。
在第七方面的又一种可能的实施方式中,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含第一存储设备。所述迁移模块,还用于:
从第一存储设备拉取所述第一文件的数据到所述第二存储设备。
在第七方面的又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据存储在全局元数据服务中;
所述元数据获取模块,还用于:
从所述全局元数据服务获取所述第一文件的当前的元数据。
在第七方面的又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息。所述对所述第一文件的元数据执行第一变更,包括:
在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的存储布局信息,所述第一文件的存储布局信息包含所述第二存储设备的标识。
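In yet another possible implementation of the seventh aspect, the computing apparatus further includes a view providing module, which is configured to provide a local file view of the second storage device, where the local file view indicates the hierarchy of multiple files stored on the second storage device, and the storage layout information of the multiple files indicates the second storage device.
In yet another possible implementation of the seventh aspect, the computing apparatus further includes a view providing module, which is configured to provide an ownership file view of the second storage device, where the ownership file view contains information about multiple files whose ownership information indicates the second storage device.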
在第七方面的又一种可能的实施方式中,归属于第一存储设备的文件和归属于第二存储设备的文件以联邦构成全局文件系统。所述计算装置还包含视图提供模块,视图提供模块用于提供全局文件视图,所述全局文件视图包含归属于第一存储设备的文件的信息和归属于第二存储设备的文件的信息。
第八方面,本申请实施例提供一种计算装置,所述计算装置包含通信模块、迁移模块和 元数据更新模块,所述计算装置用于实现第四方面任一项所描述的方法。
可选的,所述计算装置位于第一存储设备内或者第一存储设备相连。
在第八方面的一种可能的实施方式中,所述通信模块,用于接收来自第二计算设备的针对所述第一文件的拉取请求,所述第二计算设备与第二存储设备相连;
所述迁移模块,用于向共享存储区推送所述第一文件的数据;
所述元数据更新模块,用于对所述第一文件的元数据执行第一变更,以触发所述第二计算设备从所述共享存储区获取所述第一文件的数据并存储到所述第二存储设备中,所述第一变更指示在所述第一文件的存储布局信息指示的存储设备中增加所述共享存储区;在所述第一变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备和共享存储区,且不包含所述第二存储设备。
在第八方面的又一种可能的实施方式中,所述第一文件的归属信息为所述第二存储设备的标识,在所述第一变更之前,所述第一文件的存储布局信息包含所述第一存储设备的标识且不包含所述第二存储设备的标识;
所述元数据更新模块,还用于:
在所述第一文件的存储布局信息中增加所述共享存储区的标识。
在第八方面的又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据在多个设备之间同步,所述多个设备包含所述第一计算设备。
在第八方面的又一种可能的实施方式中,所述通信模块,还用于:
发送第一通知,所述第一通知指示所述第一文件的元数据发生了变更。
在第八方面的又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息;
所述元数据更新模块,用于:
在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的存储布局信息,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备和所述共享存储区。
在第八方面的又一种可能的实施方式中,所述计算装置包含删除控制模块,所述删除控制模块,用于当获取到所述第一文件的第二变更后的元数据时,删除所述第一存储设备上的所述第一文件的数据;所述第二变更指示所述第一文件的存储布局信息的变化,在所述第二变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备且包含所述第二存储设备;
所述元数据更新模块,用于对所述第一文件的元数据执行第三变更,所述第三变更指示在所述第一文件的存储布局信息指示的存储设备中删除所述第一存储设备;在所述第三变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备不包含所述第一存储设备。
在第八方面的又一种可能的实施方式中,所述删除控制模块,还用于:
将所述第一文件的数据标记为可删除,以使得在所述第一文件的数据处于可删除状态时执行删除所述第一文件的操作。
第九方面,本申请实施例提供一种数据迁移系统,该数据迁移系统包含第一计算设备和第二计算设备,所述第一计算设备位于第一存储设备内或者所述第一计算设备与第一存储设备相连,所述第二计算设备位于第二存储设备内或者所述第二计算设备与第二存储设备相连。
其中,所述第一存储设备用于实现第二方面任一项或第四方面任一项所描述的方法,所述第二存储设备用于实现第三方面任一项所描述的方法。
或者,所述第一存储设备包含第六方面任一项或第八方面任一项所描述的计算装置,所述第二存储设备包含第七方面任一项所描述的计算装置。
在第九方面的一种可能的实施方式中,所述数据迁移系统还包含迁移调度装置,所述迁移调度装置用于实现第一方面任一项所描述的方法。
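In yet another possible implementation of the ninth aspect, the data migration system further includes a migration scheduling apparatus, and the migration scheduling apparatus is the migration scheduling apparatus described in any implementation of the fifth aspect.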
第十方面,本申请实施例提供一种数据迁移系统,该数据迁移系统包含第一计算设备和第二计算设备,所述第一计算设备位于第一存储设备内或者所述第一计算设备与第一存储设备相连,所述第二计算设备位于第二存储设备内或者所述第二计算设备与第二存储设备相连;
第一计算设备,用于:
获取第一文件的元数据,所述第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息;
在确定所述第一文件的归属信息指示的存储设备为第二存储设备,且第一文件的存储布局信息指示存储所述第一文件的存储设备不包含所述第二存储设备且包含所述第一存储设备时,向共享存储区推送所述第一文件的数据;所述共享存储区与所述第一计算设备和第二计算设备相连,所述第二计算设备位于所述第二存储设备中或与所述第二存储设备相连;
第二计算设备,用于:
从所述共享存储区拉取所述第一文件的数据。
在第十方面的又一种可能的实施方式中,所述数据迁移系统还包含迁移调度装置,所述迁移调度装置用于实现第一方面任一项所描述的方法,或者,所述迁移调度装置为第五方面任一项的迁移调度装置。
第十方面的可能的实施方式可以参考前述第一至第九方面的可能实施方式。
第十一方面,本申请实施例提供一种数据迁移系统,该数据迁移系统包含第一计算设备和第二计算设备,所述第一计算设备位于第一存储设备内或者所述第一计算设备与第一存储设备相连,所述第二计算设备位于第二存储设备内或者所述第二计算设备与第二存储设备相连;
第二计算设备,用于:
获取第一文件的元数据,所述第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息;
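when the ownership information of the first file indicates that the storage device that owns the first file is the second storage device, and the storage layout information of the first file indicates that the storage devices storing the first file do not include the second storage device, sending a pull request to the first computing device, where the pull request instructs the first computing device to push the data of the first file;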
第一计算设备,用于:
接收来自第二计算设备的针对所述第一文件的拉取请求;
向共享存储区推送所述第一文件的数据;
所述第二计算设备,还用于:
从所述共享存储区拉取所述第一文件的数据。
在第十一方面的又一种可能的实施方式中,所述数据迁移系统还包含迁移调度装置,所述迁移调度装置用于实现第一方面任一项所描述的方法,或者,所述迁移调度装置为第五方面任一项的迁移调度装置。
第十一方面的可能的实施方式可以参考前述第一至第九方面的可能实施方式。
第十二方面,本申请实施例提供一种数据迁移系统,该数据迁移系统包含第一计算设备和第二计算设备,所述第一计算设备位于第一存储设备内或者所述第一计算设备与第一存储设备相连,所述第二计算设备位于第二存储设备内或者所述第二计算设备与第二存储设备相连;
第一计算设备,用于:
获取第一文件的元数据,所述第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息;
在所述第一文件的归属信息指示所述第一文件归属的存储设备为所述第二存储设备,且第一文件的存储布局信息指示存储所述第一文件的存储设备不包含所述第二存储设备时,向第二计算设备推送所述第一文件的数据;
第二计算设备,用于:
接收所述第一计算设备推送的所述第一文件的数据。
在第十二方面的一种可能的实施方式中,所述数据迁移系统还包含迁移调度装置,所述迁移调度装置用于实现第一方面任一项所描述的方法,或者,所述迁移调度装置为第五方面任一项的迁移调度装置。
第十二方面的可能的实施方式可以参考前述第一至第九方面的可能实施方式。
第十三方面,本申请实施例提供一种数据迁移系统,该数据迁移系统包含第一计算设备和第二计算设备,所述第一计算设备位于第一存储设备内或者所述第一计算设备与第一存储设备相连,所述第二计算设备位于第二存储设备内或者所述第二计算设备与第二存储设备相连;
第二计算设备,用于:
获取第一文件的元数据,所述第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息;
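when the ownership information of the first file indicates that the storage device that owns the first file is the second storage device, and the storage layout information of the first file indicates that the storage devices storing the first file do not include the second storage device, sending a pull request to the first computing device, where the pull request instructs the first computing device to push the data of the first file;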
第一计算设备,用于:
接收来自第二计算设备的针对所述第一文件的拉取请求;
向所述第二计算设备推送所述第一文件的数据;
所述第二计算设备,还用于:
receiving the data of the first file pushed by the first computing device.
第十三方面的可能的实施方式可以参考前述第一至第九方面的可能实施方式。
第十四方面,本申请实施例提供一种计算设备,该计算设备包括处理器和存储器;所述处理器执行存储器中存储的指令,以使得所述计算设备实现前述第一方面任一项所描述的方法。
可选的,所述计算设备还包括通信接口,所述通信接口用于接收和/或发送数据,和/或,所述通信接口用于为所述处理器提供输入和/或输出。
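It should be noted that the above embodiment is described using the example of a processor (also called a general-purpose processor) that performs the method by invoking computer instructions. In a specific implementation, the processor may instead be a dedicated processor, in which case the computer instructions are preloaded in the processor. Optionally, the processor may include both a dedicated processor and a general-purpose processor.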
可选的,处理器和存储器还可能集成于一个器件中,即处理器和存储器还可以被集成在一起。
第十五方面,本申请实施例还提供一种计算设备集群,该计算设备集群包含至少一个计算设备,每个计算设备包括处理器和存储器;
所述至少一个计算设备的处理器用于执行所述至少一个计算设备的存储器中存储的指令,以使得所述计算设备集群执行第一方面任一项所述的方法。
第十六方面,本申请实施例提供一种计算设备,该计算设备包括处理器和存储器;所述存储器用于存储计算机指令,所述处理器用于执行所述存储器存储的计算机指令,以使得所述计算设备实现第二方面任一项所描述的方法,或者实现第四方面任一项所描述的方法。
第十七方面,本申请实施例提供一种存储设备,该存储设备包含计算设备和与计算设备相连的存储盘。其中,相连可以是通过有线线路相连,也可以通过无线线路相连。例如,二者通过总线相连。再如,二者通过交换机相连。其中,计算设备可以为第十六方面所描述的计算设备。
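In an eighteenth aspect, an embodiment of this application provides a computing device that includes a processor and a memory. The memory is configured to store computer instructions, and the processor is configured to execute the computer instructions stored in the memory, so that the computing device implements the method described in any implementation of the third aspect.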
第十九方面,本申请实施例提供一种存储设备,该存储设备包含计算设备和与计算设备相连的存储盘。其中,相连可以是通过有线线路相连,也可以通过无线线路相连。例如,二者通过总线相连。再如,二者通过交换机相连。其中,计算设备可以为第十八方面所描述的计算设备。
第二十方面,本申请实施例提供一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,当所述指令在至少一个处理器上运行时,实现前述第一方面任一项所描述的方法;又或者实现前述第二方面任一项所描述的方法;又或者实现前述第三方面任一项所描述的方法;又或者实现前述第四方面任一项所描述的方法。
第二十一方面,本申请提供了一种计算机程序产品,计算机程序产品包括计算机指令, 当所述指令在至少一个处理器上运行时,实现前述第一方面任一项所描述的方法;又或者实现前述第二方面任一项所描述的方法;又或者实现前述第三方面任一项所描述的方法;又或者实现前述第四方面任一项所描述的方法。
可选的,该计算机程序产品可以为一个软件安装包或镜像包,在需要使用前述方法的情况下,可以下载该计算机程序产品并在计算设备上执行该计算机程序产品。
本申请第三至第二十一方面所提供的技术方案,其有益效果可以参考第一方面和/或第二方面的技术方案的有益效果,此处不再赘述。
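Brief Description of the Drawings
The accompanying drawings used in the description of the embodiments are briefly introduced below.
Figure 1 is a schematic architecture diagram of a data migration system provided by an embodiment of this application;
Figure 2 is a schematic architecture diagram of a storage device provided by an embodiment of this application;
Figure 3 is a schematic architecture diagram of another data migration system provided by an embodiment of this application;
Figure 4 is a schematic diagram of an operating scenario of a data migration system provided by an embodiment of this application;
Figure 5 is a view of a global file system provided by an embodiment of this application;
Figure 6 is a schematic diagram of a metadata stream provided by an embodiment of this application;
Figure 7 is a schematic diagram of a change to a metadata table provided by an embodiment of this application;
Figure 8 is a schematic flowchart of a data migration method provided by an embodiment of this application;
Figure 9 is a schematic diagram of a change to the metadata of a first file provided by an embodiment of this application;
Figure 10 is a schematic flowchart of another data migration method provided by an embodiment of this application;
Figure 11 is a schematic diagram of another change to the metadata of a first file provided by an embodiment of this application;
Figure 12 is a schematic flowchart of yet another data migration method provided by an embodiment of this application;
Figure 13 is a schematic structural diagram of a migration scheduling apparatus provided by an embodiment of this application;
Figure 14 is a schematic structural diagram of a computing apparatus provided by an embodiment of this application;
Figure 15 is a schematic structural diagram of a computing device provided by an embodiment of this application.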
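Detailed Description
The embodiments of this application are described in detail below with reference to the accompanying drawings.
For ease of understanding, some concepts related to the embodiments of this application are explained below for reference:
1. File system: a file system is the method and data structure used to organize files on a storage disk (for example, a magnetic disk, a solid-state drive, or a partition), that is, the way files are organized on the storage disk. Its main purpose is to let users read and write files conveniently: for example, when a user provides the file system with the identifier of a file (such as the file's name or path), the file system can access the data of the corresponding file.
2. File, data, metadata: a file, or computer file, is a collection of information. A file contains data and metadata. The data is the content of the file; the metadata is information describing the file, such as the file's name, size, and type.
For example, the data of a file is generally unstructured data, such as documents, pictures, video, and audio that have no fixed structure.
3. Ownership: the metadata of a file may contain ownership information, which specifies the device that owns the file, for example, the storage device to which the file belongs. The owning device manages the data of the file, which includes but is not limited to one or more of the following: maintaining the latest complete data of the file, publishing data changes when the file's data changes, or serving the data (for example, returning data to an application that requests it).
It should be noted that the metadata of a directory may also contain ownership information, which specifies the device that owns the directory.
4. Message queue
A message queue is a data structure that can be understood as a list containing one or more messages. Messages are stored in the message queue until they are processed and deleted, and a message sender can interact with a message receiver through a message queue service. It should be understood that, for ease of description, this application refers to any data structure containing multiple messages as a message queue; this is not intended to require that the message queue be implemented as a queue. For example, in a specific implementation the message queue may also be implemented as a list, a heap, a linked list, or a stack.
5. Data migration, data tiering
Data migration is the process of moving data from one device (the source device) to another device (the destination device).
Data tiering, also called hierarchical storage management (HSM), is the process of moving data from one device (the source device) to another device (the destination device) and deleting the data on the source device. In data tiering, the source device and the destination device usually belong to different tiers (that is, they have different storage capabilities); for example, they differ in cost or in data access speed.
In general, data migration is not concerned with how the source device subsequently handles the data stored on it, whereas data tiering requires the data stored on the source device to be deleted to free storage space.
In some scenarios, data migration and data tiering are interchangeable. For ease of understanding, this application uses the term data migration throughout.
The exemplary explanations of terms above apply to the embodiments below. It should be noted that the file system referred to below is a system that provides storage and access services for data. In some scenarios, a system with similar characteristics may not be called a file system. For convenience, this application describes the migration of data in a file system as an example; the description applies equally to other similar systems.
For example, some object systems store, retrieve, and access data in object format. Data stored as objects also has corresponding metadata, so the embodiments of this application apply equally to object systems.
The embodiments of this application provide a data migration method and apparatus. The method triggers migration of a file's data based on changes to the file's metadata (for example, changes to metadata such as the file's ownership or storage layout), and the migration progress of the file's data can also be reflected by the state of the file's metadata. The migration scheduling apparatus does not need to establish access security control with the source device and the destination device for the purpose of migration, which simplifies the security control procedure of data migration. The source device and the destination device indicate the operations they have completed to each other by updating the file's ownership information and storage layout information, which improves data migration efficiency and makes data easier to use and manage. In short, the embodiments of this application improve data migration efficiency and decouple the devices involved in migration, greatly improving the flexibility and scalability of the service system.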
下面对本申请实施例的系统架构进行示例性地描述。
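It should be noted that the system architecture described in this application is intended to explain the technical solutions of this application more clearly and does not limit them. A person of ordinary skill in the art will understand that, as system architectures evolve and new service scenarios emerge, the technical solutions provided in this application apply equally to similar technical problems.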
请参见图1,图1是本申请实施例提供的一种数据迁移系统的架构示意图,该数据迁移系统10包含存储设备101和存储设备102。
存储设备101能够提供存储空间且具有存储数据的能力。存储设备101包含计算设备1011和存储盘1012,计算设备1011和存储盘1012相连。其中,计算设备1011具有计算能力;存储盘1012用于提供存储空间,存储盘1012中可以存储文件的数据。存储盘1012包含但不限于是硬盘、随机存取存储器或只读存储器(Read Only Memory,ROM)等,或者,存储盘还可以是虚拟的,例如为虚拟存储池等。
可选的,计算设备1011能够完成以下一种或者多种功能:获取文件的元数据、控制存储盘1012中的数据的读出和写入、或对元数据进行变更等。
类似的,存储设备102包含计算设备1021和存储盘1022,计算设备1021和存储盘1022相连接。其中,计算设备1021具有计算能力;存储盘1022用于提供存储空间。关于计算设备1021和存储盘1022的介绍请参考前述对计算设备1011和存储盘1012的介绍。
文件的数据经常需要进行跨设备的迁移,例如,将存储在存储设备101中的文件的数据迁移到存储设备102中。
本申请实施例可以基于文件的状态来控制数据迁移。具体的,文件的元数据包含文件的归属信息和存储布局信息,归属信息指示文件的归属设备,例如指示该文件归属的存储设备;文件的存储布局信息指示存储该文件的设备。在图1所示的系统中,文件的归属设备为存储设备101,而且,由于文件的数据存储在存储设备101上,文件的存储布局信息指示存储设备101。这种情况下,通过变更文件的归属信息,可以触发文件的数据的迁移。例如,文件的归属信息由指示存储设备101变更为指示存储设备102时,触发存储设备101和/或存储设备102将文件的数据从存储设备101迁移到存储设备102。
这样一来,只要设备具备变更文件的元数据的能力,即可触发数据迁移操作,无需为了迁移而建立与源设备(数据迁出的设备)和目的设备(数据迁入的设备)之间的数据访问的安全控制,简化了数据迁移的安全控制流程,提高了数据迁移的效率,提升用户在数据使用和管理上的便捷性。尤其对于包含多个存储设备或多个数据中心的业务,基于状态的数据迁移可以进一步解耦各个设备的功能,极大地提高业务系统的灵活性和可扩展性。
作为一种可能的实施方式,在文件的数据迁移的过程中,存储设备101可以将文件的数据推送至指定设备,以使得存储设备102从指定设备获取文件,以实现数据迁移。其中,指定设备可以为自定义的某一设备,能够为存储设备101和存储设备102提供数据存储服务(也称为共享存储区)。可选的,数据存储服务可以由全局数据服务提供,也可以由第三方临时存储设备、中间设备提供。
应理解,在图1所示的数据迁移系统中,存储设备中的计算设备(例如计算设备1011、计算设备1021)可以通过软件实现和/或通过硬件实现。
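As an example of a computing device implemented in hardware, the computing device may be a controller, a processor, a server, or the like. Controllers include but are not limited to storage controllers (for example, memory controllers, hard disk controllers, integrated drive electronics controllers, and disk array controllers), combinational logic controllers, and hard-wired controllers. Processors include but are not limited to central processing units, graphics processing units, artificial intelligence processors, microprocessors, and field-programmable gate arrays. In addition, in some scenarios a controller also has computing capability and/or can execute instructions, so a controller may also be regarded as a processor. Servers include but are not limited to general-purpose computers, storage servers, cloud servers, and blade servers. When the functions of the computing device are implemented by servers, there may be one server or multiple servers (for example, a server cluster).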
作为一种可能的方案,计算设备所实现的功能可以通过软件功能单元来实现。示例性地,计算模块可以为虚拟机、容器、云端等。其中,虚拟机是通过软件模拟的具有完整硬件系统功能的、运行在隔离环境中的计算机系统。容器是将应用和应用依赖包进行打包得到的隔离环境。云端是采用应用程序虚拟化技术的软件平台,能够让一个或者多个软件、应用在独立的虚拟化环境中开发、运行。可选的,云端可以部署在公有云、私有云、或者混合云上等。
作为软件功能单元的一种举例,计算设备可以包括运行在计算实例上的代码。其中,计算实例可以包括物理主机(计算设备)、虚拟机、容器中的至少一种。
可选的,存储设备中的计算设备和存储盘可以集成的。作为计算设备和存储盘集成设置的一个举例,存储设备为盘控一体的存储系统,存储设备包含控制器(数量可以为一个或者多个),控制器与存储盘(例如硬盘)通过总线连接。控制器可以用于处理来自存储设备外部(服务器或者其他存储系统)的数据访问请求,也用于处理存储设备内部生成的请求。示例性的,控制器接收应用服务器发送的写数据请求时,可以通过将写数据请求中携带的数据发送给存储盘进行存储。在这种情况下,可选地,计算设备可以是存储设备中的控制器。
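Alternatively, the computing device and the storage disk may be provided separately. FIG. 2 is a schematic architectural diagram of a possible storage device according to an embodiment of this application, in which the computing device 1021 of the storage device 102 is a standalone device and the storage disk 1022 is located outside the computing device 1021.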
当然,无论二者是独立设置还是集成于同一设备中,计算设备和存储盘之间是相互连接的,二者的连接方式可以是总线或网络等。网络例如为有线网络、无线网络、有线网络和无线网络的组合等,示例性的,二者可以通过网线连接,或者,通过交换机连接。
在上文中,文件的归属信息的变更,可以由存储设备101执行,或者由存储设备102执行,或者由其他设备来执行。
作为一种可能的方案,数据迁移系统还包含迁移调度装置,迁移调度装置用于确定迁移任务并基于迁移任务变更文件的元数据。
如图3所示是本申请实施例提供的又一种数据迁移系统的架构示意图,该数据迁移系统30包含存储设备101、存储设备102和迁移调度装置301。
其中,迁移调度装置301可以变更文件的元数据,且元数据的变更可以被存储设备101和/或存储设备102获取,从而触发文件的数据的迁移。一些场景中,迁移调度装置也可以称为数据调度引擎。
作为一种可能的实施方式,迁移调度装置301可以基于输入信息来确定针对文件的迁移任务,并基于迁移任务来变更元数据。示例性地,输入信息可以为外部事件信息、用户输入的指示、元数据分析结果等中的一项或者多项。
可选的,迁移调度装置301中可以包含迁移策略模块,迁移策略模块用于实现分级迁移策略,这些迁移策略可以是由预先定义的算法(例如AI模块)定义的、或预先设置的规则定义的。进一步的,迁移策略模块还可以基于输入信息和分级迁移策略,确定迁移任务。
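The migration scheduling apparatus 301 may determine multiple tasks. These tasks may target the same file or different files. Whether tasks conflict with one another and the order in which they are executed both affect the success rate and efficiency of task execution. In a possible solution, the migration scheduling apparatus 301 may include a task orchestration module, which is configured to orchestrate the multiple tasks. Further, orchestrating tasks may include determining the execution order and execution priority of the tasks, for example, which file to migrate first. In addition, the migration scheduling apparatus may merge multiple tasks while orchestrating them.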
可选的,迁移调度装置301可以包含迁移任务管理模块。迁移任务管理模块用于跟踪任务的执行进度,便于获取任务的执行情况。另外,对于进展缓慢或者失败的任务等,可以尽快进行处理,提升系统稳定性。
作为一种可能的实施方式,数据迁移系统30还包含元数据分析装置302。一些场景中,元数据分析装置也称为元数据分析引擎。应理解,迁移调度装置301也可以通过软件实现和/或通过硬件实现;元数据分析装置302也可以通过软件实现和/或通过硬件实现。可选的,迁移调度装置301和元数据分析装置302可以独立设置,也可以集成在同一设备中。
元数据分析装置302用于对文件的元数据进行分析,向迁移调度装置301提供分析结果。迁移调度装置基于分析结果,确定针对文件的迁移任务。示例性地,分析结果包含以下一项或者多项信息:文件的冷热状态、文件的安全性或文件相关的业务等。
以下以根据文件的冷热状态进行数据迁移为例,对数据迁移系统的一种可能的运行场景进行说明。
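FIG. 4 is a schematic diagram of an operating scenario of a data migration system according to an embodiment of this application. The data scheduling engine includes one or more of a migration policy module, a task orchestration module, and a migration task management module; for related descriptions, refer to the foregoing. The data scheduling engine may determine a migration task for a file based on one or more of external event information, user input, and a metadata analysis result.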
其中,元数据分析引擎可以包含数据冷热画像模块,数据冷热画像模块用于确定文件系统中的文件的数据的冷热程度。示例性的,数据的冷热程度包含热数据、温数据和冷数据三个等级,其中热数据为访问频次较多的数据,温数据次之,冷数据再次之。
文件的元数据可以包含于全局文件系统的元数据中。全局文件系统是指将多个存储设备上的文件(或文件系统)联合得到的文件系统,也称为联合文件系统。
以图4为例,存储设备S1上的文件系统、存储设备S2上的文件系统和存储设备S3上的文件系统可以联合得到全局文件系统,如图5所示为一种可能的全局文件系统的视图,其中,为了方便理解将目录使用方框(为了便于区别,根目录为菱形)图案来示意,文件使用圆形图案来示意。当然,这并不旨在说明文件和目录在存储方式、呈现方式上的不同。形状中间的数字为目录和文件对应的节点编号,形状外的名称为目录的名称或文件的名称,应理解,节点的编号、文件(或目录)的排列顺序、文件(或目录)的名称等仅为示例,并不旨在限定本申请。
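As an example, storage device S1 is optimized for high access speed, storage device S3 is optimized for large capacity, and storage device S2 is mid-range in both capacity and access speed. It follows that storage device S1 is suited to storing hot data, which facilitates high-speed file access; storage device S2 is suited to storing warm data; and storage device S3 is suited to storing cold data.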
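In this case, the metadata analysis engine can provide the data scheduling engine with the hot/cold level of the data. Correspondingly, the data scheduling engine determines migration tasks for files according to the hot/cold level of the data and modifies the metadata of the files based on the migration tasks. After a storage device obtains the modified metadata of a file, it migrates the data based on the metadata change: hot data is migrated to storage device S1, warm data to storage device S2, and cold data to storage device S3, so that both data access performance and storage cost are optimized.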
例如,当文件名为“001.png”的文件(当前存储在存储设备S1)的数据变为温数据,则数据调度引擎将文件名为“001.png”文件的归属设备变更为存储设备S2,使得该文件的数据被迁移到存储设备S2上存储。
再如,当文件名为“002.png”的文件(当前存储在存储设备S1)的数据变为冷数据,则数据调度引擎将文件名为“002.png”文件的归属设备变更为存储设备S3,使得该文件的数据被迁移到存储设备S3上存储。
上文对架构的介绍中提到了文件系统的元数据,下面对文件系统的元数据的格式进行介绍。
文件的元数据包含于文件系统的元数据。由于文件的属性会被变更,因此,文件系统的元数据需要支持元数据的动态变化。
作为一种可能的实施方式,文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录中包含文件的标识和文件的属性,文件的属性例如文件的归属信息、文件的存储布局信息、文件的创建时间等中的一项或者多项。
其中,流式结构是包含多条信息的一种数据结构,每一条信息为一条元数据记录。流式结构具有以下特征:只读、只增、有序,其中“只读”是指流式结构中的记录的值只能读取而无法修改;“只增”指示流式结构中只能追加新的记录而无法删除(或修改)已有的记录,但属于同一个文件(或目录)的多条记录可以被合并成一条记录;“有序”是指流式结构中的记录具有逻辑顺序,追加的记录在流式结构的尾部增加。
在流式结构的元数据(以下称为元数据流)中,当文件系统中的文件的元数据变更时,一条元数据记录被追加到元数据流的末端。其他设备通过读取元数据流末端新追加的元数据记录,则可以获取文件的属性的变更。
如图6所示为本申请实施例提供的一种元数据流的示意图,如元数据记录601所示,文件名为“001.png”的文件,其归属属性的值包含存储设备S1的标识(即S1);存储布局信息为{S1:1,S2:0},指示存储设备S1上存储了文件的数据而存储设备S2上未存储文件的数据。即该文件归属存储设备S1,且文件的数据存储于存储设备S1。
当变更“001.png”的归属设备时,元数据记录602被追加到元数据流的末端,在元数据记录602中,文件的归属信息已经被变更为存储设备S2的标识(即S2)。文件的归属信息已经变更,但文件的数据仍然存储在存储设备S1中,因此,需要将文件的数据从存储设备S1迁移到存储设备S2中。
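Correspondingly, when storage device S1 and/or storage device S2 reads the newly appended record from the metadata stream, it performs the migration operation in response to the change of the file's owning device.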
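The append-only behaviour of the metadata stream described above can be sketched as follows. This is a minimal illustration only; the class names, field names, and the `read_from` cursor are assumptions introduced here and are not defined in the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MetadataRecord:
    """One immutable record in the stream (field names are illustrative)."""
    file_id: str
    owner: str               # ownership information, e.g. "S1"
    layout: Dict[str, int]   # storage layout, e.g. {"S1": 1, "S2": 0}


@dataclass
class MetadataStream:
    """Ordered, append-only list of records; existing records are never edited."""
    records: List[MetadataRecord] = field(default_factory=list)

    def append(self, record: MetadataRecord) -> int:
        self.records.append(record)
        return len(self.records) - 1  # offset of the newly appended record

    def read_from(self, offset: int) -> List[MetadataRecord]:
        """Return the records appended at or after the given offset."""
        return self.records[offset:]

    def latest(self, file_id: str) -> Optional[MetadataRecord]:
        """Return the most recent record for a file, scanning from the tail."""
        for record in reversed(self.records):
            if record.file_id == file_id:
                return record
        return None


stream = MetadataStream()
stream.append(MetadataRecord("001.png", "S1", {"S1": 1, "S2": 0}))  # like record 601
cursor = len(stream.records)  # a reader remembers how far it has read
stream.append(MetadataRecord("001.png", "S2", {"S1": 1, "S2": 0}))  # like record 602
new_records = stream.read_from(cursor)
```

A device that keeps a cursor only has to read the tail of the stream to notice that the owner of "001.png" changed from S1 to S2 while the layout still points at S1.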
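In yet another possible implementation, the metadata of the file system has a table structure. Metadata with a table structure (hereinafter referred to as a metadata table) contains rows and columns; each row (or each column) contains the values of multiple attributes, and each value corresponds to one attribute. A row (or column) of metadata can be added to or deleted from the metadata table, and existing attribute values can also be modified.
In the metadata table, when the metadata of a file in the file system changes, the value of an existing field in the metadata table is modified. Other devices learn of the change to the file's attributes by obtaining the modified metadata values.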
如图7所示为本申请实施例提供的一种元数据表的变更示意图,如列701所示,文件名为“001.png”的文件,其归属包含存储设备S1的标识(即S1);存储布局信息为{S1:1,S2:0},指示存储设备S1上存储了文件的数据而存储设备S2上未存储文件的数据。即该文件归属存储设备S1,且文件的数据存储于存储设备S1。
当变更“001.png”的归属设备时,列701中的属性被修改,得到列702。可以看出,文件的归属信息已经被修改为存储设备S2的标识(即S2),但文件的数据仍然存储在存储设备S1中,因此,需要将文件的数据从存储设备S1迁移到存储设备S2中。
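Correspondingly, after storage device S1 and/or storage device S2 determines from the metadata table that the ownership information of the file has changed, it performs the migration operation in response to the change of the ownership information.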
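For comparison with the stream form, the table form modifies a field in place. The sketch below assumes a dictionary keyed by file name; the helper names `set_owner` and `needs_migration` are hypothetical and only illustrate the idea.

```python
# A metadata table modelled as a dict of rows; the field names are assumptions.
metadata_table = {
    "001.png": {"owner": "S1", "layout": {"S1": 1, "S2": 0}},
}


def set_owner(table, file_id, new_owner):
    """Overwrite the ownership field in place and return (old_owner, new_owner)."""
    row = table[file_id]
    old_owner = row["owner"]
    row["owner"] = new_owner
    return old_owner, new_owner


def needs_migration(row):
    """True when the owning device does not yet store the file's data."""
    return row["layout"].get(row["owner"], 0) == 0


before = needs_migration(metadata_table["001.png"])   # owner S1 stores the data
change = set_owner(metadata_table, "001.png", "S2")    # like column 701 -> 702
after = needs_migration(metadata_table["001.png"])    # owner S2 has no data yet
```

Unlike the stream, the table keeps no history, so a device has to compare the current row with what it last saw (or be notified) to detect the change.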
需要说明的是,图6、图7所示的元数据中,目录的归属信息未示出。在实际使用过程中,目录也可以具有归属信息,用于指示目录的归属设备。作为一种可能的设计,目录的归属可以为指定设备,即:由指定设备来统一维护文件系统中目录。
应理解,上述的文件系统的元数据的格式仅为示例,具体实施过程中,文件系统的元数据可以为其他格式。
上面说明了本申请实施例的架构和文件系统的元数据格式,下面对本申请实施例的方法进行详细介绍。
请参见图8,图8是本申请实施例提供的一种数据迁移方法的流程示意图。可选的,该方法可以应用于前述的数据迁移系统,例如图1、图3、或图5所示的数据迁移系统。
如图8所示的数据迁移方法可以包括步骤S801至步骤S806中的一个或者多个步骤。应理解,本申请为了方便描述,故通过S801至S806这一顺序进行描述,并不旨在限定一定通过上述顺序进行执行。本申请实施例对于上述一个或多个步骤的执行的先后顺序、执行的时间、执行的次数等不做限定。步骤S801至步骤S806具体如下:
步骤S801:迁移调度装置确定针对第一文件的迁移任务。
其中,迁移任务指示将所述第一文件从第一存储设备迁移到第二存储设备。应理解,第一文件可能为一个文件,也可能为多个文件。例如,迁移任务指示将某一目录下的多个文件的数据迁移到第二存储设备。
作为一种可能的实现,迁移任务包括第一文件的标识、第一存储设备的标识和第二存储设备的标识。其中,第一文件的标识用于指示该迁移任务所针对的文件或文件所属的目录,第一存储设备的标识为迁移的源设备的标识,第二存储设备的标识为迁移的目标设备的标识。
可选的,迁移调度装置可以基于输入信息确定针对文件的迁移任务。下面示例性的列举几种基于输入信息来确定迁移任务的设计:
设计1,迁移调度装置根据外部事件信息,确定针对第一文件的迁移任务。
外部事件是指在发生在迁移调度装置和/或业务系统之外的事件。外部事件信息包含但不限于是网络连接情况、设备健康情况或人员调动情况等中一项或者多项。
网络连接情况可以理解为在线状态,用于描述设备是否能够被其他设备感知。示例性的,当某个线路通信中断,预测A地的通信受影响时(或A地的设备可能不能被感知,或通信的速率受到影响),此时可以将A地的数据迁移到B地。
设备健康情况可以描述设备当前的存储能力,或者描述设备当前的故障状态。示例性的,设备的存取速度可能会随着使用时长和使用次数而下降,当某个存储设备的存取速度达到预置下限、或者使用时长达到预设时长、或者使用次数到达预设次数,则将该存储设备上的数据迁移到其他存储设备上。再如,当某一存储设备出现故障时,将该存储设备上的数据迁移到其他存储设备上。
第一文件相关的人员调动状况包含数据的拥有者或者数据的管理者的所在地变化。示例性的,当某个业务的研发团队人员出差异地,可以触发迁移,将该业务的数据迁移到距离出差目的地更近的存储设备。
具体实施过程中,迁移调度装置可以通过预设数据迁移策略来根据外部事件确定迁移任务。数据迁移策略可以为算法、预设的规则或用户设置的条件等。
在这种设计中,外部事件信息满足触发数据迁移的条件时,则会确定对应的迁移任务,实现综合多信息流的智能迁移,提升用户的使用体验。
可选的,触发数据迁移的条件可以由迁移策略来定义。迁移策略可以通过算法或者规则来实现。
设计2,迁移调度装置根据元数据分析结果确定针对第一文件的迁移任务。元数据分析结果是对第一文件的元数据(或目标文件系统的元数据)进行分析得到的结果,元数据分析结果包含但不限于第一文件的冷热状态、第一文件的安全性或第一文件相关的业务等中的一项或者多项。
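As an example, the hot/cold state of a file may be indicated by how frequently the file is accessed. For example, the metadata of the first file contains an attribute representing the number of accesses within a period of time. If the number of accesses is greater than (or greater than or equal to) a first threshold, the data of the first file is migrated to a device with high access speed (for example, the second storage device), which improves the efficiency of accessing the data of the first file and the quality of service of the system. Similarly, if the number of accesses to the first file is less than (or less than or equal to) a second threshold, the data of the file is migrated to a high-capacity device to reduce storage cost. Optionally, the first threshold and the second threshold may be entered by an administrator (for example, a developer or a management department) or a vendor, or may be preset.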
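The threshold rule in this example can be sketched as a small function. The tier labels, the function name, and the choice of inclusive comparisons are assumptions made for illustration; the application allows either strict or inclusive comparison.

```python
def choose_target_tier(access_count, hot_threshold, cold_threshold):
    """Pick a target tier from an access count (inclusive thresholds assumed).

    Returns "fast" for hot data, "capacity" for cold data, and None when the
    file should stay where it is.
    """
    if cold_threshold >= hot_threshold:
        raise ValueError("cold_threshold must be below hot_threshold")
    if access_count >= hot_threshold:
        return "fast"       # e.g. a device with high access speed
    if access_count <= cold_threshold:
        return "capacity"   # e.g. a device with large capacity
    return None


hot = choose_target_tier(120, hot_threshold=100, cold_threshold=10)
cold = choose_target_tier(5, hot_threshold=100, cold_threshold=10)
unchanged = choose_target_tier(50, hot_threshold=100, cold_threshold=10)
```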
作为一种示例,第一文件的元数据中可以包含表示文件的安全等级的属性。例如,若第一文件的安全等级为高级,而第一存储设备安全等级不满足第一文件的安全等级需求,则将第一文件的数据迁移到能够满足第一文件的安全等级的需求的设备上,有效保障用户对文件的安全性需求,提升系统的服务质量。
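As an example, the metadata of the first file contains an attribute representing the service to which the first file relates. For example, the related service of the first file is an in-vehicle service, a video service, or a file download service. If the first file stores data of an in-vehicle service, then when the data of the in-vehicle service needs to be migrated to the second storage device, the data of the first file is migrated accordingly. In this way, users can migrate files on a per-service basis, which makes managing service data more convenient and improves the quality of service of the system.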
在这种设计中,元数据分析结果能够指示对文件的迁移需求(例如访问需求、安全需求、业务需求等),基于迁移需求来确定迁移任务,能够实现整体的存储优化。另外,用户通过更新文件的元数据就可以表达对文件的迁移需求,实现了对数据的智能化管理,提升用户在数据使用和管理上的便捷性。
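Design 3: the migration scheduling apparatus determines the migration task according to a migration instruction for the first file entered by a user. For example, when the user enters instruction information indicating that the data of the first file is to be migrated to the second storage device, the migration scheduling apparatus may determine the migration task for the first file in response to the instruction information.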
这种设计中,用户可以通过输入迁移指示来实现针对某一文件的迁移,这样可以满足用户的个性化需求,提升用户体验。
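The three designs above are merely examples; in a specific implementation, the input information may also include other information. The three designs may also be combined where they are not mutually exclusive; the combinations are not described again here.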
一些场景中,针对第一文件的迁移任务是迁移调度装置所确定的多个任务中的其中一部分。作为一种可能的方案,迁移调度装置可以确定多个任务,并对多个任务进行编排,以使得多个任务可以合理、有序地执行。
示例性的,迁移调度装置可以确定针对第二文件的迁移任务,并编排针对第一文件的迁移任务和针对第二文件的迁移任务。
作为一种可能的实施方式,编排任务可以包含以下的一个或者多个操作:确定任务的执行顺序、确定执行优先级、或合并多个任务等。
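For example, the migration scheduling apparatus may determine, according to the urgency of the requirements, that the migration tasks of some files are executed first. For example, a file whose access frequency surges within a short time may be migrated first, so that its access rate is raised as soon as possible and the user experience is improved.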
示例性的,A任务指示将第一文件从第一存储设备迁移到第二存储设备,而B任务指示将第一文件从第一存储设备迁移到第三存储设备,则A任务和B任务可以被合并得到新的任务,新的任务指示将第一文件从第一存储设备迁移到第三存储设备。这样一来可以减少任务执行出错的概率,二来可以减少任务执行的算力消耗,有效提高任务执行效率,提升用户体验。
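The merge of task A and task B described above can be sketched as follows. The `MigrationTask` type and the rule "keep the source of the earliest task and the target of the latest task for the same file" are assumptions chosen to reproduce that example; the application does not prescribe a particular merge rule.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MigrationTask:
    file_id: str
    source: str   # identifier of the source storage device
    target: str   # identifier of the destination storage device


def merge_tasks(tasks: List[MigrationTask]) -> List[MigrationTask]:
    """Collapse tasks for the same file into one task per file."""
    merged: Dict[str, MigrationTask] = {}
    for task in tasks:
        earlier = merged.get(task.file_id)
        source = earlier.source if earlier else task.source
        merged[task.file_id] = MigrationTask(task.file_id, source, task.target)
    return list(merged.values())


task_a = MigrationTask("file1", "S1", "S2")
task_b = MigrationTask("file1", "S1", "S3")
task_c = MigrationTask("file2", "S1", "S2")
result = merge_tasks([task_a, task_b, task_c])
```

Here tasks A and B become a single task that moves "file1" directly from S1 to S3, while the unrelated task for "file2" is left unchanged.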
步骤S802:迁移调度装置对第一文件的元数据进行第一变更。
具体的,第一变更指示所述第一文件归属的存储设备从第一存储设备变更为第二存储设备。
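In some possible scenarios, the ownership of the first file is indicated by ownership information, which contains the identifier of the file's owning device. For example, the ownership information of the file is the identifier of the first storage device, which indicates that the owning device of the file is the first storage device. In this case, the migration scheduling apparatus performs the first change on the metadata of the first file as follows: the migration scheduling apparatus changes the ownership information of the first file from the identifier of the first storage device to the identifier of the second storage device. As another example, the ownership information is indicated by the value of a field: when the field corresponding to the second storage device in the metadata of the first file takes a first value, the second storage device is the owning device of the first file. The first value may be predefined or preconfigured.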
作为又一种可能的实施方式,迁移调度装置可以通过变更目标文件系统的元数据,来变更第一文件的元数据。这里的目标文件系统是指第一文件所属的文件系统,应理解,目标文件系统是某一个或者某一组具体的文件系统。
可选的,目标文件系统的元数据为流式结构。此时,迁移调度装置可以通过在元数据流中追加元数据记录的方式,来进行第一变更。具体地,迁移调度装置在目标文件系统的元数据的末端追加第一元数据记录,第一元数据记录包括第一文件的标识和第一文件的归属信息,第一文件的归属信息包含第二存储设备的标识。
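Alternatively, the metadata of the target file system has a table structure. In this case, the migration scheduling apparatus may modify the metadata of the first file in the metadata table of the target file system, changing the ownership information of the first file to the identifier of the second storage device. In some possible designs, the metadata of the target file system is synchronized among multiple devices; in other words, multiple devices share the metadata of the target file system. Synchronization here means that the metadata of the target file system can be modified by any one of the multiple devices, the modified content can be learned by all of them, and the metadata of the target file system as known to each device is consistent. Therefore, when the migration scheduling apparatus performs the first change on the metadata of the first file, the devices that share the metadata of the target file system can obtain the metadata of the first file after the first change.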
可选的,上述目标文件系统的元数据在多个设备之间同步,有以下几种可能的实现方式:
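Implementation 1: the metadata of the target file system is stored locally on multiple devices. When a device changes the metadata of the target file system, that device notifies the other devices that store the metadata of the change, and the other devices update their locally stored metadata of the target file system accordingly, so that the metadata is synchronized across the multiple devices. It should be understood that the multiple devices here include the migration scheduling apparatus, the first storage device, and the second storage device.
For example, the migration scheduling apparatus, the first storage device, and the second storage device all store the metadata of the target file system locally. When the migration scheduling apparatus performs the first change on the metadata of the first file within the metadata of the target file system, it may send a first notification indicating that the first change has been made to the metadata of the first file. The first storage device and the second storage device update their locally stored metadata of the target file system based on the first notification, so that the metadata of the target file system is synchronized across the multiple devices.
Optionally, the first notification indicates which changes were made to the metadata of the first file. For example, the first notification may contain the content of the first change, such as the identifier of the first file and the attributes (or attribute values) of the file altered by the first change. In this case, the first storage device and the second storage device can derive the metadata of the first file after the first change from the metadata before the first change and the content of the first change, and execute the migration task according to the metadata after the first change. As another example, the first notification may contain the metadata of the first file after the first change. In this case, the first storage device and the second storage device can execute the migration task according to that metadata.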
或者可选的,第一通知指示发生了第一变更,但第一通知不包含第一变更的具体内容和/或第一文件的第一变更后的元数据。在这种情况下,第一存储设备和第二存储设备向迁移调度装置请求第一变更的具体内容和/或第一文件的第一变更后的元数据,并根据迁移调度装置提供的信息来相应变更本地存储的目标文件系统的元数据。
实现方式2:目标文件系统的元数据由指定设备存储,并由指定设备来提供对目标文件系统的元数据的访问和更新。当某一设备对目标文件系统的元数据进行了变更时,该变更被提供给指定设备,多个设备可以从指定设备处获取经过变更的目标文件系统的元数据,从而实现目标文件系统的元数据在多个设备上的同步。应理解,这里的多个设备包含第一存储设备和第二存储设备,可选包含迁移调度装置。
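Ways of triggering the multiple devices to obtain the metadata of the target file system include but are not limited to: the multiple devices actively (periodically or aperiodically) read from the designated device; the designated device notifies the multiple devices to read (for example, through a message queue); the designated device actively publishes the change; the device that changed the metadata of the target file system notifies the multiple devices to read from the designated device; or the device that changed the metadata of the target file system notifies the devices related to the change to read.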
示例性的,管理目标文件系统的元数据的设备可以在第一文件的元数据产生变更时,向第一存储设备和第二存储设备提供经过第一文件的变更后的元数据,使其了解第一文件的归属信息的变更,触发第一存储设备和第二存储设备执行迁移任务。
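For example, the source device may periodically read the metadata of the target file system, learn of the change to the ownership information of the first file, and trigger the push of the data of the first file. As another example, the destination device may read the metadata of the file system in response to a notification from the migration scheduling apparatus, learn of the change to the ownership information of the first file, and trigger the pull of the data of the first file.
As an example of synchronizing the metadata of the target file system, the migration scheduling apparatus performs the first change on the metadata of the target file system, and the first change is provided to the designated device. After the designated device updates the metadata of the target file system based on the first change, the first storage device and the second storage device can read the metadata of the target file system after the first change from the designated device.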
可选的,前述的指定设备可以为全局元数据服务,或者,可以为第一存储设备、第二存储设备、或迁移调度装置等。
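In a possible solution, the metadata of the target file system is stored in a global metadata service. Managing the metadata of the file system through the global metadata service means that the multiple devices all read and write metadata in the format used by the global metadata service. This unifies the representation of file metadata and hides the differences in metadata management and access control between heterogeneous storage devices, which not only makes data easier to use and manage but also improves the scalability and flexibility of the system.
It should be understood that, after the metadata of the target file system is synchronized among the multiple devices in any of the ways above, any one of the multiple devices can dynamically obtain the changed metadata of the target file system (that is, the current metadata of the target file system).
Further, the global metadata service may provide a service interface, which devices can call to access and update metadata. A service interface is a communication interface, for example an application programming interface (API), that can be used for data exchange between different functional modules and for providing services. An abstract service interface decouples the caller from the implementer: for example, a device calling the service interface supplies the relevant data as the interface requires, while the global metadata service obtains that data through the interface and implements the corresponding function. This not only improves the efficiency of accessing and updating metadata but also improves the scalability and flexibility of the system.
In a possible implementation, the migration scheduling apparatus calls the service interface provided by the global metadata service to perform the first change on the metadata of the first file. Optionally, the service interface may be provided by the global metadata service to the migration scheduling apparatus, in which case the migration scheduling apparatus can call the service interface directly. Alternatively, the service interface may be provided by the global metadata service to the devices that share the metadata stream of the target file system, in which case the migration scheduling apparatus can call the service interface on any of those devices. For example, if the global metadata service provides the service interface to the first storage device, the migration scheduling apparatus can call the service interface on the first storage device to perform the first change.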
作为一种可能的实施方式,文件的归属信息可以作为文件的一个基本属性包含于文件的元数据中。此时,迁移调度装置可以通过执行自定义的指令或者预先设置修改属性的指令来变更文件的归属元数据。
作为又一种可能的实施方式,文件的归属信息可以作为文件的扩展属性包含于文件的元数据中的扩展属性字段中,例如xattr字段、tags字段等。此时,迁移调度装置可以通过执行自定义的指令来变更文件的元数据中的扩展属性字段。或者,以文件的归属信息包含于xattr字段为例,迁移调度装置可以通过预先设置的修改xattr的指令来变更文件的归属信息。
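As mentioned above, the migration scheduling apparatus may send a notification (hereinafter referred to as the first notification for ease of distinction) to indicate that the first change has been made to the metadata of the first file. Optionally, the recipient of the first notification may be the device that manages the metadata of the target file system, all devices that share the metadata of the target file system, or the devices involved in the migration task (the first storage device and/or the second storage device).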
作为一种发送第一通知的举例,迁移调度装置可以向全局元数据服务发送第一通知,由全局元数据服务发布第一变更。例如,全局元数据服务可以通知源设备(例如第一存储设备)和目的设备(例如第二存储设备)读取第一文件的第一变更后的元数据,或者,向源设备和目的设备发送第一文件的第一变更后的元数据。相应的,在这种示例中,源设备和目的设备可以接收来自全局元数据服务的通知,或者,接收全局元数据服务发送的第一文件的第一变更后的元数据。
作为又一种发送第一通知的举例,迁移调度装置可以通知源设备(例如第一存储设备)和目的设备(例如第二存储设备)读取第一文件的第一变更后的元数据,或者,向源设备和目的设备发送第一文件的第一变更后的元数据。相应的,在这种示例中,源设备和目的设备可以接收来自迁移调度装置的通知,或者,接收迁移调度装置发送的第一文件的第一变更后的元数据。
应理解,迁移调度装置发送通知的方法可以是直接发送方式,也可以是间接发送方式。在直接发送方式中,发送方向接收方发送消息。当然,消息可以被复制多份,向多个接收方分别发送消息。间接发送方式有多种可能实现,例如通过消息队列实现、通过中间设备转发的方式实现等。以消息队列实现消息间接发送为例,消息队列中的消息可以被一个或者多个设备读取;发送方向消息队列中写入消息,而接收方(接收方数量可以为一个或者多个)可以从消息队列中读取消息,从而实现消息的收发。
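The indirect sending mode through a message queue can be sketched as below. The `NotificationQueue` class, its per-receiver cursor, and the message fields are assumptions introduced for illustration; a real deployment would more likely use an existing message queue service.

```python
class NotificationQueue:
    """A queue whose messages can be read by several receivers independently."""

    def __init__(self):
        self._messages = []
        self._cursors = {}  # receiver name -> index of the next unread message

    def publish(self, message):
        """Called by the sender, e.g. the migration scheduling apparatus."""
        self._messages.append(message)

    def poll(self, receiver):
        """Return the messages this receiver has not read yet."""
        start = self._cursors.get(receiver, 0)
        self._cursors[receiver] = len(self._messages)
        return self._messages[start:]


queue = NotificationQueue()
first_notification = {"file_id": "001.png", "changed": {"owner": "S2"}}
queue.publish(first_notification)

seen_by_source = queue.poll("S1")        # source device reads the notification
seen_by_target = queue.poll("S2")        # destination device reads the same message
seen_again_by_source = queue.poll("S1")  # nothing new on the second poll
```

Because the sender only writes to the queue, it does not need to know how many receivers exist or whether they are currently online.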
可选的,图8所示的实施例包含步骤S803,具体如下:
步骤S803:第一存储设备获取第一文件的元数据。
第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息。其中,第一文件的归属信息指示的第一文件的归属设备,第一文件的存储布局信息指示存储第一文件的设备。进一步的,在第一文件的数据以多个数据分段的形式来存储的情况下,存储布局信息还用于指示存储所述第一文件的数据分段的设备。
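Optionally, the ownership information contains the identifier of the file's owning device. For example, if the ownership information of the first file is the identifier of the first storage device, the owning device of the file is the first storage device.
Optionally, the storage layout information contains the identifiers of the devices that store the file's data. For example, if the storage layout information of the first file contains the identifier of the first storage device, the data of the file is stored on the first storage device.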
或者可选的,文件的归属信息和/或文件的存储布局信息也可以通过字段的值的方式来体现。示例性的,存储布局信息中包含多个字段,多个字段分别对应多个存储设备,当某一个存储设备对应的字段的取值为第一值,则表示该存储设备上存储了第一文件的数据。如,“设备S1:1;设备S2:0”中,字段“设备S1”的取值为1表示设备S1上存储了第一文件的数据,字段“设备S2”的取值为0表示设备S2上没有存储第一文件的数据。
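In some scenarios, the data of a file may be stored on multiple storage devices in the form of data segments, and the storage layout information further indicates which data segments each storage device stores. For example, the storage layout information may contain a bitmap of the file to indicate the storage layout on multiple devices. When the data of a file contains 8 data segments, the data segments stored on a given storage device can be indicated by an 8-bit bitmap; for example, the storage layout information may contain: "device S1: 0b1010 0000; device S3: 0b1111 1111". In the value of the field "device S1", the first bit being 1 indicates that storage device S1 stores the first data segment, the second bit being 0 indicates that storage device S1 does not store the second data segment, and so on for the remaining bits. Similarly, in the value of the field "device S3", the first bit being 1 indicates that storage device S3 stores the first data segment, the second bit being 1 indicates that storage device S3 stores the second data segment, and so on for the remaining bits.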
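The bitmap layout above can be decoded with a few bit operations. The sketch below assumes that the first data segment corresponds to the leftmost (most significant) bit of an 8-bit value, as in the example; the function names are hypothetical.

```python
from typing import List


def stored_segments(bitmap: int, segment_count: int = 8) -> List[int]:
    """Return the 1-based indices of stored segments; bit 1 is the leftmost bit."""
    return [
        index
        for index in range(1, segment_count + 1)
        if (bitmap >> (segment_count - index)) & 1
    ]


def segments_to_pull(source_bitmap: int, target_bitmap: int,
                     segment_count: int = 8) -> List[int]:
    """Segments present on the source that the target does not store yet."""
    have = set(stored_segments(target_bitmap, segment_count))
    return [i for i in stored_segments(source_bitmap, segment_count) if i not in have]


layout = {"S1": 0b10100000, "S3": 0b11111111}
s1_segments = stored_segments(layout["S1"])
missing_on_s1 = segments_to_pull(layout["S3"], layout["S1"])
```

With these values, S1 holds segments 1 and 3, so a device pulling towards S1 would only need segments 2, 4, 5, 6, 7, and 8 from S3.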
需要说明的是,第一存储设备获取的元数据可以是第一文件的第一变更后的元数据。第一变更可以是迁移调度装置执行的,也可能是其它设备执行的,例如第一存储设备、第二存储设备或其他存储设备等。
可选的,图8所示的实施例包含步骤S804,具体如下:
步骤S804:第二存储设备获取第一文件的元数据。
相关描述可以参考S803。
可选的,图8所示的实施例还包含步骤S805,具体如下:
步骤S805:第二存储设备发送拉取请求。
可选的,拉取请求可以是第二存储设备向第一存储设备发送,指示第一存储设备推送所述第一文件的数据。可选的,拉取请求中可以携带第一文件的归属信息和第一文件的存储布局信息。
或者可选的,也可以是通过广播或者组播的方式发送给多个设备,指示存储所述第一文件的数据的设备推送所述第一文件的数据。
可选的,拉取请求的发送方式可以是直接向接收方发送,或者可以通过间接发送的方式进行发送,例如通过消息队列发送等。详细介绍可以参见发送第一通知的方式的相关描述。
步骤S806:第一存储设备将第一文件的数据从第一存储设备迁移到第二存储设备。
作为一种可能的实施方式,第一存储设备在确定第一文件的归属信息指示的存储设备为第二存储设备,且第一文件的存储布局信息指示存储第一文件的存储设备不包含第二存储设备且包含第一存储设备时,将第一文件的数据从第一存储设备迁移到第二存储设备。
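In some possible scenarios, when the data of the first file contains multiple data segments, the condition that the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and do not include the second storage device covers the following case: the storage layout information of the first file indicates that data segments of the first file (some or all of them) are stored on the first storage device, and the second storage device does not store all of the data segments of the first file. That is, when the owning device of the first file does not hold the complete data of the file, a device that stores data segments of the first file is triggered to push those data segments to the owning device.
In a possible design, the data segments of the first file pushed by the source device include the data segments not stored on the destination device. Optionally, the content of the ownership information and the storage layout information of the first file may be determined by the first storage device from the metadata of the first file or from the pull request.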
例如,第一存储设备获取第一文件的元数据,第一文件的归属信息为第二存储设备的标识,且第一文件的存储布局信息不包含所述第二存储设备的标识且包含所述第一存储设备的标识时,第一存储设备将第一文件的数据从第一存储设备迁移到第二存储设备。
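The condition under which the first storage device starts pushing can be written as a single predicate. This is a sketch under the assumption that the layout is a mapping from device identifier to a 0/1 flag, as in the "{S1:1, S2:0}" examples; the function name is hypothetical.

```python
def should_push(local_device, owner, layout):
    """True when this device holds the data but the owning device does not.

    local_device: identifier of the device evaluating the rule (e.g. "S1")
    owner:        ownership information of the file (e.g. "S2")
    layout:       storage layout information, e.g. {"S1": 1, "S2": 0}
    """
    owned_elsewhere = owner != local_device
    stored_locally = layout.get(local_device, 0) == 1
    missing_on_owner = layout.get(owner, 0) == 0
    return owned_elsewhere and stored_locally and missing_on_owner


before_change = should_push("S1", owner="S1", layout={"S1": 1, "S2": 0})
after_change = should_push("S1", owner="S2", layout={"S1": 1, "S2": 0})
after_migration = should_push("S1", owner="S2", layout={"S1": 1, "S2": 1})
```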
作为又一种可能的实施方式,第一存储设备响应于拉取请求,将第一文件的数据从第一存储设备迁移到第二存储设备。
可选的,将第一文件的数据从第一存储设备迁移到第二存储设备,可以有以下几种情况:
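Case 1: the first storage device pushes the data of the first file to a shared storage area. The shared storage area is connected to the first storage device and the second storage device, and the second storage device can obtain the data of the first file from the shared storage area.
When the data of the first file is stored as multiple data segments, the data segments that the second storage device pulls from the shared storage area are those not yet stored on the second storage device. Specifically, before pulling a data segment of the first file, the second storage device first checks whether it already stores that data segment, and pulls the data segment from the shared storage area only if it does not.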
进一步的,第一存储设备可以告知其他设备,数据已被推送到共享存储区。告知方式可以有以下几种实现:
实现方式一:第一存储设备对第一文件的元数据进行第二变更,以触发所述第二存储设备从所述共享存储区获取所述第一文件的数据并存储到所述第二存储设备。
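The second change indicates adding the shared storage area to the storage devices indicated by the storage layout information of the first file. After the second change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the shared storage area and do not include the second storage device. The second storage device obtains the metadata of the file after the second change, pulls the data of the first file from the shared storage area, and stores it on the second storage device.
For example, the ownership information of the first file is the identifier of the second storage device, and before the second change the storage layout information of the first file contains the identifier of the first storage device and does not contain the identifier of the second storage device. The first storage device may add the identifier of the shared storage area to the storage layout information of the first file. After the second change, the storage layout information of the first file contains the identifier of the first storage device and the identifier of the shared storage area, and does not contain the identifier of the second storage device.
In this implementation, other devices can obtain the metadata of the first file after the second change as follows: the first storage device performs the second change by synchronizing the metadata of the target file system, and the other devices obtain the metadata of the first file after the second change by synchronizing the metadata of the target file system. For this process, refer to the two implementations of synchronizing the metadata of the target file system among multiple devices described in step S802; details are not repeated here.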
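Taking stream-structured metadata of the target file system as an example, the first storage device appends a second metadata record to the end of the metadata stream. The second metadata record contains the identifier of the first file and the storage layout information of the first file, and the storage layout information of the first file contains the identifier of the shared storage area (and optionally the identifier of the first storage device).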
相关说明可以参见步骤S802中,迁移调度装置执行第一变更相关描述以及发送第一通知的方式的相关描述。
实现方式二:第一存储设备向第二存储设备发送推送通知,通知其他设备,数据已被推送到共享存储区。
情况二:第一存储设备向第二存储设备推送所述第一文件的数据,相应的,第二存储设备从第一存储设备处拉取该第一文件的数据。
例如,第一存储设备和第二存储设备已经建立过连接或者确认对端设备为可信设备,则第一存储设备可以直接向第二存储设备推送第一文件的数据,而不用经过中间设备,提升数据迁移的效率。
再如,第一存储设备和第二存储设备是位于同一数据中心的设备,则第一存储设备可以向第二存储设备推送第一文件的数据。
一些可能的场景中,第一文件的数据以多个数据分段的形式存储,此时,第二存储设备从第一存储设备上拉取的数据分段为第二存储设备上未存储的数据分段。
应理解,上述两种情况仅为示例,具体实施过程中也可以有其他的设计。
In a possible implementation, when the data of a file is stored as multiple data segments, the data may also be pushed segment by segment.
可选的,第二存储设备存储第一文件的数据后,对第一文件的元数据执行第三变更。第三变更指示在所述第一文件的存储布局信息指示的存储设备中增加所述第二存储设备。可选的,当文件的数据以多个数据分段的形式存储、存储布局信息中包含文件的位图的情况下,第二存储设备在拉取一个数据分段以后,则第二存储设备可以变更第二存储设备对应的文件的位图,以指示第二存储设备上已存储该数据分段。
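Alternatively, after storing the data of the first file, the second storage device sends a notification indicating that the second storage device has obtained the data of the first file. The notification may be sent to the first storage device and/or the migration scheduling apparatus, or may be broadcast.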
在一些可能的场景中,当第二存储设备已经存储了第一文件的数据,则第一存储设备可在本地删除第一文件的数据。
示例性地,在所述第一文件的归属信息为所述第二存储设备且所述第一文件的存储布局信息包含所述第一存储设备的标识和所述第二存储设备的标识时,第一存储设备删除第一存储设备上的第一文件的数据。
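For example, when the first storage device receives a notification indicating that the second storage device has obtained the data of the first file, the first storage device deletes the data of the first file on the first storage device. This avoids multiple storage devices redundantly storing the data of the same file, frees storage space, and optimizes overall storage cost.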
作为一种可能的方案,当文件的数据以多个数据分段的形式存储时,在第二存储设备上存储了第一文件的全部数据分段后,第一存储设备在本地删除第一文件的数据分段。这样可以避免因为数据迁移过程中的错误而导致文件的数据被损坏,而使得文件的数据不再完整的问题。
进一步的,第一存储设备可以对所述第一文件的元数据执行第四变更操作,第四变更操作指示将第一存储设备的标识从第一文件的存储布局信息中删除。
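In a possible implementation, before the first storage device deletes the data of the first file, the data of the first file may first be marked as deletable, so that the deletion is performed while the data is in the deletable state. For example, when the data of the first file is in use and cannot conveniently be deleted immediately, the data can be marked first and deleted after the file is no longer in use.
As another example, before a file is deleted it is first marked as deletable; a deletable file cannot be accessed in the normal way, and the files are deleted together once a preset condition is met. The preset condition may be that the time since the file was marked as deletable reaches a preset duration, or that the amount of data marked as deletable reaches a preset size. This makes it easier for users to recover the data of the first file on the first storage device, reduces data loss caused by misoperation, and improves the user experience.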
在一些可能的场景中,迁移调度装置可以基于文件的归属信息和存储布局信息,确定任务的执行进度。具体的,触发迁移、数据迁移和删除本地数据的过程都可能对文件的元数据进行变更,通过监控文件的元数据的变更(尤其是文件的归属信息的变更和文件的存储布局信息的变更),可以确定任务的执行进度,便于用户了解任务的执行情况。另外,对于进展缓慢或者失败的任务等,可以尽快进行处理,提升系统稳定性。
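FIG. 9 is a schematic diagram of changes to the metadata of a first file according to an embodiment of this application. For example, the changes to the metadata may include the following stages: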
阶段(1),触发迁移之前的元数据。元数据901为未触发数据迁移之前第一文件的元数据,包含第一文件的标识(inode为60)、第一文件的归属信息(即归属元数据)和第一文件的存储布局信息(即布局元数据),可以看出,第一文件的归属设备为设备S1,设备S1存在第一文件的数据。可选的,元数据901还包含第一文件的父节点的inode(表示为pinode)。
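Stage (2): metadata after migration is triggered and before the data is migrated. A change can be made to metadata 901, and the metadata of the first file after the change is metadata 902. As can be seen, the owning device of the first file has been changed to device S2. Because the data of the first file does not exist on device S2, migration from device S1 to device S2 is triggered.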
阶段(3),数据迁移之后、释放空间之前的元数据。将第一文件的数据从设备S1迁移到设备S2后,第一文件的布局元数据需要对应更新。根据元数据903可以看出,设备S2上已经存在第一文件的数据。
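Stage (4): metadata after the storage space is released. To free storage space, the data of the first file stored on device S1 can be deleted, and after it is deleted the storage layout information of the first file must be updated accordingly. As metadata 904 shows, the layout metadata of the file indicates that the data of the first file does not exist on device S1.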
当然,上述图9所示的4个阶段的元数据是为了方便理解元数据的变更所列举的一种可能的情况,不作为对元数据的变更阶段和元数据格式的限定。具体实施过程中,数据迁移可以包含更多或者更少的元数据变更阶段,或者,元数据中的归属元数据和布局元数据也可以有其他设计。
一些场景中,第一存储设备和/或第二存储设备可以生成本地文件视图。其中,本地文件是指文件的归属为本设备和/或文件的存储布局信息指示本身设备的文件。
示例性的,第二存储设备提供所述第二存储设备的本地文件视图,本地文件视图指示存储在所述第二存储设备上的多个文件的层次结构。其中,该多个文件的存储布局信息指示第二存储设备,和/或,该多个文件的归属信息指示第二存储设备。如图4所示中,存储设备S1可以提供本地文件视图。
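A local file view of this kind can be derived by filtering the shared metadata. The sketch below assumes each file's metadata carries an `owner` and a `layout` list of device identifiers; the file names beyond "001.png" and "002.png", the function name, and the two switches are illustrative assumptions.

```python
def local_file_view(files, device, by_owner=True, by_layout=True):
    """List the files whose owner and/or storage layout points at the device."""
    view = []
    for name, meta in files.items():
        owned = by_owner and meta["owner"] == device
        stored = by_layout and device in meta["layout"]
        if owned or stored:
            view.append(name)
    return sorted(view)


global_files = {
    "001.png": {"owner": "S2", "layout": ["S1"]},        # migration in progress
    "002.png": {"owner": "S1", "layout": ["S1"]},
    "003.png": {"owner": "S3", "layout": ["S3"]},
}

view_s1 = local_file_view(global_files, "S1")
owned_by_s2 = local_file_view(global_files, "S2", by_layout=False)
```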
在图8所示的实施例通过变更文件的元数据来触发文件的数据迁移,当文件的归属为第二存储设备而文件的数据仍然存储在第一存储设备上时,则将第一文件从所述第一存储设备迁移到第二存储设备。迁移调度装置、源设备和目的设备能够基于对文件的元数据的变更(如,对文件的归属、存储布局等元数据的变更)来触发对该文件的迁移,而且该文件迁移的进度也能够通过该文件的元数据的状态来反映。该方法可以提高数据迁移的效率,提升在数据使用和管理上的便捷性,还实现了迁移过程中各个设备的解耦,极大地提高业务系统的灵活性和可扩展性。
另外,使用该方法进行迁移时,文件的数据的存储位置仍然通过存储布局信息来指示,因此上述的数据迁移方法可以不影响用户对于文件的数据的正常使用,提升了业务系统的稳定性。
以上图8所示的方法实施例中包含了很多可能的实现方案,下面对其中的部分实现方案进行举例说明,需要说明的是,下文中未解释到的相关概念、操作或者逻辑关系可以参照图8所示实施例中的相应描述。
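Refer to FIG. 10, which is a schematic flowchart of yet another data migration method according to an embodiment of this application. Optionally, the method may be applied to the foregoing data migration system, for example the data migration system shown in FIG. 1, FIG. 3, or FIG. 5.
The data migration method shown in FIG. 10 may include one or more of steps S1001 to S1009. It should be understood that the steps are described in the order S1001 to S1009 for ease of description, which is not intended to limit execution to that order. The embodiments of this application do not limit the order, timing, or number of executions of the one or more steps. Steps S1001 to S1009 are as follows: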
步骤S1001:迁移调度装置确定针对第一文件的迁移任务。
相关描述可以参见步骤S801的描述。
步骤S1002:迁移调度装置对第一文件的元数据执行第一变更。
第一变更指示第一文件归属的存储设备从第一存储设备变更为第二存储设备。
例如,第一变更指示将第一文件的归属信息从第一存储设备的标识变更为第二存储设备的标识。请参见图11,图11是本申请实施例提供的又一种第一文件的元数据的变更的示意图。第一文件的元数据1101经过第一变更,得到元数据1102。在元数据1102中,第一文件的归属信息包含第二存储设备的标识。
第一存储设备可以获取文件的归属信息的变更。示例性的,第一存储设备可以获取第一文件的第一变更后的元数据。例如,迁移调度装置在元数据流中追加元数据1102,相应的,第一存储设备可以获取在元数据流中追加的记录。
可选的,图10所示的数据迁移方法中还包含步骤S1003,具体如下:
步骤S1003:迁移调度装置根据第一文件的归属信息和存储布局信息,确定迁移进度。
示例性的,当第一文件的归属信息指示的存储设备为第二存储设备,但第一文件的存储布局信息指示存储所述第一文件的存储设备不包含第二存储设备且包含第一存储设备时,迁移进度为未开始。
当第一文件的归属信息指示的存储设备为第二存储设备,但第一文件的存储布局信息指示存储所述第一文件的存储设备包含第一存储设备和共享存储区但不包含第二存储设备时,迁移进度为:源设备已推送数据。
当第一文件的归属信息指示的存储设备为第二存储设备,但第一文件的存储布局信息指示存储所述第一文件的存储设备包含第一存储设备和第二存储设备时,迁移进度为:目标设备已拉取数据。
当第一文件的归属信息指示的存储设备为第二存储设备,但第一文件的存储布局信息指示存储所述第一文件的存储设备包含第二存储设备但不指示第一存储设备时,迁移进度为:已完成。
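The four progress states listed above can be derived from the ownership information and the storage layout information alone. The following sketch assumes the layout is a mapping from device identifier to a 0/1 flag and that the shared storage area is identified as "p1"; the state labels and the function name are illustrative.

```python
def migration_progress(owner, layout, source, shared_area="p1"):
    """Map (ownership, storage layout) to one of the progress states."""
    stored = {device for device, flag in layout.items() if flag}
    if owner in stored:
        # The owning (destination) device already holds the data.
        return "completed" if source not in stored else "target pulled data"
    if shared_area in stored:
        return "source pushed data"
    if source in stored:
        return "not started"
    return "unknown"


not_started = migration_progress("S2", {"S1": 1}, source="S1")
pushed = migration_progress("S2", {"S1": 1, "p1": 1}, source="S1")
pulled = migration_progress("S2", {"S1": 1, "p1": 1, "S2": 1}, source="S1")
done = migration_progress("S2", {"S1": 0, "S2": 1}, source="S1")
```

Because the state is computed from metadata only, the migration scheduling apparatus can track progress without contacting the storage devices directly.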
请参见图11,图11是本申请实施例提供的一种第一文件的元数据的变更的示意图。元数据1101为迁移未触发之前第一文件的元数据,元数据1102为触发迁移之后的第一文件的元数据,元数据1103为源设备推送第一文件的数据之后的元数据,元数据1104为目标设备拉取第一文件的数据之后的元数据,元数据1105为源设备删除本地的第一文件的数据之后的元数据。
步骤S1004:第一存储设备向共享存储区推送第一文件的数据。
其中,共享存储区是一个提供存储空间的中间设备。共享存储区与第一存储设备相连,第一存储设备可以向共享存储区推送数据。进一步的,共享存储区还可以与第二存储设备相连,第二存储设备从共享存储区拉取数据。
可选的,共享存储区可以由全局数据服务提供。或者可选的,共享存储区由第三方临时存储设备、中间设备提供。
作为一种可能的实施方式,第一文件的归属变更被同步到第一文件的数据所在的设备(即第一存储设备),第一存储设备检测到第一文件的归属不是本设备,且布局元数据显示数据在本地但不在归属设备(即第二存储设备),则源设备推送(或称为发布)第一文件的数据到共享存储区。
可选的,源设备向共享存储区推送数据后,源设备中可以先不删除本地的第一文件的数据,降低由于后续步骤执行失败而造成第一文件的数据丢失的风险,提升系统稳定性。
作为一种可能的方式,由于共享存储区中可能已经存储有第一文件的数据,为了避免数据的重复推送。第一存储设备在检测到第一文件的归属不是本设备,且布局元数据显示数据在本地但不在归属设备(即第二存储设备)和共享存储区时,向共享存储区推送(或称为发布)第一文件的数据。
步骤S1005:第一存储设备对第一文件的元数据执行第二变更。
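The second change indicates adding the shared storage area to the storage devices indicated by the storage layout information of the first file. After the second change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the shared storage area and do not include the second storage device.
For example, the first storage device adds the identifier of the shared storage area to the storage layout information of the first file. As shown in FIG. 11, metadata 1103 after the second change contains the identifier of the shared storage area (that is, shared storage area p1).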
可选的,第二存储设备可以获取第一文件的经过第二变更后的元数据。
步骤S1006:第二存储设备从共享存储区拉取第一文件的数据。
步骤S1007:第二存储设备对第一文件的元数据执行第三变更。
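The third change indicates adding the second storage device to the storage devices indicated by the storage layout information of the first file. For example, the second storage device adds the identifier of the second storage device to the storage layout information of the first file. As shown in FIG. 11, metadata 1104 after the third change contains the identifier of device S2 (that is, device S2).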
可选的,由于第二存储设备拉取第一文件的数据后,共享存储区中可以删除在共享存储区中的第一文件的数据。因此,第三变更还可以指示在第一文件的存储布局信息指示的存储设备中删除共享存储区。例如,第二存储设备可以在第一文件的存储布局信息中删除共享存储区的标识。
或者可选的,提供共享存储区的设备也可以对第一文件的元数据进行变更,以在第一文件的存储布局信息指示的存储设备中删除共享存储区。
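The metadata of the first file after the third change may be synchronized to the first storage device and/or the migration scheduling apparatus, for example through the metadata of the target file system, or by the second storage device sending the metadata of the first file after the third change to the first storage device. For related descriptions, refer to the ways of synchronizing the first change and the second change in step S802 and step S806.
Step S1008: the first storage device deletes the data of the first file stored on the first storage device.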
具体的,在第一文件的归属设备上已经存在第一文件的数据时,第一存储设备可以删除本地的第一文件的数据,以释放存储空间,降低存储成本。
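In a possible solution, when the ownership information of the first file is the second storage device and the storage layout information of the first file contains the identifier of the first storage device and the identifier of the second storage device, the first storage device deletes the data of the first file on the first storage device.
Optionally, the first storage device may synchronize the metadata of the first file after the third change, and thereby determine that the ownership information of the first file is the second storage device and that the storage layout information of the first file contains the identifier of the first storage device and the identifier of the second storage device.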
步骤S1009:第一存储设备对第一文件的元数据执行第四变更。
在所述第一文件的存储布局信息指示的存储设备中删除第一存储设备。
例如,第一存储设备将第一存储设备的标识从第一文件的存储布局信息中删除。如图11所示,经过第四变更后的元数据1105中不包含设备S1的标识。
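The sequence of four metadata changes in this flow can be simulated end to end. The sketch below is an in-memory illustration only: the device names S1, S2, and p1, the dictionary-based stores, and the recorded history format are assumptions, and real devices would act on synchronized metadata rather than a shared Python object.

```python
def run_push_flow(file_id, data):
    """Simulate the flow: first change, push, pull, delete, with metadata updates."""
    stores = {"S1": {file_id: data}, "p1": {}, "S2": {}}
    meta = {"owner": "S1", "layout": ["S1"]}
    history = []

    def record():
        history.append((meta["owner"], tuple(meta["layout"])))

    record()                                   # before migration is triggered

    meta["owner"] = "S2"                       # first change (scheduler)
    record()

    # Source side: owned elsewhere, data local, not yet on the owner or in p1.
    if meta["owner"] != "S1" and "S1" in meta["layout"] \
            and "S2" not in meta["layout"] and "p1" not in meta["layout"]:
        stores["p1"][file_id] = stores["S1"][file_id]
        meta["layout"].append("p1")            # second change (source device)
        record()

    # Destination side: owner is this device, data available in the shared area.
    if meta["owner"] == "S2" and "p1" in meta["layout"] and "S2" not in meta["layout"]:
        stores["S2"][file_id] = stores["p1"][file_id]
        meta["layout"].append("S2")            # third change (destination device)
        record()

    # Source side again: the owner now holds the data, so release local space.
    if meta["owner"] == "S2" and "S2" in meta["layout"] and "S1" in meta["layout"]:
        del stores["S1"][file_id]
        meta["layout"].remove("S1")            # fourth change (source device)
        record()

    return stores, history


stores, history = run_push_flow("001.png", b"image-bytes")
```

Each device in the sketch reacts only to the current metadata state, which is the decoupling the embodiment relies on; cleaning up the copy in the shared storage area is left out for brevity.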
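In the embodiment shown in FIG. 10, the migration scheduling apparatus, the source device, and the destination device can carry out data migration based on changes to the metadata of the file, and the progress of the data migration can also be reflected by the metadata of the file. The migration scheduling apparatus, the first storage device, and the second storage device do not need to establish additional data access security control for the migration, which simplifies the data migration procedure. The first storage device and the second storage device indicate to each other the operations they have completed by updating the ownership information and storage layout information of the file, which improves the efficiency of data migration and makes data easier to use and manage. In short, the embodiments of this application not only improve the efficiency of data migration but also decouple the devices involved in the migration, greatly improving the flexibility and scalability of the service system.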
请参见图12,图12是本申请实施例提供的又一种数据迁移方法的流程示意图。可选的,该方法可以应用于前述的数据迁移系统中,例如图1、图3或图5所示的数据迁移系统。
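The data migration method shown in FIG. 12 may include one or more of steps S1201 to S1211. It should be understood that the steps are described in the order S1201 to S1211 for ease of description, which is not intended to limit execution to that order. The embodiments of this application do not limit the order, timing, or number of executions of the one or more steps. Steps S1201 to S1211 are as follows: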
步骤S1201:迁移调度装置确定针对第一文件的迁移任务。
相关描述可以参见步骤S801的描述。
步骤S1202:迁移调度装置对第一文件的元数据执行第一变更。
相关描述可以参见步骤S802的描述。
可选的,图12所示的数据迁移方法中还包含步骤S1203,具体如下:
步骤S1203:迁移调度装置根据第一文件的归属信息和存储布局信息,确定迁移进度。
步骤S1204:第二存储设备获取第一文件的元数据。
The metadata of the first file obtained by the second storage device is the metadata after the first change.
步骤S1205:第二存储设备发送拉取请求。
具体地,第一文件的归属变更被同步到第一文件的归属设备(即第二存储设备),第二存储设备检测到第一文件的归属为本设备,且布局元数据显示数据不在本地,则第二存储设备发送拉取请求,使得存储了第一文件的数据的设备推送第一文件的数据。
可选的,拉取请求中包含第二存储设备的标识、第一文件的标识。可选还包含第一文件的归属信息和/或第一文件的存储布局信息。
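The destination-side decision to send a pull request, together with the request contents described above, can be sketched as follows. The dictionary keys and the function name are assumptions; the layout is again taken to be a mapping from device identifier to a 0/1 flag.

```python
def build_pull_request(local_device, file_id, owner, layout):
    """Return a pull request when this device owns the file but lacks its data.

    Returns None when no request is needed.
    """
    if owner != local_device:
        return None                      # the file is not owned by this device
    if layout.get(local_device, 0) == 1:
        return None                      # the data is already stored locally
    return {
        "requester": local_device,       # identifier of the second storage device
        "file_id": file_id,              # identifier of the first file
        "owner": owner,                  # optional: ownership information
        "layout": dict(layout),          # optional: storage layout information
    }


request = build_pull_request("S2", "001.png", owner="S2", layout={"S1": 1, "S2": 0})
no_request = build_pull_request("S2", "001.png", owner="S2", layout={"S1": 1, "S2": 1})
not_owner = build_pull_request("S2", "002.png", owner="S1", layout={"S1": 1, "S2": 0})
```

Carrying the ownership and layout information in the request lets the receiving device evaluate the push condition without first re-reading the metadata.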
Optionally, the pull request may be sent directly or indirectly.
示例性地,第二存储设备在广播消息队列中写入拉取请求,其他设备(例如第一存储设备)通过读取广播消息队列,可以接收该拉取请求。
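Optionally, because the storage layout information of the first file contains information about the first storage device, the second storage device may send the pull request to the first storage device so that the first storage device pushes the data of the first file.
In a possible approach, because the shared storage area may already store the data of the first file, repeated pushes of the data should be avoided. The second storage device therefore sends the pull request when it detects that the first file is owned by this device and the layout metadata shows that the data is neither local nor in the shared storage area.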
步骤S1206:第一存储设备向共享存储区推送第一文件的数据。
具体地,第一存储设备接收拉取请求。响应于拉取请求,第一存储设备向共享存储区推送第一文件的数据。
作为一种可能的实施方式,拉取请求包含第一文件的归属信息和第一文件的存储布局信息。第一存储设备根据拉取请求,确定第一文件的归属不是本设备,且布局元数据显示数据在本地但不在归属设备(即第二存储设备),则第一存储设备推送(或称为发布)第一文件的数据到共享存储区。
步骤S1207:第一存储设备对第一文件的元数据执行第二变更。
相关描述可以参考步骤S1005。
步骤S1208:第二存储设备从共享存储区拉取第一文件的数据。
步骤S1209:第二存储设备对第一文件的元数据执行第三变更。
相关描述可以参考步骤S1007。
Step S1210: the first storage device deletes the data of the first file stored on the first storage device.
相关描述可以参考步骤S1008。
步骤S1211:第一存储设备对第一文件的元数据执行第四变更。
相关描述可以参考步骤S1009。
在图12所示的实施例中,由数据迁移的目的设备主动发送拉取请求,可以解决由于目的设备不在线或者目的设备故障造成的数据迁移失败的问题,提升数据迁移的成功率。
上面说明了本申请实施例的方法,下面提供本申请实施例的装置。
可以理解的是,本申请实施例提供的多个装置,例如数据迁移装置、迁移调度装置等,为了实现上述方法实施例中的功能,其包含了执行各个功能相应的硬件结构、软件单元、或硬件结构和软件结构的组合等。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的多种功能,装置以及装置中的模块能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以在不同的使用场景中,使用不同的装置实现方式来实现前述的方法实施例,对于装置的不同实现方式不应认为超出本申请实施例的范围。
以下列举几种可能的装置。
请参见图13,图13是本申请实施例提供的一种迁移调度装置130的结构示意图。该迁移调度装置130可以包括任务确定模块1301和元数据更新模块1302。该迁移调度装置130用于实现前述的数据迁移方法,例如图8、图10或图12所示实施例中的数据迁移方法。
在一种可能的实施方式中,所述任务确定模块1301,用于确定针对第一文件的迁移任务,所述第一文件的数据存储在第一存储设备上,所述针对第一文件的迁移任务指示将所述第一文件的数据从所述第一存储设备迁移到第二存储设备;
所述元数据更新模块1302,用于对所述第一文件的元数据进行第一变更,以触发执行所述针对第一文件的迁移任务;其中,所述第一变更指示所述第一文件归属的存储设备从所述第一存储设备变更为所述第二存储设备。
在又一种可能的实施方式中,第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息。进行第一变更以前,所述第一文件的归属信息指示的存储设备为第一存储设备,所述第一文件的存储布局信息指示存储第一文件的存储设备包含第一存储设备且不包含第二存储设备。
可选的,第一文件的归属信息为第一存储设备的标识,第一文件的存储布局信息包含第一存储设备的标识且不包含第二存储设备的标识。
在又一种可能的实施方式中,所述元数据更新模块1302,用于将所述第一文件的归属信息从所述第一存储设备的标识变更为所述第二存储设备的标识。
在又一种可能的实施方式中,针对第一文件的迁移任务包括所述第一文件的标识、所述第一存储设备的标识和所述第二存储设备的标识。
在又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据在多个设备之间同步。上述多个设备包含第一计算设备和第二计算设备,第一计算设备位于第一存储设备中或与第一存储设备相连,第二计算设备位于第二存储设备中或与第二存储设备相连。进一步的,上述多个设备还可以包含迁移调度装置。
在又一种可能的实施方式中,所述迁移调度装置130还包含通信模块1303,所述通信模块1303,用于发送第一通知,所述第一通知指示所述第一文件的元数据发生了变更,以使得所述第一计算设备或所述第二计算设备根据所述第一通知获取所述第一文件的所述第一变更后的元数据,并根据所述第一文件的所述第一变更后的元数据执行针对第一文件的迁移任务。
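Optionally, the first notification indicates which changes were made to the metadata of the first file; for example, the first notification may contain the content of the first change and/or the metadata of the first file after the first change. Alternatively, the first notification indicates that the first change has occurred but contains neither the specific content of the change nor the metadata of the first file after the first change.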
在又一种可能的实施方式中,所述迁移调度装置还包含任务监控模块1304,所述任务监控模块1304,用于:
获取所述第一文件的第二变更后的元数据,所述第二变更由所述第一计算设备或所述第二计算设备执行,所述第二变更指示所述第一文件的存储布局信息的变化;
根据所述第一文件的所述第二变更后的元数据,确定所述第一文件的迁移进度。
在又一种可能的实施方式中,目标文件系统的元数据在多个设备的本地进行存储。
在又一种可能的实施方式中,所述目标文件系统的元数据存储在全局元数据服务。其中,全局元数据服务能够存储目标文件系统的元数据。进一步的,全局元数据服务能够支持对目标文件系统的元数据的访问和更新。
在又一种可能的实施方式中,迁移调度装置130还包含通信模块1303,所述通信模块1303,用于:
接收第二通知,所述第二通知指示所述第一文件的元数据发生了变更。
在又一种可能的实施方式中,所述任务监控模块1304,还用于:
根据所述第二通知获取所述第一文件的第二变更后的元数据。可选的,第二通知包含第二变更的内容,或者,第二通知包含所述第一文件的第二变更后的元数据。
在又一种可能的实施方式中,通知(例如第一通知、或第二通知等)可以通过消息队列的形式发送。由发送方将消息写入消息队列,接收方通过读取消息队列来接收通知,从而进一步减少不同功能模块之间的耦合度。
在又一种可能的实施方式中,所述通信模块1303,还用于:向所述第一计算设备或所述第二计算设备发送用于获取所述第一文件的变更后的元数据的请求;
所述任务监控模块1304,还用于:根据所述第一计算设备或所述第二计算设备对所述请求的响应获取所述第一文件的第二变更后的元数据。
在又一种可能的实施方式中,所述任务监控模块1304,还用于从所述全局元数据服务获取所述第一文件的所述第二变更后的元数据。
在又一种可能的实施方式中,全局元数据服务提供服务接口,所述迁移调度装置可以调用服务接口来实现对元数据的访问和更新。
其中,服务接口是一种通信接口,例如应用程序接口(application programming interface,API),能够用于不同的功能模块之间数据交互并提供服务。
在又一种可能的实施方式中,所述元数据更新模块1302,还用于:
通过所述全局元数据服务提供的服务接口来实现所述第一变更。
在又一种可能的实施方式中,所述全局元数据服务位于所述多个设备中的任意一个设备上,或者位于所述多个设备之外的任意一个设备上。
示例性的,所述全局元数据服务位于第三计算设备,第三计算设备可以与第一计算设备或第二计算设备相同的计算设备,也可以是二者之外的另一计算设备。
可选的,全局元数据服务的服务接口可以是由第三计算设备向迁移调度装置提供的。或者,第三计算设备可以向迁移调度装置提供另一个接口(便于区分称为第一接口),通过调用该第一接口可以实现调用全局元数据服务的服务接口的功能。
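In yet another possible implementation, the metadata of the target file system has a table structure and can be modified. A table structure is a data structure containing rows and columns; each row (or each column) contains multiple values, and each value corresponds to one field.
Metadata with a table structure allows metadata to be added, deleted, and modified. In other words, the first change can be implemented by modifying the metadata of the target file system.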
在又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息。
在又一种可能的实施方式中,所述元数据更新模块1302,还用于:
在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的变更后的归属信息,所述第一文件的变更后的归属信息指示所述第一文件归属的存储设备为所述第二存储设备。
在又一种可能的实施方式中,所述任务确定模块1301,还用于:
根据外部事件信息,确定所述针对第一文件的迁移任务;
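The external event information includes one or more of the following: network connection status, device health status, or personnel transfers related to the first file.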
在又一种可能的实施方式中,所述任务确定模块1301,还用于:
根据所述第一文件的元数据的分析结果,确定所述针对第一文件的迁移任务;其中,所述分析结果包含以下一项或者多项信息:所述第一文件的冷热状态、所述第一文件的安全性或所述第一文件相关的业务。
在又一种可能的实施方式中,所述任务确定模块1301,还用于:
根据用户输入的针对所述第一文件的迁移指示,确定所述针对第一文件的迁移任务。
在又一种可能的实施方式中,所述任务确定模块1301,还用于确定针对第二文件的迁移任务;
所述迁移调度装置130还包含,任务编排模块1305,所述任务编排模块1305用于编排所述针对第一文件的迁移任务和所述针对第二文件的迁移任务的执行顺序。
作为一种可能的设计,编排任务可以包含确定任务的执行顺序、执行优先级等。
作为又一种可能的设计,编排任务的过程中可以合并多个任务。
请参见图14,图14是本申请实施例提供的一种计算装置140的结构示意图。该计算装置140用于实现前述的数据迁移方法,例如图8、图10或图12所示实施例中的数据迁移方法。
可选的,该计算装置140包含于前述实施例中的存储设备、计算设备等,例如包含于图1、图2、图3、图8、图10或图12所示实施例中第一存储设备、或第二存储设备。再如,包含于图4的存储设备S1、存储设备S2和/或存储设备S3中。
或者,该计算装置140为独立的设备,能够与前述的存储设备或计算设备连接。
在一种可能的设计中,该计算装置140可以包括元数据获取模块1401和迁移模块1402。计算装置140用于实现图8或图10所示实施例中,第一存储设备一侧的方法。
在一种可能的实施方式中,所述元数据获取模块1401,用于获取第一文件的元数据,所述第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息;
所述迁移模块1402,用于在确定所述第一文件的归属信息指示的存储设备为第二存储设备,且第一文件的存储布局信息指示存储所述第一文件的存储设备不包含所述第二存储设备且包含所述第一存储设备时,将所述第一文件的数据从所述第一存储设备迁移到所述第二存储设备。
在又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据在多个设备之间同步,所述多个设备包含所述计算装置或者所述计算装置所在的第一计算设备。
在又一种可能的实施方式中,所述目标文件系统的元数据存储在全局元数据服务中且通过所述全局元数据服务在所述多个设备之间同步;
所述元数据获取模块1401,还用于:
从所述全局元数据服务获取所述第一文件的当前的元数据。
在又一种可能的实施方式中,所述迁移模块1402,还用于:
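push the data of the first file to a shared storage area, where the shared storage area is connected to the first computing device and a second computing device, and the second computing device is located in the second storage device or is connected to the second storage device;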
所述计算装置140还包含元数据更新模块1403,所述元数据更新模块1403,用于对所述第一文件的元数据执行第一变更,以触发所述第二计算设备从所述共享存储区获取所述第一文件的数据并存储到所述第二存储设备中,所述第一变更指示在所述第一文件的存储布局信息指示的存储设备中增加所述共享存储区;在所述第一变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备和共享存储区,且不包含所述第二存储设备。
在又一种可能的实施方式中,所述第一文件的归属信息为所述第二存储设备的标识,在所述第一变更之前,所述第一文件的存储布局信息包含所述第一存储设备的标识且不包含所述第二存储设备的标识;
所述元数据更新模块1403,还用于:
在所述第一文件的存储布局信息中增加所述共享存储区的标识。
在又一种可能的实施方式中,所述计算装置140还包含通信模块1404,所述通信模块1404,用于:
接收第一通知,所述第一通知指示所述第一文件的元数据发生了变更。
在又一种可能的实施方式中,所述计算装置140还包含通信模块1404,所述通信模块1404,用于:
send a second notification, where the second notification indicates that the metadata of the first file has changed.
在又一种可能的实施方式中,所述元数据获取模块1401,还用于:
同步目标文件系统的元数据,所述目标文件系统的元数据包含所述第一文件的元数据。
在又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息;
所述元数据更新模块1403,还用于:在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的存储布局信息,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备和所述共享存储区。
在又一种可能的实施方式中,所述计算装置140还包含删除控制模块1405,所述删除控制模块1405,用于:
当获取到所述第一文件的第二变更后的元数据时,删除所述第一存储设备上的所述第一文件的数据;所述第二变更指示所述第一文件的存储布局信息的变化,在所述第二变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备且包含所述第二存储设备;
所述元数据更新模块1403,还用于:对所述第一文件的元数据执行第三变更,所述第三变更指示在所述第一文件的存储布局信息指示的存储设备中删除所述第一存储设备;在所述第三变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备不包含所述第一存储设备。
在一种可能的实施方式中,所述删除控制模块1405,还用于:
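mark the data of the first file as deletable, so that the operation of deleting the first file is performed while the data of the first file is in the deletable state.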
在一种可能的实施方式中,所述迁移模块1402,还用于:
向第二计算设备推送所述第一文件的数据。
在一种可能的实施方式中,所述计算装置140还包含通信模块1404,所述通信模块1404,用于:
接收来自所述第二计算设备的针对所述第一文件的拉取请求。
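In yet another possible implementation, the computing apparatus 140 further includes a view providing module 1406. The view providing module 1406 is configured to provide an ownership file view of the first storage device; the ownership file view contains information about multiple files whose ownership information indicates the first storage device.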
在又一种可能的实施方式中,归属于第一存储设备的文件和归属于第二存储设备的文件属于全局文件系统。所述计算装置140还包含视图提供模块1406,视图提供模块用于提供全局文件视图,所述全局文件视图包含归属于第一存储设备的文件的信息和归属于第二存储设备的文件的信息。
在又一种可能的设计中,该计算装置140可以包括元数据获取模块1401和迁移模块1402。该计算装置140用于实现图8、图10或图12所示实施例中,第二存储设备一侧的方法。
在又一种可能的实施方式中,所述元数据获取模块1401,用于获取第一文件的元数据,所述第一文件的元数据包含所述第一文件的归属信息和所述第一文件的存储布局信息;
所述迁移模块1402,用于在所述第一文件的归属信息指示所述第一文件归属的存储设备为所述第二存储设备,且第一文件的存储布局信息指示存储所述第一文件的存储设备不包含所述第二存储设备时,从存储所述第一文件的数据的设备拉取所述第一文件的数据到所述第二存储设备。
在又一种可能的实施方式中,所述计算装置140还包含元数据更新模块1403,所述元数据更新模块1403,还用于:
对所述第一文件的元数据执行第一变更,所述第一变更指示在所述第一文件的存储布局信息指示的存储设备中增加所述第二存储设备。
在又一种可能的实施方式中,所述第一文件的归属信息为所述第二存储设备的标识,在所述第一变更之前,所述第一文件的存储布局信息不包含所述第二存储设备的标识;
所述元数据更新模块1403,还用于:
在所述第一文件的存储布局信息中增加所述第二存储设备的标识。
在又一种可能的实施方式中,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含第一存储设备。所述计算装置140还包含通信模块1404,所述通信模块1404用于:
发送针对所述第一文件的拉取请求,所述拉取请求用于指示第一计算设备推送所述第一文件;所述第一计算设备位于第一存储设备中或与所述第一存储设备相连。
在又一种可能的实施方式中,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含共享存储区。所述迁移模块1402,用于:
从所述共享存储区拉取所述第一文件的数据到所述第二存储设备。
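In yet another possible implementation, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device. The migration module 1402 is further configured to: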
从第一存储设备拉取所述第一文件的数据到所述第二存储设备。
在又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据存储在全局元数据服务中;
所述元数据获取模块1401,还用于:
从所述全局元数据服务获取所述第一文件的当前的元数据。
在又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息。所述对所述第一文件的元数据执行第一变更,包括:
在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的存储布局信息,所述第一文件的存储布局信息包含所述第二存储设备的标识。
在又一种可能的实施方式中,所述计算装置140还包含视图提供模块1406,视图提供模块用于提供所述第一存储设备的本地文件视图,所述本地文件视图指示存储在所述第一存储设备上的多个文件的层次结构,所述多个文件的存储布局信息指示所述第一存储设备。
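In yet another possible implementation, the computing apparatus 140 further includes a view providing module 1406. The view providing module is configured to provide an ownership file view of the first storage device; the ownership file view contains information about multiple files whose ownership information indicates the first storage device.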
在又一种可能的实施方式中,归属于第一存储设备的文件和归属于第二存储设备的文件以联邦构成全局文件系统。所述计算装置还包含视图提供模块1406,视图提供模块1406用于提供全局文件视图,所述全局文件视图包含归属于第一存储设备的文件的信息和归属于第二存储设备的文件的信息。
在一种可能的设计中,该计算装置140可以包括通信模块1404、迁移模块1402和元数据更新模块1403。计算装置140用于实现图12所示实施例中,第一存储设备一侧的方法。
一种可能的实施方式中,所述通信模块1404,用于接收来自第二计算设备的针对所述第一文件的拉取请求,所述第二计算设备与第二存储设备相连;
所述迁移模块1402,用于向共享存储区推送所述第一文件的数据;
所述元数据更新模块1403,用于对所述第一文件的元数据执行第一变更,以触发所述第二计算设备从所述共享存储区获取所述第一文件的数据并存储到所述第二存储设备中,所述第一变更指示在所述第一文件的存储布局信息指示的存储设备中增加所述共享存储区;在所述第一变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备和共享存储区,且不包含所述第二存储设备。
在又一种可能的实施方式中,所述第一文件的归属信息为所述第二存储设备的标识,在所述第一变更之前,所述第一文件的存储布局信息包含所述第一存储设备的标识且不包含所述第二存储设备的标识;
所述元数据更新模块1403,还用于:
在所述第一文件的存储布局信息中增加所述共享存储区的标识。
在又一种可能的实施方式中,所述第一文件属于目标文件系统,所述第一文件的元数据包含于所述目标文件系统的元数据中,所述目标文件系统的元数据在多个设备之间同步,所述多个设备包含所述第一计算设备。
在又一种可能的实施方式中,所述通信模块1404,还用于:
发送第一通知,所述第一通知指示所述第一文件的元数据发生了变更。
在又一种可能的实施方式中,所述目标文件系统的元数据为流式结构且包含多条元数据记录,每条元数据记录包含一个节点的标识和所述一个节点的属性,其中,节点为文件或目录,所述节点的属性包含所述节点的归属信息和所述节点的存储布局信息;
所述元数据更新模块1403,用于:
在所述目标文件系统的元数据的末端追加第一元数据记录,所述第一元数据记录包括所述第一文件的标识和所述第一文件的存储布局信息,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备和所述共享存储区。
在又一种可能的实施方式中,所述计算装置140包含元数据获取模块1401,所述元数据获取模块1401获取所述第一文件的第二变更后的元数据。
在又一种可能的实施方式中,所述计算装置140包含删除控制模块1405,所述删除控制模块1405,用于当获取到所述第一文件的第二变更后的元数据时,删除所述第一存储设备上的所述第一文件的数据;所述第二变更指示所述第一文件的存储布局信息的变化,在所述第二变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备包含所述第一存储设备且包含所述第二存储设备;
所述元数据更新模块1403,用于对所述第一文件的元数据执行第三变更,所述第三变更指示在所述第一文件的存储布局信息指示的存储设备中删除所述第一存储设备;在所述第三变更后,所述第一文件的存储布局信息指示存储所述第一文件的存储设备不包含所述第一存储设备。
在又一种可能的实施方式中,所述删除控制模块1405,还用于:
将所述第一文件的数据标记为可删除,以使得在所述第一文件的数据处于可删除状态时执行删除所述第一文件的操作。
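In yet another possible implementation, the computing apparatus 140 further includes a view providing module 1406. The view providing module 1406 is configured to provide an ownership file view of the first storage device; the ownership file view contains information about multiple files whose ownership information indicates the first storage device.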
在又一种可能的实施方式中,归属于第一存储设备的文件和归属于第二存储设备的文件属于全局文件系统。所述计算装置140还包含视图提供模块1406,视图提供模块用于提供全局文件视图,所述全局文件视图包含归属于第一存储设备的文件的信息和归属于第二存储设备的文件的信息。
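FIG. 15 is a schematic structural diagram of a computing device 150 provided by an embodiment of this application. The computing device 150 is a device with computing capability. It may be a physical device, such as a controller, a processor, a server (for example, a rack server), or a host, or it may be a virtual device, such as a virtual machine or a container.
As shown in FIG. 15, the computing device 150 includes a processor 1502 and a memory 1501, and optionally includes a bus 1504 and a communication interface 1503. The processor 1502, the memory 1501, and the other components communicate with each other through the bus 1504. It should be understood that this application does not limit the number of processors or memories in the computing device 150.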
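The memory 1501 is used to provide storage space, which may optionally store application data, user data, an operating system, computer programs, and the like. The memory 1501 may include volatile memory, such as random access memory (RAM). The memory 1501 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
The processor 1502 is a module that performs computation and may include any one or more of the following: a controller (for example, a storage controller), a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), a digital signal processor (DSP), a coprocessor (which assists the central processing unit in completing the corresponding processing and applications), an application-specific integrated circuit (ASIC), a microcontroller unit (MCU), a virtual machine, a container, and the like.
The communication interface 1503 is used to provide information input or output for the at least one processor. Additionally or alternatively, the communication interface 1503 may be used to receive data sent from outside and/or to send data to the outside. The communication interface 1503 may be a wired link interface, such as one that includes an Ethernet cable, or a wireless link interface (Wi-Fi, Bluetooth, general wireless transmission, or another wireless communication technology). Optionally, the communication interface 1503 may also include a transmitter (such as a radio frequency transmitter or an antenna) or a receiver coupled to the interface.
The bus 1504 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses can be divided into address buses, data buses, control buses, and so on. For ease of illustration, only one line is used in FIG. 15, but this does not mean that there is only one bus or only one type of bus. The bus 1504 may include a path for transferring information between the components of the computing device 150 (for example, the memory 1501, the processor 1502, and the communication interface 1503).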
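In the embodiments of this application, the memory 1501 stores executable instructions, and the processor 1502 executes these instructions to implement the foregoing data migration method, for example the data migration method in the embodiments of FIG. 8, FIG. 10, or FIG. 12. In other words, the memory 1501 stores instructions for executing the data migration method.
An embodiment of this application further provides a computing device cluster. The computing device cluster includes at least one computing device 150, and each computing device 150 includes a processor 1502 and a memory 1501;
the processor 1502 of the at least one computing device 150 is configured to execute the instructions stored in the memory 1501 of the at least one computing device 150, so that the computing device cluster implements the foregoing data migration method, for example the data migration method in the embodiments of FIG. 8, FIG. 10, or FIG. 12. Optionally, the memory stores instructions for executing the data migration method.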
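An embodiment of this application further provides a storage device. The storage device includes storage disks and either the computing apparatus shown in FIG. 14 or the computing device shown in FIG. 15. The storage disks provide the space for storing file data, and the computing apparatus or computing device is used to implement the foregoing data migration method, for example the method on the first storage device side and/or the second storage device side in the embodiments of FIG. 8, FIG. 10, or FIG. 12.
It should be understood that, for further description of the storage device, reference may also be made to the description of the first computing device and the second computing device in the implementations of FIG. 1, FIG. 2, FIG. 3, FIG. 4, and so on.
In some scenarios, the storage device may be a storage product provided by a storage vendor. For example, the storage device may include a storage product provided by Huawei, such as Dorado or Pacific.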
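An embodiment of this application provides a computer-readable storage medium that stores instructions. When the instructions run on at least one processor, the foregoing data migration method is implemented, for example the data migration method in the embodiments of FIG. 8, FIG. 10, or FIG. 12.
The computer-readable storage medium may be any available medium that a computing device can store data on, or a data storage device, such as a data center, that contains one or more available media. The computer-readable storage medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state drive), or the like.
This application provides a computer program product that includes computer instructions. When the instructions run on at least one processor, the foregoing data migration method is implemented, for example the data migration method in the embodiments of FIG. 8, FIG. 10, or FIG. 12.
Optionally, the computer program product may be a software installation package or an image package. When the foregoing method needs to be used, the computer program product can be downloaded and executed on a computing device.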
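In the embodiments of this application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described in this application as "exemplary" or "for example" should not be construed as being preferred over, or more advantageous than, other embodiments or designs. Rather, words such as "exemplary" or "for example" are intended to present the relevant concepts in a concrete manner.
In the embodiments of this application, "at least one" means one or more, and "multiple" means two or more. "At least one of the following items" or a similar expression refers to any combination of these items, including any combination of a single item or of plural items. For example, at least one of a, b, or c may represent: a, b, c, (a and b), (a and c), (b and c), or (a and b and c), where each of a, b, and c may be single or multiple. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may represent three cases: A alone, both A and B, and B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it.
In addition, unless stated otherwise, ordinal terms such as "first" and "second" in the embodiments of this application are used to distinguish between multiple objects and are not intended to limit their order, timing, priority, or importance. For example, the terms first storage device and second storage device are used only for ease of description and do not indicate any difference between the two in apparatus structure, deployment order, importance, or the like.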
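A person of ordinary skill in the art will understand that all or some of the steps of the foregoing embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
Finally, it should be noted that the foregoing embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of protection of the technical solutions of the embodiments of the present invention.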

Claims (41)

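  1. A data migration method, characterized in that the method comprises:
    determining a migration task for a first file, wherein data of the first file is stored on a first storage device, and the migration task for the first file indicates migrating the data of the first file from the first storage device to a second storage device; wherein metadata of the first file contains ownership information of the first file and storage layout information of the first file, the ownership information of the first file indicates that the storage device to which the first file belongs is the first storage device, and the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and do not include the second storage device;
    performing a first change on the metadata of the first file to trigger execution of the migration task for the first file; wherein the first change indicates that the storage device to which the first file belongs is changed from the first storage device to the second storage device.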
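  2. The method according to claim 1, characterized in that the ownership information of the first file is the identifier of the first storage device, and the storage layout information of the first file contains the identifier of the first storage device and does not contain the identifier of the second storage device;
    the performing a first change on the metadata of the first file comprises:
    changing the ownership information of the first file from the identifier of the first storage device to the identifier of the second storage device.
  3. The method according to claim 1 or 2, characterized in that the migration task for the first file includes the identifier of the first file, the identifier of the first storage device, and the identifier of the second storage device.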
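  4. The method according to any one of claims 1-3, characterized in that the method is performed by a migration scheduling apparatus, the first file belongs to a target file system, the metadata of the first file is included in the metadata of the target file system, the metadata of the target file system is synchronized among multiple devices, and the multiple devices include the migration scheduling apparatus, a first computing device, and a second computing device; wherein the first computing device is located in the first storage device or connected to the first storage device, and the second computing device is located in the second storage device or connected to the second storage device.
  5. The method according to claim 4, characterized in that, after the performing a first change on the metadata of the first file, the method further comprises:
    sending a first notification, wherein the first notification indicates that the metadata of the first file has changed, so that the first computing device or the second computing device obtains, according to the first notification, the metadata of the first file after the first change, and executes the migration task for the first file according to the metadata of the first file after the first change.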
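  6. The method according to claim 4 or 5, characterized in that, after triggering execution of the migration task for the first file, the method further comprises:
    obtaining the metadata of the first file after a second change, wherein the second change is performed by the first computing device or the second computing device, and the second change indicates a change in the storage layout information of the first file;
    determining the migration progress of the first file according to the metadata of the first file after the second change.
  7. The method according to claim 6, characterized in that, before the obtaining the metadata of the first file after a second change, the method further comprises:
    receiving a second notification, wherein the second notification indicates that the metadata of the first file has changed.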
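  8. The method according to claim 7, characterized in that the migration scheduling apparatus, the first computing device, and the second computing device each maintain the metadata of the target file system;
    the obtaining the metadata of the first file after a second change comprises:
    obtaining the metadata of the first file after the second change according to the second notification, wherein the second notification contains the content of the second change, or the second notification contains the metadata of the first file after the second change.
  9. The method according to claim 7, characterized in that the migration scheduling apparatus, the first computing device, and the second computing device each maintain the metadata of the target file system;
    the obtaining the metadata of the first file after a second change comprises:
    sending, to the first computing device or the second computing device, a request for obtaining the changed metadata of the first file;
    obtaining the metadata of the first file after the second change according to the response of the first computing device or the second computing device to the request.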
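  10. The method according to claim 6 or 7, characterized in that the metadata of the target file system is stored in a global metadata service and is synchronized among the multiple devices through the global metadata service;
    the obtaining the metadata of the first file after a second change comprises:
    obtaining the metadata of the first file after the second change from the global metadata service.
  11. The method according to any one of claims 4-7 and 10, characterized in that the metadata of the target file system is stored in a global metadata service and is synchronized among the multiple devices through the global metadata service;
    the performing a first change on the metadata of the first file comprises:
    implementing the first change through a service interface provided by the global metadata service.
  12. The method according to claim 10 or 11, characterized in that the global metadata service is located on any one of the multiple devices, or on any device other than the multiple devices.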
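  13. The method according to any one of claims 4-12, characterized in that the metadata of the target file system has a stream structure and contains multiple metadata records, each metadata record contains the identifier of one node and the attributes of that node, wherein a node is a file or a directory, and the attributes of the node include the ownership information of the node and the storage layout information of the node;
    the performing a first change on the metadata of the first file comprises:
    appending a first metadata record to the end of the metadata of the target file system, wherein the first metadata record includes the identifier of the first file and the changed ownership information of the first file, and the changed ownership information of the first file indicates that the storage device to which the first file belongs is the second storage device.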
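  14. The method according to any one of claims 1-13, characterized in that the determining a migration task for a first file comprises:
    determining the migration task for the first file according to external event information;
    wherein the external event information includes one or more of the following: network connection status, device health status, or personnel transfer status related to the first file.
  15. The method according to any one of claims 1-14, characterized in that the determining a migration task for a first file comprises:
    determining the migration task for the first file according to an analysis result of the metadata of the first file; wherein the analysis result includes one or more of the following: the hot or cold state of the first file, the security of the first file, or the service related to the first file.
  16. The method according to any one of claims 1-15, characterized in that the determining a migration task for a first file comprises:
    determining the migration task for the first file according to a migration instruction for the first file input by a user.
  17. The method according to any one of claims 1-16, characterized in that the method further comprises:
    determining a migration task for a second file;
    arranging the execution order of the migration task for the first file and the migration task for the second file.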
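  18. A data migration method, characterized in that the method is applied to a first computing device, the first computing device is located in a first storage device or connected to the first storage device, data of a first file is stored on the first storage device, and the method comprises:
    obtaining metadata of the first file, wherein the metadata of the first file contains ownership information of the first file and storage layout information of the first file;
    when it is determined that the storage device indicated by the ownership information of the first file is a second storage device, and that the storage layout information of the first file indicates that the storage devices storing the first file do not include the second storage device and include the first storage device, migrating the data of the first file from the first storage device to the second storage device.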
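  19. The method according to claim 18, characterized in that the first file belongs to a target file system, the metadata of the first file is included in the metadata of the target file system, the metadata of the target file system is synchronized among multiple devices, and the multiple devices include the first computing device.
  20. The method according to claim 19, characterized in that the metadata of the target file system is stored in a global metadata service and is synchronized among the multiple devices through the global metadata service;
    the obtaining metadata of the first file comprises:
    obtaining the current metadata of the first file from the global metadata service.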
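  21. The method according to any one of claims 18-20, characterized in that the migrating the data of the first file from the first storage device to the second storage device comprises:
    pushing the data of the first file to a shared storage area, wherein the shared storage area is connected to the first computing device and a second computing device, and the second computing device is located in the second storage device or connected to the second storage device;
    performing a first change on the metadata of the first file to trigger the second computing device to obtain the data of the first file from the shared storage area and store it in the second storage device, wherein the first change indicates that the shared storage area is added to the storage devices indicated by the storage layout information of the first file; after the first change, the storage layout information of the first file indicates that the storage devices storing the first file include the first storage device and the shared storage area, and do not include the second storage device.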
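  22. The method according to claim 21, characterized in that the ownership information of the first file is the identifier of the second storage device, and before the first change, the storage layout information of the first file contains the identifier of the first storage device and does not contain the identifier of the second storage device;
    the performing a first change on the metadata of the first file comprises:
    adding the identifier of the shared storage area to the storage layout information of the first file.
  23. The method according to claim 21 or 22, characterized in that, before the obtaining metadata of the first file, the method further comprises:
    receiving a first notification, wherein the first notification indicates that the metadata of the first file has changed;
    after the first change is performed on the metadata of the first file, the method further comprises:
    sending a second notification, wherein the second notification indicates that the metadata of the first file has changed.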
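  24. The method according to any one of claims 12-17, characterized in that the method further comprises:
    when the metadata of the first file after a second change is obtained, deleting the data of the first file on the first storage device; wherein the second change indicates a change in the storage layout information of the first file, and after the second change, the storage layout information of the first file indicates that the storage devices storing the first file include both the first storage device and the second storage device;
    performing a third change on the metadata of the first file, wherein the third change indicates that the first storage device is removed from the storage devices indicated by the storage layout information of the first file; after the third change, the storage layout information of the first file indicates that the storage devices storing the first file do not include the first storage device.
  25. The method according to claim 24, characterized in that, before the deleting the data of the first file on the first storage device, the method further comprises:
    marking the data of the first file as deletable, so that the operation of deleting the first file is performed while the data of the first file is in the deletable state.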
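  26. The method according to any one of claims 18-25, characterized in that the method further comprises:
    providing a local file view of the first storage device, wherein the local file view indicates the hierarchy of multiple files stored on the first storage device, and the storage layout information of the multiple files indicates the first storage device.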
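  27. A data migration method, characterized in that the method is applied to a second computing device, the second computing device is located in a second storage device or connected to the second storage device, and the method comprises:
    obtaining metadata of a first file, wherein the metadata of the first file contains ownership information of the first file and storage layout information of the first file;
    when the ownership information of the first file indicates that the storage device to which the first file belongs is the second storage device, and the storage layout information of the first file indicates that the storage devices storing the first file do not include the second storage device, pulling the data of the first file to the second storage device from a device that stores the data of the first file.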
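  28. The method according to claim 27, characterized in that the method further comprises:
    performing a first change on the metadata of the first file, wherein the first change indicates that the second storage device is added to the storage devices indicated by the storage layout information of the first file.
  29. The method according to claim 27 or 28, characterized in that the storage layout information of the first file indicates that the storage devices storing the first file include a first storage device, and before the pulling the data of the first file to the second storage device from a device that stores the data of the first file, the method further comprises:
    sending a pull request for the first file, wherein the pull request is used to instruct a first computing device to push the first file; the first computing device is located in the first storage device or connected to the first storage device.
  30. The method according to any one of claims 27-29, characterized in that the storage layout information of the first file indicates that the storage devices storing the first file include a shared storage area, and the pulling the data of the first file to the second storage device from a device that stores the data of the first file comprises:
    pulling the data of the first file from the shared storage area to the second storage device.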
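  31. The method according to any one of claims 27-30, characterized in that the first file belongs to a target file system, the metadata of the first file is included in the metadata of the target file system, and the metadata of the target file system is stored in a global metadata service;
    the obtaining metadata of a first file comprises:
    obtaining the current metadata of the first file from the global metadata service.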
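  32. A migration scheduling apparatus, characterized in that the migration scheduling apparatus includes a task determination module and a metadata update module, and the migration scheduling apparatus is configured to implement the method according to any one of claims 1-17.
  33. A computing apparatus, characterized in that the computing apparatus includes a metadata acquisition module and a migration module, and the computing apparatus is configured to implement the method according to any one of claims 18-26, or to implement the method according to any one of claims 27-31.
  34. A migration scheduling apparatus, characterized in that the migration scheduling apparatus includes a processor and a memory;
    the processor is configured to execute instructions stored in the memory, so that the migration scheduling apparatus implements the method according to any one of claims 1-17.
  35. A computing device, characterized in that the computing device includes a processor and a memory; the memory stores computer instructions, and the processor is configured to invoke the computer instructions stored in the memory to implement the method according to any one of claims 18-26.
  36. A storage device, characterized in that the storage device includes the computing device according to claim 35 and storage disks connected to the computing device.
  37. A computing device, characterized in that the computing device includes a processor and a memory; the memory stores computer instructions, and the processor is configured to invoke the computer instructions stored in the memory to implement the method according to any one of claims 27-31.
  38. A storage device, characterized in that the storage device includes the computing device according to claim 37 and storage disks connected to the computing device.
  39. A data migration system, characterized in that the data migration system includes the storage device according to claim 36 and the storage device according to claim 38.
  40. The data migration system according to claim 39, characterized in that the data migration system further includes the migration scheduling apparatus according to claim 34.
  41. A computer-readable storage medium, characterized in that it includes computer program instructions, and when the computer program instructions are executed by a processor, the method according to any one of claims 1-17, the method according to any one of claims 18-26, or the method according to any one of claims 27-31 is implemented.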
PCT/CN2023/080091 2022-06-13 2023-03-07 Data migration method and related apparatus WO2023241115A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210678199 2022-06-13
CN202210678199.8 2022-06-13
CN202211102393.8 2022-09-09
CN202211102393.8A CN117234412A (zh) 2022-06-13 2022-09-09 Data migration method and related apparatus

Publications (1)

Publication Number Publication Date
WO2023241115A1 true WO2023241115A1 (zh) 2023-12-21

Family

ID=89086793

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/080091 WO2023241115A1 (zh) 2022-06-13 2023-03-07 Data migration method and related apparatus

Country Status (2)

Country Link
CN (1) CN117234412A (zh)
WO (1) WO2023241115A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150012567A1 (en) * 2013-07-02 2015-01-08 Hitachi Data Systems Engineering UK Limited Method and apparatus for migration of a virtualized file system, data storage system for migration of a virtualized file system, and file server for use in a data storage system
US20150370845A1 (en) * 2014-06-18 2015-12-24 International Business Machines Corporation Storage device data migration
CN113836116A (zh) * 2021-09-29 2021-12-24 济南浪潮数据技术有限公司 Data migration method and apparatus, electronic device, and readable storage medium
CN114064563A (zh) * 2020-07-30 2022-02-18 深圳市杉岩数据技术有限公司 Data migration method and server based on object storage

Also Published As

Publication number Publication date
CN117234412A (zh) 2023-12-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23822687

Country of ref document: EP

Kind code of ref document: A1