CN111258954B - Data migration method, device, equipment and storage medium - Google Patents

Publication number
CN111258954B
Authority
CN
China
Prior art keywords
data
migrated
migration
directory entry
data migration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010025782.XA
Other languages
Chinese (zh)
Other versions
CN111258954A (en)
Inventor
焦如松
田勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010025782.XA
Publication of CN111258954A
Application granted
Publication of CN111258954B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/119 Details of migration of file systems

Abstract

The application discloses a data migration method, a device, equipment and a storage medium, and relates to the technical field of data migration. The specific implementation scheme is as follows: first check information is obtained by checking the data to be migrated; a hard link is created in the target directory for the data to be migrated to obtain a new directory entry of the data to be migrated; second check information is obtained according to the data corresponding to the new directory entry in the target directory; and whether the data corresponding to the new directory entry is consistent with the data to be migrated is checked according to the first check information and the second check information, and if so, the source directory entry of the data to be migrated before migration is cleaned. In the embodiments of the application, data migration on a single machine is implemented by creating a hard link, without calling the move interface of the file system. The migration process is simple, efficient, safe, and reliable, and its risk is reduced. In addition, the data migration process is reentrant, can be rolled back, and can run asynchronously, so it is suitable for migrating a large number of files and has high security.

Description

Data migration method, device, equipment and storage medium
Technical Field
The application relates to the technical field of computers, in particular to the technical field of data migration.
Background
Running services inside containers is becoming increasingly mainstream, and distributed storage systems are no exception. However, for an existing storage service that runs on physical machines and already stores massive amounts of data, availability, stability, transparency to users, and the like must all be considered, and the switching process is complex. In this situation, switching a single machine in place from running on the physical machine to running in a container tends to be more efficient, because it avoids moving data across the network. Because a storage system is a stateful service, this single-machine switching process typically requires both migration of the service and migration of the service state (data). Migration of the service generally requires stopping and starting the service processes in physical machine mode and in container mode, whereas migration of the service state (data) often requires moving the data within the single machine. For a cluster with many nodes, there is typically a central coordinator that schedules single-machine switching tasks with RPC (Remote Procedure Call) commands. In the single-machine switching process, the migration of single-machine data involves a large amount of data and is therefore critical. This places several requirements on single-machine data migration:
the cluster involves a large number of online data nodes, so the single-machine migration process must support emergency rollback; a single data node holds a large amount of user data, so data migration must be safe and reliable; migration on a single node may fail, so the migration process must be reentrant; and the central coordinator communicates with the data nodes through remote RPC, where network jitter is likely and synchronous blocking calls fail easily, while the coordinator often needs to operate many nodes, so an asynchronous mode must be supported.
In the prior art, data migration on a single machine is generally performed by calling the move interface of the file system. This existing method has low security and low migration efficiency, and when a large number of files need to be migrated, operating on online data carries a high risk.
Disclosure of Invention
The application provides a data migration method, a device, equipment and a storage medium, which are used for improving the security and migration efficiency of a data migration process on a single machine.
The first aspect of the present application provides a data migration method, including:
checking the data to be migrated to obtain first check information;
creating a hard link for the data to be migrated in a target directory to obtain a new directory entry of the data to be migrated;
acquiring second check information according to the data corresponding to the new directory entry in the target directory; checking whether the data corresponding to the new directory entry is consistent with the data to be migrated according to the first check information and the second check information;
and if the data corresponding to the new directory entry is consistent with the data to be migrated, cleaning the source directory entry of the data to be migrated before migration.
In this embodiment, data migration on a single machine is implemented by creating a hard link, without calling the move interface of the file system. The migration process is simple, efficient, safe, and reliable, and its risk is reduced.
In one possible design, the first check information includes a storage structure and/or a check code before migration of the data to be migrated, where the check code is an inode number corresponding to a source directory entry before migration or a check code obtained by a check algorithm from the data to be migrated;
the second check information comprises a storage structure and/or check codes after the data to be migrated are migrated, wherein the check codes are inode numbers corresponding to new directory entries or check codes obtained from the data corresponding to the new directory entries through a check algorithm;
the checking whether the data corresponding to the new directory entry is consistent with the data to be migrated according to the first checking information and the second checking information comprises the following steps:
and comparing whether the first check information is identical with the second check information.
In one possible design, before the verifying the data to be migrated, the method further includes:
detecting a data migration environment;
and judging whether the data migration environment meets preset conditions or not.
In one possible design, the method further comprises:
and recording the current stage in the data migration process in real time through a state machine.
In one possible design, the method further comprises:
when the data migration process is interrupted and a reentry instruction is received, reentering the data migration process according to the current stage recorded in the state machine at the moment of interruption.
In one possible design, the reentering the data migration process according to the current stage recorded in the state machine at the moment of interruption includes:
if the current stage is before the hard link is created, re-executing the whole data migration process from the beginning;
if the current stage is in the process of creating the hard link, or in the verification process after the hard link is created, deleting the new directory entry of the data to be migrated and then re-executing the whole data migration process;
if the current stage is in the process of cleaning the source directory entry, continuing to execute the process of cleaning the source directory entry;
and if the current stage is after the process of cleaning the source directory entry has completed, returning prompt information indicating that the migration succeeded.
In one possible design, the method further comprises:
and when a rollback instruction is received, rolling back the data migration process according to the current stage in the state machine.
In one possible design, the rolling back the data migration process according to the current stage in the state machine includes:
if the current stage is before the hard link is created, returning prompt information indicating that the rollback is complete;
if the current stage is in the process of creating the hard link, or in the verification process after the hard link is created, deleting the new directory entry of the data to be migrated;
if the current stage is in the process of cleaning the source directory entry, or that process has completed, re-creating the source directory entry, by creating a hard link, in the source directory that held the data to be migrated before migration, and deleting the new directory entry of the data to be migrated after the verification process.
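The stage-dependent rollback described above can be sketched as follows. This is a minimal illustration for a single file, not the patented implementation: the `Stage` names, the `rollback` function, and the use of an inode comparison as the verification step are all assumptions made for the example.

```python
import os
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names; the source does not prescribe them."""
    PRE_CHECK = 1    # before the hard link is created
    LINKING = 2      # creating the hard link
    POST_CHECK = 3   # verification after creating the hard link
    CLEANING = 4     # cleaning the source directory entry
    DONE = 5         # cleaning has completed


def rollback(stage, source_entry, new_entry):
    """Undo a single-file migration according to the recorded stage."""
    if stage is Stage.PRE_CHECK:
        # Nothing was created yet, so there is nothing to undo.
        return "rollback complete"
    if stage in (Stage.LINKING, Stage.POST_CHECK):
        # The source entry is still intact; only drop the new entry.
        if os.path.lexists(new_entry):
            os.unlink(new_entry)
        return "rollback complete"
    # CLEANING or DONE: the source entry may already be gone, so re-create
    # it as a hard link to the data reachable through the new entry.
    if not os.path.lexists(source_entry):
        os.link(new_entry, source_entry)
    # Verification (assumed here to be an inode comparison).
    if os.stat(source_entry).st_ino != os.stat(new_entry).st_ino:
        raise RuntimeError("restored source entry does not match the data")
    os.unlink(new_entry)
    return "rollback complete"
```

Because every branch only adds or removes directory entries, the data itself is never copied or rewritten during rollback.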
In one possible design, the data to be migrated includes a plurality of data, the method further comprising:
carrying out a data migration process on a plurality of data in an asynchronous mode, and acquiring the state of each data migration process by a scheduling device in a polling mode;
and controlling any data migration process according to the state of the data migration process.
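The asynchronous mode with polling can be illustrated with a small in-process sketch. It assumes a thread pool stands in for the data nodes and that each migration is a zero-argument callable; in the scenario described in the background, the scheduling device would instead poll the nodes over RPC. The function name `run_migrations_async` and the status strings are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_migrations_async(tasks, poll_interval=0.01, max_workers=4):
    """Start every migration without blocking, then poll for their states.

    tasks maps a name to a zero-argument callable that raises on failure.
    Returns a dict mapping each name to "succeeded" or "failed".
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {name: pool.submit(task) for name, task in tasks.items()}
        while pending:
            for name, future in list(pending.items()):
                if not future.done():
                    continue  # still running; check again on the next poll
                failed = future.exception() is not None
                results[name] = "failed" if failed else "succeeded"
                del pending[name]
            if pending:
                time.sleep(poll_interval)
    return results
```

A caller could use the returned states to decide, per migration process, whether to reenter it or roll it back.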
A second aspect of the present application provides a data migration apparatus, comprising:
the verification module is used for checking the data to be migrated to obtain first check information;
the migration module is used for creating a hard link for the data to be migrated in the target directory to obtain a new directory entry of the data to be migrated;
the verification module is also used for obtaining second check information according to the data corresponding to the new directory entry in the target directory, and for checking whether the data corresponding to the new directory entry is consistent with the data to be migrated according to the first check information and the second check information;
and the cleaning module is used for cleaning the source directory entry of the data to be migrated before migration if the data corresponding to the new directory entry is consistent with the data to be migrated.
In one possible design, the first check information includes a storage structure and/or a check code before migration of the data to be migrated, where the check code is an inode number corresponding to a source directory entry before migration or a check code obtained by a check algorithm from the data to be migrated;
the second check information comprises a storage structure and/or check codes after the data to be migrated are migrated, wherein the check codes are inode numbers corresponding to new directory entries or check codes obtained from the data corresponding to the new directory entries through a check algorithm;
when checking, according to the first check information and the second check information, whether the data corresponding to the new directory entry is consistent with the data to be migrated, the verification module is configured to:
compare whether the first check information is identical to the second check information.
In one possible design, the apparatus further comprises a detection module for:
before checking the data to be migrated, detecting a data migration environment;
and judging whether the data migration environment meets preset conditions or not.
In one possible design, the apparatus further comprises a state recording module for:
and recording the current stage in the data migration process in real time through a state machine.
In one possible design, the apparatus further includes a reentry control module to:
when the data migration process is interrupted and a reentry instruction is received, reentry is carried out on the data migration process according to the current stage of the interruption moment in the state machine.
In one possible design, when reentering the data migration process according to the current stage recorded in the state machine at the moment of interruption, the reentry control module is configured to:
if the current stage is before the hard link is created, re-execute the whole data migration process from the beginning;
if the current stage is in the process of creating the hard link, or in the verification process after the hard link is created, delete the new directory entry of the data to be migrated and then re-execute the whole data migration process;
if the current stage is in the process of cleaning the source directory entry, continue to execute the process of cleaning the source directory entry;
and if the current stage is after the process of cleaning the source directory entry has completed, return prompt information indicating that the migration succeeded.
In one possible design, the apparatus further comprises a rollback control module for:
and when a rollback instruction is received, rolling back the data migration process according to the current stage in the state machine.
In one possible design, the rollback control module is configured to, when rolling back a data migration process according to a current stage in the state machine:
if the current stage is before the hard link is created, return prompt information indicating that the rollback is complete;
if the current stage is in the process of creating the hard link, or in the verification process after the hard link is created, delete the new directory entry of the data to be migrated;
if the current stage is in the process of cleaning the source directory entry, or that process has completed, re-create the source directory entry, by creating a hard link, in the source directory that held the data to be migrated before migration, and delete the new directory entry of the data to be migrated after the verification process.
In one possible design, the data to be migrated includes a plurality of data, and the apparatus further includes a scheduling module configured to:
carrying out a data migration process on a plurality of data in an asynchronous mode, and acquiring the state of each data migration process by a scheduling device in a polling mode;
and controlling any data migration process according to the state of the data migration process.
A third aspect of the present application provides an electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect.
A fourth aspect of the application provides a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of the first aspect.
A fifth aspect of the application provides a computer program comprising program code for performing the method of the first aspect when the computer program runs on a computer.
A sixth aspect of the present application provides a data migration method, including:
creating a hard link in a target directory for data to be migrated to obtain a new directory entry of the data to be migrated;
and cleaning the source directory entry of the data to be migrated before migration.
One embodiment of the above application has the following advantages or benefits: first check information is obtained by checking the data to be migrated; a hard link is created for the data to be migrated in a target directory to obtain a new directory entry of the data to be migrated; second check information is obtained according to the data corresponding to the new directory entry in the target directory; and whether the data corresponding to the new directory entry is consistent with the data to be migrated is checked according to the first check information and the second check information, and if so, the source directory entry of the data to be migrated before migration is cleaned. In the embodiments of the application, data migration on a single machine is implemented by creating a hard link, without calling the move interface of the file system; the migration process is simple, efficient, safe, and reliable, and its risk is reduced. In addition, the data migration process is reentrant, can be rolled back, and can run asynchronously, so it is suitable for migrating a large number of files. This greatly simplifies the design of data migration schemes for cluster nodes, which no longer need to repeatedly account for problems such as service stability or failed RPC calls between the scheduling node and the migrating node during migration.
Other effects of the above alternative will be described below in connection with specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application. Wherein:
FIG. 1 is a flow chart of a data migration method according to an embodiment of the present application;
FIG. 2 is a flow chart of a data migration method according to another embodiment of the present application;
FIG. 3 is a flow chart of a data migration method according to another embodiment of the present application;
FIG. 4 is a block diagram illustrating a data migration apparatus according to an embodiment of the present application;
FIG. 5 is a block diagram of a data migration apparatus according to another embodiment of the present application;
FIG. 6 is a block diagram of an electronic device for implementing a data migration method of an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the single-machine switching process, the migration of single-machine data involves a large amount of data and is therefore critical. This places several requirements on single-machine data migration: the cluster involves a large number of online data nodes, so the single-machine migration process must support emergency rollback; a single data node holds a large amount of user data, so data migration must be safe and reliable; migration on a single node may fail, so the migration process must be reentrant; and the central coordinator communicates with the data nodes through remote RPC, where network jitter is likely and synchronous blocking calls fail easily, while the coordinator often needs to operate many nodes, so an asynchronous mode must be supported.
In the prior art, data migration on a single machine is implemented by calling the move interface of the file system, which generally cannot meet these requirements. This embodiment therefore provides a data migration method in which data migration on a single machine is implemented by creating a hard link, without calling the move interface of the file system. The migration process is simple, efficient, safe, and reliable, and its risk is reduced. In addition, the data migration process is reentrant, can be rolled back, and can run asynchronously, so it is suitable for migrating a large number of files. The data migration process is described in detail below with reference to specific embodiments.
An embodiment of the present application provides a data migration method, and fig. 1 is a flowchart of the data migration method provided by the embodiment of the present application. As shown in fig. 1, the data migration method specifically includes the following steps:
S101, checking the data to be migrated to obtain first check information.
In this embodiment, before migrating data, the data to be migrated may be first checked to obtain first check information, where the first check information is used to check the migrated data subsequently, so as to verify that the migrated data is consistent with the original data of the data to be migrated, and may include content consistency and/or storage structure consistency.
Optionally, the first check information may include a storage structure and/or a check code of the data to be migrated before migration, where the check code is the inode number corresponding to the source directory entry before migration, or a check code obtained from the data to be migrated through a check algorithm.
The storage structure in this embodiment is, for example, the storage directory tree of the data to be migrated, including the relationships among its parent nodes, child nodes, sibling nodes, and the like. The check code may be the inode number corresponding to the source directory entry in the source directory before the data to be migrated is migrated. The inode stores the basic information (such as the number of data bytes, read-write permissions, and storage location) of the data to be migrated and of the directory where it is located; if the inode numbers of two pieces of data are the same, the two are the same data. The check code may also be obtained by processing the data to be migrated with a check algorithm such as MD5, SHA1, or SHA256, that is, an MD5 value, SHA1 value, or SHA256 value; if the MD5, SHA1, or SHA256 values of two pieces of data are the same, the contents of the two are identical. Of course, other verification methods may also be adopted in this embodiment, and they are not described here.
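The kinds of check information named above can be computed with the Python standard library. The following is a sketch under stated assumptions: the helper names are hypothetical, and representing the storage structure as a sorted list of relative file paths is one simple choice among many.

```python
import hashlib
import os


def inode_check_code(path):
    """Return the inode number behind a directory entry (no data is read)."""
    return os.stat(path).st_ino


def content_check_code(path, algorithm="sha256", chunk_size=1 << 20):
    """Return a hex digest of the file contents, read in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def storage_structure(root):
    """Return the sorted relative paths of all files under root."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            entries.append(os.path.relpath(full_path, root))
    return sorted(entries)
```

The inode comparison is cheap because it touches only metadata, while the digest reads every byte and therefore costs time proportional to the data size.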
S102, creating a hard link for the data to be migrated in the target directory to obtain a new directory entry of the data to be migrated.
In this embodiment, when the data to be migrated needs to be migrated to a container or another location, the data directory corresponding to that container or location is taken as the target directory, and a hard link is created in the target directory to obtain a new directory entry of the data to be migrated. A hard link allows multiple file names, whether in the same directory or in different directories, to refer to the same data at the same time, so that a modification made through any one of the names is visible through all the others. That is, in this embodiment, the data to be migrated can be linked to the new directory entry in the target directory, and the source directory entry of the data to be migrated in the source directory is not affected at this point. Because the migration is performed by hard linking, the storage location of the data to be migrated on the physical disk does not need to change.
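The hard-link behavior described above can be observed directly with `os.link`. The sketch below uses a temporary directory and hypothetical file names; it assumes a file system that supports hard links and that both directories live on it.

```python
import os
import tempfile


def hard_link_demo():
    """Create a hard link and report what both names observe."""
    with tempfile.TemporaryDirectory() as root:
        source_dir = os.path.join(root, "source")
        target_dir = os.path.join(root, "target")
        os.makedirs(source_dir)
        os.makedirs(target_dir)

        source_entry = os.path.join(source_dir, "block.dat")
        with open(source_entry, "wb") as f:
            f.write(b"v1")

        # New name in the target directory, same inode, no data copied.
        new_entry = os.path.join(target_dir, "block.dat")
        os.link(source_entry, new_entry)

        same_inode = (os.stat(source_entry).st_ino
                      == os.stat(new_entry).st_ino)
        link_count = os.stat(new_entry).st_nlink

        # A write through one name is visible through the other.
        with open(new_entry, "ab") as f:
            f.write(b"+v2")
        with open(source_entry, "rb") as f:
            seen_via_source = f.read()

        return same_inode, link_count, seen_via_source
```

Both names resolve to the same inode, which is why creating the new directory entry leaves the source directory entry untouched.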
S103, acquiring second check information according to the data corresponding to the new directory entry in the target directory; and checking whether the data corresponding to the new directory entry is consistent with the data to be migrated or not according to the first check information and the second check information.
In this embodiment, after the hard link has been created, if the hard link is correct, the data corresponding to the new directory entry should be the data to be migrated. The second check information can then be obtained from the data corresponding to the new directory entry in the same way as in S101. Similarly, the second check information may include a storage structure and/or a check code of the data after migration, where the check code is the inode number corresponding to the new directory entry, or a check code obtained from the data corresponding to the new directory entry through a check algorithm; the details are not repeated here.
In this embodiment, if the second check information matches the first check information, the data corresponding to the new directory entry is consistent with the data to be migrated; in other words, this step determines whether the data after migration is consistent with the data before migration. The first check information can be compared with the second check information, and if they are the same, the two pieces of data are identical. Optionally, comparing the inode numbers of the two pieces of data determines quickly and efficiently whether they are the same data, while comparing their MD5, SHA1, or SHA256 values determines more precisely whether their contents are identical.
And S104, if the data corresponding to the new directory entry is consistent with the data to be migrated, cleaning the source directory entry of the data to be migrated before migration.
In this embodiment, when it is determined that the data corresponding to the new directory entry is consistent with the data to be migrated, the source directory entry of the data to be migrated before migration is cleaned, so that the whole data migration process is completed, and the security and reliability of data migration are increased.
According to the data migration method provided by this embodiment, first check information is obtained by checking the data to be migrated; a hard link is created for the data to be migrated in a target directory to obtain a new directory entry of the data to be migrated; second check information is obtained according to the data corresponding to the new directory entry in the target directory; and whether the data corresponding to the new directory entry is consistent with the data to be migrated is checked according to the first check information and the second check information, and if so, the source directory entry of the data to be migrated before migration is cleaned. In this embodiment, data migration on a single machine is implemented by creating a hard link, without calling the move interface of the file system; the migration process is simple, efficient, safe, and reliable, and its risk is reduced.
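Steps S101 to S104 can be put together in a short sketch for a single file. This is an illustration only: the function names are hypothetical, the check information is assumed to be the pair of inode number and SHA256 digest, and removing the new directory entry on a mismatch is a choice made for the example rather than something the source specifies.

```python
import hashlib
import os


def sha256_of(path):
    """Return the SHA256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_info(path):
    """Check information used here: (inode number, SHA256 of contents)."""
    return os.stat(path).st_ino, sha256_of(path)


def migrate_file(source_entry, target_dir):
    """Migrate one file by hard link and return the new directory entry."""
    first = check_info(source_entry)                        # S101
    new_entry = os.path.join(target_dir,
                             os.path.basename(source_entry))
    os.link(source_entry, new_entry)                        # S102
    second = check_info(new_entry)                          # S103
    if first != second:
        # Assumed policy: drop the new entry, keep the source untouched.
        os.unlink(new_entry)
        raise RuntimeError("check information mismatch: %s" % source_entry)
    os.unlink(source_entry)                                 # S104
    return new_entry
```

Until the final `os.unlink`, the data remains reachable through the source directory entry, so a failure at any earlier point loses nothing.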
Another embodiment of the present application provides a data migration method, as shown in fig. 2, based on the above embodiment, the data migration method specifically includes the following steps:
S201, detecting a data migration environment, and judging whether the data migration environment meets preset conditions.
In this embodiment, the basic environment is checked first, for example, the state of the storage node holding the data to be migrated (whether the storage node can work normally) and whether the current storage service has been stopped. To prevent the data to be migrated from changing during migration, the current storage service must be stopped before data migration is performed. When the storage node state meets expectations and the current storage service has stopped, the following S202-S205 can be executed directly; when the storage node state meets expectations but the current storage service has not stopped, the state can be saved and the current storage service stopped, after which S202-S205 are executed.
S202, checking data to be migrated to obtain first check information when the data migration environment meets preset conditions;
S203, creating a hard link for the data to be migrated in a target directory to obtain a new directory entry of the data to be migrated;
S204, obtaining second check information according to the data corresponding to the new directory entry in the target directory; checking whether the data corresponding to the new directory entry is consistent with the data to be migrated according to the first check information and the second check information;
S205, if the data corresponding to the new directory entry is consistent with the data to be migrated, cleaning the source directory entry of the data to be migrated before migration.
In this embodiment, S202-S205 are the same as S101-S104 in the above embodiment, and the principle and technical effects thereof are not described herein.
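The environment detection in S201 might look like the sketch below. The preset conditions are not enumerated in the source, so everything here is an assumption: `service_is_stopped` is a hypothetical caller-supplied probe, and the same-file-system test is added because hard links cannot span file systems.

```python
import os


def check_migration_environment(source_dir, target_dir, service_is_stopped):
    """Return a list of problems; an empty list means migration may start."""
    problems = []
    if not os.path.isdir(source_dir):
        problems.append("source directory missing")
    if not os.path.isdir(target_dir):
        problems.append("target directory missing")
    # Hard links only work within one file system (assumed extra condition).
    if not problems:
        if os.stat(source_dir).st_dev != os.stat(target_dir).st_dev:
            problems.append("source and target are on different file systems")
    # The storage service must be stopped so the data cannot change.
    if not service_is_stopped():
        problems.append("storage service is still running")
    return problems
```

Returning a list of problems rather than a single boolean lets the caller report every unmet condition at once.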
On the basis of any one of the above embodiments, the method further includes:
and recording the current stage in the data migration process in real time through a state machine.
In this embodiment, as each step of S101-S104 or S201-S205 in the foregoing embodiments is performed, the current stage of the data migration process may be recorded in real time by a state machine. For example, when execution reaches S101, the current stage may be recorded as the pre-migration data verification stage; when execution reaches S102, as the hard link creation stage; when execution reaches S103, as the verification stage after the hard link is created; and when execution reaches S104, as the source directory entry cleaning stage.
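Recording the current stage so that it survives an interruption can be sketched as below. The stage names, the `StageRecorder` class, and persisting the stage to a JSON file are assumptions for illustration; the source only requires that a state machine record the current stage in real time.

```python
import json
import os
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names mapped to the steps of the method."""
    NOT_STARTED = "not_started"
    PRE_CHECK = "pre_check"      # S101: pre-migration data verification
    LINKING = "linking"          # S102: creating the hard link
    POST_CHECK = "post_check"    # S103: verification after linking
    CLEANING = "cleaning"        # S104: cleaning the source directory entry
    DONE = "done"                # S104 has completed


class StageRecorder:
    """Persist the current stage to a file so it survives a crash."""

    def __init__(self, state_path):
        self.state_path = state_path

    def record(self, stage):
        tmp_path = self.state_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"stage": stage.value}, f)
            f.flush()
            os.fsync(f.fileno())
        # Replace in one step so readers never see a half-written file.
        os.replace(tmp_path, self.state_path)

    def current(self):
        try:
            with open(self.state_path) as f:
                return Stage(json.load(f)["stage"])
        except FileNotFoundError:
            return Stage.NOT_STARTED
```

A migration routine would call `record` immediately before entering each step, so that the stage read back after an interruption identifies the step that was in progress.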
Based on the above embodiment, the present embodiment further provides a scheme for reentering after an interruption (or migration failure) occurs in the data migration process, which specifically includes the following steps:
when the data migration process is interrupted and a reentry instruction is received, reentry is carried out on the data migration process according to the current stage of the interruption moment in the state machine.
In this embodiment, when the data migration process is interrupted, the user may control the data migration process to reenter, and after receiving the reentry instruction of the user, the current stage of the interruption time may be obtained from the state machine, and then different reentry modes may be adopted according to the different current stages of the interruption time.
Optionally, if the current stage is before the hard link is created, the whole data migration process is re-executed from the beginning. This covers the data verification stage before the hard link is created and/or the data migration environment detection stage. Since no new directory entry was created before the interruption, the whole data migration process can be executed directly from the beginning upon reentry, that is, S101-S104 or S201-S205 are re-executed.
Optionally, if the current stage is the process of creating the hard link or the verification process after the hard link is created, the new directory entry of the data to be migrated is deleted, and the whole data migration process is then re-executed. Because the data to be migrated may have changed by the time of reentry, and may therefore differ from the data to be migrated before the interruption, the new directory entry of the data to be migrated must be deleted and the whole data migration process (S101-S104 or S201-S205) re-executed, so as to ensure that the data before and after migration are the same.
Optionally, if the current stage is the process of cleaning the source directory entry, that process continues to be executed. Since the hard link was created before the interruption, the new directory entry already links to the data location, so only the cleaning step needs to be continued.
Optionally, if the process of cleaning the source directory entry has been completed, prompt information indicating successful migration is returned. When the cleaning process was completed before the interruption, that is, S101-S104 or S201-S205 have all been executed, reentry does not need to execute any of those steps and can directly return the prompt information indicating successful migration.
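The four reentry cases (before linking, during linking or verification, during cleaning, and after cleaning has completed) can be sketched as a dispatch on the recorded stage. This is a simplified illustration under stated assumptions: `Stage`, `migrate` and `reenter` are hypothetical names, and verification is reduced to an inode number comparison.

```python
import os
from enum import Enum


class Stage(Enum):
    PRE_CHECK = 1    # before the hard link is created
    LINKING = 2      # creating the hard link
    POST_CHECK = 3   # verification after the hard link is created
    CLEANING = 4     # cleaning the source directory entry
    DONE = 5         # cleaning completed


def migrate(src, dst):
    """Simplified S101-S104: check, link, verify by inode number, clean."""
    inode = os.stat(src).st_ino             # first check information
    os.link(src, dst)                       # new directory entry
    if os.stat(dst).st_ino != inode:        # second check information
        raise RuntimeError("verification failed")
    os.unlink(src)                          # clean the source directory entry


def reenter(stage, src, dst):
    """Resume an interrupted migration according to the recorded stage."""
    if stage is Stage.PRE_CHECK:
        migrate(src, dst)                   # nothing was created: start over
    elif stage in (Stage.LINKING, Stage.POST_CHECK):
        if os.path.lexists(dst):
            os.unlink(dst)                  # drop the possibly stale new entry
        migrate(src, dst)
    elif stage is Stage.CLEANING:
        if os.path.lexists(src):
            os.unlink(src)                  # just finish the cleaning step
    # Stage.DONE: nothing left to do.
    return "migration succeeded"
```

For example, `reenter(Stage.POST_CHECK, src, dst)` removes the existing link at `dst` and repeats the whole flow, whereas `reenter(Stage.DONE, src, dst)` touches no files.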
On the basis of any one of the above embodiments, the present embodiment further provides a rollback scheme, which specifically includes:
and when a rollback instruction is received, rolling back the data migration process according to the current stage in the state machine.
In this embodiment, if rollback is required during the data migration process or after the data migration is completed, the current stage may be obtained from the state machine, and a different rollback mode is adopted depending on that stage.
Optionally, if the current stage is before the hard link is created, prompt information indicating that the rollback is complete is returned. This covers the data verification stage before the hard link is created and/or the data migration environment detection stage. Since no new directory entry has been created at this point, only the prompt information needs to be returned on rollback.
Optionally, if the current stage is the process of creating the hard link or the verification process after the hard link is created, the new directory entry of the data to be migrated is deleted. Since a new directory entry has been created at this point but the source directory entry before migration has not been deleted, rollback only needs to delete the new directory entry of the data to be migrated.
Optionally, if the current stage is the process of cleaning the source directory entry, or that process has been completed, the source directory entry is re-created by creating a hard link in the source directory in which the data to be migrated resided before migration, and the new directory entry of the data to be migrated is deleted after the verification process. Because the new directory entry has been created and the source directory entry before migration has been deleted at this point, the migrated data can be migrated back: with reference to S101-S104 or S201-S205, the source directory entry of the data is re-created in the source directory by creating a hard link, and the new directory entry of the data to be migrated in the target directory is deleted after the verification process.
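The three rollback cases can be sketched in the same style. Again this is only an illustration: `Stage` and `rollback` are hypothetical names, and the verification step is reduced to comparing inode numbers.

```python
import os
from enum import Enum


class Stage(Enum):
    PRE_CHECK = 1    # before the hard link is created
    LINKING = 2      # creating the hard link
    POST_CHECK = 3   # verification after the hard link is created
    CLEANING = 4     # cleaning the source directory entry
    DONE = 5         # cleaning completed


def rollback(stage, src, dst):
    """Undo a migration from src to dst according to the recorded stage."""
    if stage is Stage.PRE_CHECK:
        pass                                # no new directory entry exists yet
    elif stage in (Stage.LINKING, Stage.POST_CHECK):
        if os.path.lexists(dst):
            os.unlink(dst)                  # the source entry is still intact
    else:                                   # CLEANING or DONE
        if not os.path.lexists(src):
            os.link(dst, src)               # re-create the source entry
        if os.stat(src).st_ino != os.stat(dst).st_ino:
            raise RuntimeError("verification failed")
        os.unlink(dst)                      # delete the new entry after verification
    return "rollback completed"
```

In the last branch the source entry is re-created only if it is missing, because an interruption during cleaning may have occurred before the source entry was actually removed.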
On the basis of any one of the foregoing embodiments, the data to be migrated includes a plurality of data, as shown in fig. 3, and the method further includes:
S301, performing a data migration process on a plurality of data in an asynchronous mode, and acquiring the state of each data migration process by a scheduling device in a polling mode;
S302, controlling any data migration process according to the state of the data migration process.
In this embodiment, when there are multiple data migration tasks, a central coordinator (the scheduling device) may start the migration task for each piece of data in turn, so that the data migration processes run asynchronously. The state of each migration process is recorded in real time, and the scheduling device obtains the state of each data migration process by polling. The state of a migration process may be one of: migration in progress, migration succeeded, or migration failed. Any data migration process can then be controlled according to its state. For example, if a data migration process remains in the in-progress state for a long time (longer than a preset threshold), reentry may be performed after suspension or interruption. This improves the resistance of the migration process to network jitter, avoids synchronous blocking, and facilitates efficient management and scheduling of multiple migration tasks.
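The asynchronous execution and polling described in this embodiment can be sketched with a thread pool standing in for the migration nodes. The function `run_and_poll`, the state strings, and the default values are assumptions made for illustration; a real scheduling device would poll remote nodes rather than local futures.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_and_poll(tasks, poll_interval=0.01, threshold=1.0):
    """Start every task asynchronously and poll until each one is classified.

    tasks maps a task name to a zero-argument callable. A task still running
    after `threshold` seconds is reported as "stalled" so that the caller can
    interrupt it and reenter.
    """
    states = {name: "migrating" for name in tasks}
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        started = time.monotonic()
        pending = set(futures)
        while pending:
            for name in list(pending):
                future = futures[name]
                if future.done():
                    failed = future.exception() is not None
                    states[name] = "failed" if failed else "succeeded"
                    pending.discard(name)
                elif time.monotonic() - started > threshold:
                    states[name] = "stalled"
                    pending.discard(name)
            if pending:
                time.sleep(poll_interval)
    return states
```

Note that leaving the `with` block still waits for stalled worker threads to finish; this sketch only classifies tasks and does not cancel them.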
The data migration method provided by any of the above embodiments can implement data migration on a single machine without calling a move interface of the data system during the migration. The data migration process is simple, efficient, safe and reliable, which reduces the risk of the data migration process. In addition, the data migration process supports reentry, rollback and asynchronous execution, is suitable for migrating a large number of files, and greatly simplifies the design of data migration schemes for cluster nodes, since problems such as service stability and RPC call failures between the scheduling node and the migration nodes need not be frequently considered during migration.
An embodiment of the present application provides a data migration device, and fig. 4 is a structural diagram of the data migration device provided by the embodiment of the present application. As shown in fig. 4, the data migration apparatus 400 specifically includes: a verification module 401, a migration module 402 and a cleaning module 403.
The verification module is used for verifying the data to be migrated to obtain first verification information;
the migration module is used for creating a hard link for the data to be migrated in the target directory to obtain a new directory entry of the data to be migrated;
the verification module is also used for obtaining second verification information according to the data corresponding to the new directory entry in the target directory; checking whether the data corresponding to the new directory entry is consistent with the data to be migrated or not according to the first verification information and the second verification information;
and the cleaning module is used for cleaning the source directory entry of the data to be migrated before migration if the data corresponding to the new directory entry is consistent with the data to be migrated.
In one possible design, the first check information includes a storage structure and/or a check code before migration of the data to be migrated, where the check code is an inode number corresponding to a source directory entry before migration or a check code obtained by a check algorithm from the data to be migrated;
The second check information comprises a storage structure and/or check codes after the data to be migrated are migrated, wherein the check codes are inode numbers corresponding to new directory entries or check codes obtained from the data corresponding to the new directory entries through a check algorithm;
when verifying, according to the first check information and the second check information, whether the data corresponding to the new directory entry is consistent with the data to be migrated, the verification module is configured for:
and comparing whether the first check information is identical with the second check information.
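The comparison of first and second check information can be sketched as follows. The text allows the check code to be either an inode number or a code produced by a check algorithm; this sketch collects both, and the choice of SHA-256 as well as the names `check_info` and `is_consistent` are assumptions made for illustration.

```python
import hashlib
import os


def check_info(path):
    """Return (inode number, SHA-256 hex digest) for the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return os.stat(path).st_ino, digest.hexdigest()


def is_consistent(first, second):
    """The data is treated as consistent only if both items match exactly."""
    return first == second
```

A hard link shares its inode with the source entry, so `check_info` on the new directory entry returns the same tuple as on the source. An ordinary copy on the same file system has identical content but a different inode number, so it would not pass this check.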
In one possible design, as shown in fig. 5, the data migration apparatus 400 further includes a detection module 404 configured to:
before checking the data to be migrated, detecting a data migration environment;
and judging whether the data migration environment meets preset conditions or not.
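This passage does not enumerate the preset conditions. As a hedged illustration, the sketch below checks two conditions that creating a hard link requires in practice: the source and the target directory must be on the same file system, and the target directory must be writable. The function name `environment_ok` and the choice of conditions are assumptions, not taken from the source.

```python
import os


def environment_ok(src, target_dir):
    """Return True if a hard link to src can plausibly be created in target_dir."""
    if not os.path.isfile(src) or not os.path.isdir(target_dir):
        return False
    # Hard links cannot cross file systems, so the device IDs must match.
    same_fs = os.stat(src).st_dev == os.stat(target_dir).st_dev
    # Creating a directory entry needs write and search permission.
    writable = os.access(target_dir, os.W_OK | os.X_OK)
    return same_fs and writable
```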
In one possible design, as shown in fig. 5, the data migration apparatus 400 further includes a status recording module 405 configured to:
and recording the current stage in the data migration process in real time through a state machine.
In one possible design, the data migration apparatus further includes a reentry control module 406 configured to:
when the data migration process is interrupted and a reentry instruction is received, reentry is carried out on the data migration process according to the current stage of the interruption moment in the state machine.
In one possible design, when reentering the data migration process according to the current stage of the state machine at the moment of interruption, the reentry control module 406 is configured for:
if the current stage is before the hard link is created, re-executing the whole data migration process from the beginning;
if the current stage is in the process of creating the hard link or in the verification process after the hard link is created, deleting the new directory entry of the data to be migrated, and then re-executing the whole data migration process;
if the current stage is in the process of cleaning the source directory entry, continuing to execute the process of cleaning the source directory entry;
and if the process of cleaning the source directory entry has been completed, returning prompt information of successful migration.
In one possible design, as shown in fig. 5, the data migration apparatus 400 further includes a rollback control module 407 configured to:
and when a rollback instruction is received, rolling back the data migration process according to the current stage in the state machine.
In one possible design, the rollback control module 407 is configured to, when rolling back a data migration process according to a current stage in the state machine:
if the current stage is before the hard link is created, returning a prompt message of completion of rollback;
If the current stage is in the process of creating hard links or checking after creating hard links, deleting the new directory entry of the data to be migrated;
if the current stage is in the process of cleaning the source directory entry or the process of cleaning the source directory entry is completed, the source directory entry is re-created in the source directory before the migration of the data to be migrated through creating a hard link, and the new directory entry of the data to be migrated is deleted after the verification process.
In one possible design, the data to be migrated includes a plurality of data, as shown in fig. 5, and the data migration apparatus 400 further includes a scheduling module 408 configured to:
carrying out a data migration process on a plurality of data in an asynchronous mode, and acquiring the state of each data migration process by a scheduling device in a polling mode;
and controlling any data migration process according to the state of the data migration process.
The data migration apparatus provided in this embodiment may be specifically used to execute the method embodiments provided in fig. 1 to 3; its specific functions are not repeated here.
The data migration device provided by this embodiment obtains first check information by checking the data to be migrated; creates a hard link for the data to be migrated in a target directory to obtain a new directory entry of the data to be migrated; acquires second check information according to the data corresponding to the new directory entry in the target directory; checks, according to the first check information and the second check information, whether the data corresponding to the new directory entry is consistent with the data to be migrated; and, if so, cleans the source directory entry of the data to be migrated before migration. In this embodiment, data migration on a single machine is implemented by creating a hard link, without calling a move interface of the data system during the migration. The data migration process is simple, efficient, safe and reliable, which reduces the risk of the data migration process. In addition, the data migration process supports reentry, rollback and asynchronous execution, is suitable for migrating a large number of files, and greatly simplifies the design of data migration schemes for cluster nodes, since problems such as service stability and RPC call failures between the scheduling node and the migration nodes need not be frequently considered during migration.
According to an embodiment of the present application, the present application also provides an electronic device and a readable storage medium.
Fig. 6 is a block diagram of an electronic device for the data migration method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile apparatuses, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the applications described and/or claimed herein.
As shown in fig. 6, the electronic device includes: one or more processors 601, memory 602, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 601 is taken as an example in fig. 6.
The memory 602 is a non-transitory computer readable storage medium provided by the present application. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the data migration method provided by the present application. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to execute the data migration method provided by the present application.
The memory 602, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the data migration method in the embodiment of the present application (e.g., the verification module 401, the migration module 402, and the cleaning module 403 shown in fig. 4, and further the detection module 404, the state recording module 405, the reentry control module 406, the rollback control module 407, and the scheduling module 408 shown in fig. 5). The processor 601 executes various functional applications of the server and data processing, i.e., implements the data migration method in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 602.
The memory 602 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and at least one application program required for a function; the storage data area may store data created through use of the electronic device for data migration, and the like. In addition, the memory 602 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 602 may optionally include memory located remotely from processor 601, which may be connected to the electronic device for data migration via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the data migration method may further include: an input device 603 and an output device 604. The processor 601, memory 602, input device 603 and output device 604 may be connected by a bus or otherwise; connection by a bus is taken as an example in fig. 6.
The input device 603 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for data migration; examples of such input devices include a touch screen, keypad, mouse, trackpad, touchpad, pointer stick, one or more mouse buttons, trackball, and joystick. The output device 604 may include a display device, auxiliary lighting means (e.g., LEDs), tactile feedback means (e.g., vibration motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme of the embodiment of the application, first check information is obtained by checking the data to be migrated; a hard link is created for the data to be migrated in a target directory to obtain a new directory entry of the data to be migrated; second check information is acquired according to the data corresponding to the new directory entry in the target directory; whether the data corresponding to the new directory entry is consistent with the data to be migrated is checked according to the first check information and the second check information; and, if so, the source directory entry of the data to be migrated before migration is cleaned. In this embodiment, data migration on a single machine is implemented by creating a hard link, without calling a move interface of the data system during the migration. The data migration process is simple, efficient, safe and reliable, which reduces the risk of the data migration process. In addition, the data migration process supports reentry, rollback and asynchronous execution, is suitable for migrating a large number of files, and greatly simplifies the design of data migration schemes for cluster nodes, since problems such as service stability and RPC call failures between the scheduling node and the migration nodes need not be frequently considered during migration.
The application also provides a computer program comprising program code for performing the steps of the above embodiments when the computer program is run.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed embodiments are achieved, which is not limited herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (17)

1. A method of data migration, comprising:
checking the data to be migrated to obtain first check information;
creating a hard link for the data to be migrated in a target directory to obtain a new directory entry of the data to be migrated;
acquiring second check information according to the data corresponding to the new directory entry in the target directory; checking whether the data corresponding to the new directory entry is consistent with the data to be migrated or not according to the first check information and the second check information;
if the data corresponding to the new directory entry is consistent with the data to be migrated, cleaning a source directory entry of the data to be migrated before migration;
recording the current stage in the data migration process in real time through a state machine;
when the data migration process is interrupted and a reentry instruction is received, reentry is carried out on the data migration process according to the current stage of the interruption moment in the state machine.
2. The method according to claim 1, wherein the first check information includes a storage structure and/or a check code before migration of the data to be migrated, where the check code is an inode number corresponding to a source directory entry before migration or a check code obtained by a check algorithm for the data to be migrated;
the second check information comprises a storage structure and/or check codes after the data to be migrated are migrated, wherein the check codes are inode numbers corresponding to new directory entries or check codes obtained from the data corresponding to the new directory entries through a check algorithm;
The checking whether the data corresponding to the new directory entry is consistent with the data to be migrated according to the first checking information and the second checking information comprises the following steps:
and comparing whether the first check information is identical with the second check information.
3. The method of claim 1, wherein before verifying the data to be migrated, further comprising:
detecting a data migration environment;
and judging whether the data migration environment meets preset conditions or not.
4. The method according to claim 1, wherein reentering the data migration process according to the current stage of the state machine at the moment of interruption comprises:
if the current stage is before the hard link is created, re-executing the whole data migration process from the beginning;
if the current stage is in the process of creating the hard link or in the verification process after the hard link is created, deleting the new directory entry of the data to be migrated, and then re-executing the whole data migration process;
if the current stage is in the process of cleaning the source directory entry, continuing to execute the process of cleaning the source directory entry;
and if the process of cleaning the source directory entry has been completed, returning prompt information of successful migration.
5. A method according to any one of claims 1-3, further comprising:
and when a rollback instruction is received, rolling back the data migration process according to the current stage in the state machine.
6. The method of claim 5, wherein rolling back the data migration process according to the current stage in the state machine comprises:
if the current stage is before the hard link is created, returning a prompt message of completion of rollback;
if the current stage is in the process of creating hard links or checking after creating hard links, deleting the new directory entry of the data to be migrated;
if the current stage is in the process of cleaning the source directory entry or the process of cleaning the source directory entry is completed, the source directory entry is re-created in the source directory before the migration of the data to be migrated through creating a hard link, and the new directory entry of the data to be migrated is deleted after the verification process.
7. The method of claim 1, wherein the data to be migrated includes a plurality of data, the method further comprising:
carrying out a data migration process on a plurality of data in an asynchronous mode, and acquiring the state of each data migration process by a scheduling device in a polling mode;
and controlling any data migration process according to the state of the data migration process.
8. A data migration apparatus, comprising:
the verification module is used for verifying the data to be migrated to obtain first verification information;
the migration module is used for creating a hard link for the data to be migrated in the target directory to obtain a new directory entry of the data to be migrated;
the verification module is also used for obtaining second verification information according to the data corresponding to the new directory entry in the target directory; checking whether the data corresponding to the new directory entry is consistent with the data to be migrated or not according to the first verification information and the second verification information;
the cleaning module is used for cleaning a source directory entry of the data to be migrated before migration if the data corresponding to the new directory entry is consistent with the data to be migrated;
the state recording module is used for recording the current stage in the data migration process in real time through a state machine;
and the reentry control module is used for reentering the data migration process according to the current stage of the interruption moment in the state machine when the data migration process is interrupted and the reentry instruction is received.
9. The apparatus of claim 8, wherein the first check information includes a storage structure and/or a check code before migration of the data to be migrated, where the check code is an inode number corresponding to a source directory entry before migration or a check code obtained by a check algorithm for the data to be migrated;
the second check information comprises a storage structure and/or check codes after the data to be migrated are migrated, wherein the check codes are inode numbers corresponding to new directory entries or check codes obtained from the data corresponding to the new directory entries through a check algorithm;
when verifying, according to the first check information and the second check information, whether the data corresponding to the new directory entry is consistent with the data to be migrated, the verification module is configured for:
and comparing whether the first check information is identical with the second check information.
10. The apparatus of claim 8, further comprising a detection module to:
before checking the data to be migrated, detecting a data migration environment;
and judging whether the data migration environment meets preset conditions or not.
11. The apparatus of claim 8, wherein, when reentering the data migration process according to the current stage of the state machine at the moment of interruption, the reentry control module is configured for:
if the current stage is before the hard link is created, re-executing the whole data migration process from the beginning;
if the current stage is in the process of creating the hard link or in the verification process after the hard link is created, deleting the new directory entry of the data to be migrated, and then re-executing the whole data migration process;
if the current stage is in the process of cleaning the source directory entry, continuing to execute the process of cleaning the source directory entry;
and if the process of cleaning the source directory entry has been completed, returning prompt information of successful migration.
12. The apparatus of any of claims 8-10, further comprising a rollback control module configured to:
roll back the data migration process according to the current stage in the state machine when a rollback instruction is received.
13. The apparatus of claim 12, wherein the rollback control module is configured to, when rolling back the data migration process according to the current stage in the state machine:
if the current stage is before the hard link is created, return prompt information indicating that the rollback is complete;
if the current stage is creating the hard link or verifying after the hard link is created, delete the new directory entry of the data to be migrated;
if the current stage is cleaning the source directory entry or cleaning of the source directory entry has completed, re-create the source directory entry, by creating a hard link, in the source directory where the data to be migrated resided before migration, and, after verification, delete the new directory entry of the data to be migrated.
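One possible reading of the rollback rules of claim 13 is sketched below. The stage strings, the function name rollback, and the use of an inode comparison as the verification step are assumptions made for this example only.

```python
import os
import tempfile


def rollback(stage, src, dst):
    """Undo a migration of `src` to `dst` that stopped at `stage`."""
    if stage == "before_link":
        return "rollback complete"  # nothing has been changed yet
    if stage in ("linking", "verifying"):
        if os.path.lexists(dst):
            os.unlink(dst)  # drop the new directory entry
        return "new entry deleted"
    if stage in ("cleaning", "cleaned"):
        if not os.path.lexists(src):
            os.link(dst, src)  # re-create the source entry by hard link
        # Assumed verification: both entries must refer to the same inode.
        if os.stat(src).st_ino != os.stat(dst).st_ino:
            raise RuntimeError("restored entry does not match; new entry kept")
        os.unlink(dst)
        return "source entry restored"
    raise ValueError(f"unknown stage: {stage!r}")


def demo():
    """Simulate a finished migration and roll it back."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "source.dat")
        dst = os.path.join(root, "target.dat")
        with open(dst, "wb") as f:  # only the new entry exists after cleaning
            f.write(b"payload")
        message = rollback("cleaned", src, dst)
        with open(src, "rb") as f:
            restored = f.read()
        return message, restored, os.path.exists(dst)
```

The new directory entry is deleted only after the restored source entry has been verified, so a failure in the middle of the rollback leaves at least one entry pointing at the data.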
14. The apparatus of claim 8, wherein the data to be migrated comprises a plurality of data, the apparatus further comprising a scheduling module configured to:
perform the data migration process on the plurality of data asynchronously, with a scheduling device recording the state of each data migration process;
and control any data migration process according to the state of that data migration process.
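The asynchronous scheduling of claim 14 could be sketched with Python's asyncio as follows. The Scheduler class, its state names ("running", "interrupted", "done"), and the simulated failure flag are hypothetical; the actual link, verify, and clean steps are replaced by a placeholder await.

```python
import asyncio


class Scheduler:
    """Runs several migrations concurrently and records each one's state."""

    def __init__(self):
        self.states = {}

    async def _migrate(self, name, fail=False):
        self.states[name] = "running"
        await asyncio.sleep(0)  # placeholder for link, verify, and clean steps
        # `fail` simulates an interruption; a real task would catch errors here.
        self.states[name] = "interrupted" if fail else "done"

    async def run(self, names, failing=()):
        """Start all migrations asynchronously and wait for them to settle."""
        await asyncio.gather(*(self._migrate(n, n in failing) for n in names))
        return dict(self.states)

    def needs_reentry(self):
        """Select the migrations to control next, based on recorded state."""
        return sorted(n for n, s in self.states.items() if s == "interrupted")
```

A caller might run `asyncio.run(scheduler.run(["a", "b"]))` and then issue reentry or rollback instructions only for the names returned by `needs_reentry()`.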
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-7.
17. A method of data migration, comprising:
creating a hard link in a target directory for data to be migrated to obtain a new directory entry of the data to be migrated;
cleaning a source directory entry of the data to be migrated before migration;
recording the current stage in the data migration process in real time through a state machine;
and when the data migration process is interrupted and a reentry instruction is received, re-entering the data migration process according to the current stage recorded in the state machine at the moment of interruption.
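The method of claim 17 can be illustrated end to end with a small sketch. The stage names, the JSON state file, and the functions record, current_stage, and migrate are assumptions made for this example; the patent text does not specify how the state machine is stored.

```python
import json
import os
import tempfile


def record(state_file, stage):
    """Persist the current stage atomically so it survives an interruption."""
    tmp = state_file + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"stage": stage}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, state_file)


def current_stage(state_file):
    """Read back the stage that was recorded last."""
    with open(state_file) as f:
        return json.load(f)["stage"]


def migrate(src, dst_dir, state_file):
    """Move `src` into `dst_dir` by hard link, recording every stage."""
    dst = os.path.join(dst_dir, os.path.basename(src))
    record(state_file, "before_link")
    record(state_file, "linking")
    os.link(src, dst)  # new directory entry; the data itself is not copied
    record(state_file, "cleaning")
    os.unlink(src)     # clean the source directory entry
    record(state_file, "cleaned")
    return dst


def demo():
    """Migrate one file between two directories on the same file system."""
    with tempfile.TemporaryDirectory() as root:
        src_dir = os.path.join(root, "source")
        dst_dir = os.path.join(root, "target")
        os.mkdir(src_dir)
        os.mkdir(dst_dir)
        src = os.path.join(src_dir, "block.dat")
        state_file = os.path.join(root, "state.json")
        with open(src, "wb") as f:
            f.write(b"payload")
        dst = migrate(src, dst_dir, state_file)
        with open(dst, "rb") as f:
            data = f.read()
        return os.path.exists(src), data, current_stage(state_file)
```

If the process stops between two record calls, the state file still names the last stage that was entered, which is the information a reentry step would consult. Hard links require the source and target directories to be on the same file system.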
CN202010025782.XA 2020-01-10 2020-01-10 Data migration method, device, equipment and storage medium Active CN111258954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010025782.XA CN111258954B (en) 2020-01-10 2020-01-10 Data migration method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010025782.XA CN111258954B (en) 2020-01-10 2020-01-10 Data migration method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111258954A CN111258954A (en) 2020-06-09
CN111258954B CN111258954B (en) 2023-12-05

Family

ID=70948627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010025782.XA Active CN111258954B (en) 2020-01-10 2020-01-10 Data migration method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111258954B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104603774A (en) * 2012-10-11 2015-05-06 株式会社日立制作所 Migration-destination file server and file system migration method
CN105978952A (en) * 2016-04-28 2016-09-28 中国科学院计算技术研究所 Virtualization scene flow migration method based on network function and system thereof
US9563628B1 (en) * 2012-12-11 2017-02-07 EMC IP Holding Company LLC Method and system for deletion handling for incremental file migration
CN108228813A (en) * 2017-12-29 2018-06-29 北京奇虎科技有限公司 The delet method and device of replica database in distributed system
CN109901786A (en) * 2017-12-08 2019-06-18 腾讯科技(深圳)有限公司 Data migration method, system, device and computer readable storage medium
CN110532225A (en) * 2019-09-03 2019-12-03 北京百度网讯科技有限公司 Storage engines switching method, device, electronic equipment and medium
CN110597609A (en) * 2019-09-17 2019-12-20 深圳市及响科技有限公司 Cluster migration and automatic recovery method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yang Chen, Jie Wu. Link-based fine granularity flow migration in SDNs to reduce packet loss. 2017 IEEE 36th International Performance Computing and Communications Conference (IPCCC). 2018, full text. *
Liu Zhikuan. Design and Implementation of Metadata Management in a Hierarchical Storage System. China Master's Theses Full-text Database. 2009, full text. *

Also Published As

Publication number Publication date
CN111258954A (en) 2020-06-09

Similar Documents

Publication Publication Date Title
CN110806923B (en) Parallel processing method and device for block chain tasks, electronic equipment and medium
US10592237B2 (en) Efficient detection of architecture related bugs during the porting process
CN112527474B (en) Task processing method and device, equipment, readable medium and computer program product
US11212175B2 (en) Configuration management for cloud storage system and method
CN112925651A (en) Application resource deployment method, device, electronic equipment and medium
CN112540914A (en) Execution method, execution device, server and storage medium for unit test
CN111782341B (en) Method and device for managing clusters
CN111290768A (en) Updating method, device, equipment and medium for containerization application system
CN111625949A (en) Simulation engine system, simulation processing method, device and medium
US8990168B1 (en) Efficient conflict resolution among stateless processes
EP3869377A1 (en) Method and apparatus for data processing based on smart contract, device and storage medium
US10108505B2 (en) Mobile agent based memory replication
CN111782147A (en) Method and apparatus for cluster scale-up
CN111782357A (en) Label control method and device, electronic equipment and readable storage medium
CN111258954B (en) Data migration method, device, equipment and storage medium
US9588831B2 (en) Preventing recurrence of deterministic failures
US10223089B1 (en) Partial redundancy elimination with a fixed number of temporaries
EP3859529B1 (en) Backup management method and system, electronic device and medium
CN110515622B (en) Button state control method and device, electronic equipment and storage medium
CN113126928A (en) File moving method and device, electronic equipment and medium
CN112527368B (en) Cluster kernel version updating method and device, electronic equipment and storage medium
CN113312362A (en) Block chain data modification method, device, equipment and storage medium
US11941432B2 (en) Processing system, processing method, higher-level system, lower-level system, higher-level program, and lower-level program
CN111835857B (en) Method and apparatus for accessing data
CN112749042B (en) Application running method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant