CN112631994A - Data migration method and system - Google Patents

Data migration method and system

Info

Publication number
CN112631994A
Authority
CN
China
Prior art keywords
service node
target data
target
data
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011603179.1A
Other languages
Chinese (zh)
Inventor
方满
王英艺
叶陆洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Sensetime Technology Co Ltd
Original Assignee
Shenzhen Sensetime Technology Co Ltd
Application filed by Shenzhen Sensetime Technology Co Ltd filed Critical Shenzhen Sensetime Technology Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10: File systems; File servers
    • G06F 16/11: File system administration, e.g. details of archiving or snapshots
    • G06F 16/119: Details of migration of file systems
    • G06F 16/128: Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion

Abstract

Embodiments of the invention provide a data migration method and system. The method is applied to a distributed system and includes the following steps: a central service node in the distributed system determines, from the source service node and the capacity expansion service node, the destination service node to which target data is to be migrated, according to the total storage capacity of the source service node in the distributed system and the total storage capacity of the capacity expansion service node in the distributed system; if the source service node where the target data is located differs from the destination service node to which the target data is to be migrated, the central service node sends a migration request for the target data to the source service node; the source service node performs a forced snapshot on the target data to obtain snapshot data corresponding to the target data; and the source service node migrates the snapshot data corresponding to the target data to the destination service node. Compared with the prior art, only the snapshot data of the target data is migrated, which improves the efficiency of data migration.

Description

Data migration method and system
Technical Field
The embodiment of the invention relates to the technical field of data migration, in particular to a data migration method and system.
Background
A distributed system is a software system built on top of a network, typically comprising a plurality of service nodes. A distributed system has a variety of shared physical and logical resources; tasks can be dynamically distributed across the service nodes, and the dispersed physical and logical resources exchange information over the network. In some cases, a distributed system needs to migrate data on some service nodes to other service nodes, for example to rebalance storage after capacity expansion.
Common distributed systems include, for example, security monitoring systems, image analysis systems, and video analysis systems. Taking a security monitoring system as an example: it must ingest and process a large amount of face feature data every day, part of which must be stored long term. Over time, as the capacity of the original service nodes approaches saturation, new service nodes must be added to expand the distributed system, and part of the data on the original service nodes must be migrated to the new nodes.
In the prior art, data migration in a distributed system usually migrates both snapshot data and an operation log. The snapshot data is a copy of the data to be migrated as of some past point in time, and the operation log records the operations applied to that data from that point up to the present. After the snapshot data and the operation log are migrated from the source service node to the destination service node, the destination service node first loads the snapshot data into memory, restoring the data to its state at the past point in time, and then executes the write operations in the operation log, restoring the data to its current state; this completes the migration.
However, in the existing data migration method, the operation records of all data on a service node are stored in one shared operation log. When data must be migrated, the operation records corresponding to the migrated data must be split out of the node's operation log, and because the log contains a huge number of operation records, data migration is slow.
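As a rough illustration of this prior-art restore path (a minimal sketch; the record format, key names, and function name are assumptions, not taken from the patent):

```python
# Prior-art restore at the destination node: load the snapshot (state at a
# past point in time), then replay only the write operations that belong to
# the migrated data, filtered out of the node's single shared operation log.
# The record format ({"key", "op", "value"}) is an illustrative assumption.

def restore_migrated_data(snapshot, shared_oplog, migrated_keys):
    state = dict(snapshot)  # state as of the snapshot's time point
    # Splitting step: every record in the shared log must be scanned to find
    # the ones for the migrated data. With a huge log, this full scan is what
    # makes the prior-art migration slow.
    for entry in shared_oplog:
        if entry["key"] in migrated_keys and entry["op"] == "put":
            state[entry["key"]] = entry["value"]
    return state
```

The patent's approach avoids this scan entirely by forcing a fresh snapshot just before migration, so no log replay is needed at the destination.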
Disclosure of Invention
The embodiment of the invention provides a data migration method and a data migration system, which aim to solve the problem of low data migration efficiency of a distributed system in the prior art.
In a first aspect, an embodiment of the present invention provides a data migration method, which is applied to a distributed system, and the method includes:
the central service node in the distributed system determines a target service node to be migrated of target data from the source service node and the capacity expansion service node according to the total storage capacity of the source service node in the distributed system and the total storage capacity of the capacity expansion service node in the distributed system;
if the source service node where the target data is located is different from the target service node where the target data is to be migrated, the central service node sends a migration request of the target data to the source service node;
the source service node executes forced snapshot on the target data to obtain snapshot data corresponding to the target data;
and the source service node migrates the snapshot data corresponding to the target data to the target service node.
In an optional implementation manner, determining the destination service node of the target data in the source service node from among the source service node and the capacity expansion service node includes:
the central service node uses the total storage capacity of the source service node and the total storage capacity of the capacity expansion service node as weights and, through a smooth weighted polling scheduling algorithm, determines for the target data of each fragment in the source service node a destination service node from among the source service node and the capacity expansion service node.
In an optional implementation manner, the performing, by the source service node, a forced snapshot on the target data to obtain snapshot data corresponding to the target data includes:
the source service node exports the target data as a file to be migrated;
the source service node generates a metafile of the file to be migrated, wherein the metafile is used for describing the file to be migrated;
and the source service node combines the file to be migrated and the metafile into snapshot data corresponding to the target data.
In an optional implementation manner, before the source service node performs a forced snapshot on the target data to obtain snapshot data corresponding to the target data, the method further includes:
the source service node stops responding to the external access request of the target data.
In an optional implementation manner, after the source service node migrates the snapshot data corresponding to the target data to the destination service node, the method further includes:
the central service node sends a query request to the source service node, wherein the query request is used for querying the migration progress of the snapshot data;
and the source service node sends feedback information to the central service node, wherein the feedback information is used for indicating the migration progress of the target data.
In an optional implementation manner, after the central service node sends the migration request of the target data to the source service node, the method further includes:
the central service node sends an update instruction to a first database node of the distributed system, wherein the update instruction includes an identifier of the destination service node of the target data;
and the first database node updates the mapping relationship between the target data and the destination service node according to the identifier of the destination service node.
In an optional implementation manner, migrating, by the source service node, snapshot data corresponding to the target data to the destination service node, includes:
the source service node sends the snapshot data corresponding to the target data to a second database node of the distributed system;
and the second database node sends the snapshot data corresponding to the target data to the destination service node according to the address of the destination service node.
In an optional implementation manner, after the second database node sends the snapshot data corresponding to the target data to the destination service node according to the address of the destination service node, the method further includes:
the central service node sends a reloading instruction to the destination service node, wherein the reloading instruction is used for instructing the destination service node to reload the target data;
and the target service node restores the working state of the target data in the target service node by loading the snapshot data corresponding to the target data.
In a second aspect, an embodiment of the present invention provides a data migration system, where the system includes:
the central service node is used for determining a target service node to be migrated of target data from the source service node and the capacity expansion service node according to the total storage capacity of the source service node in the distributed system and the total storage capacity of the capacity expansion service node in the distributed system; if the source service node where the target data is located is different from the target service node where the target data is to be migrated, sending a migration request of the target data to the source service node;
the source service node is used for executing forced snapshot on the target data to obtain snapshot data corresponding to the target data; and migrating the snapshot data corresponding to the target data to the target service node.
In an optional implementation manner, the central service node is specifically configured to use the total storage capacity of the source service node and the total storage capacity of the capacity expansion service node as weights and, through a smooth weighted polling scheduling algorithm, determine for the target data of each fragment in the source service node a destination service node from among the source service node and the capacity expansion service node.
In an optional implementation manner, the source service node is specifically configured to export the target data to a file to be migrated; generating a metafile of the file to be migrated, wherein the metafile is used for describing the file to be migrated; and combining the file to be migrated and the metafile into snapshot data corresponding to the target data.
In an optional embodiment, the source service node is further configured to stop responding to the external access request of the target data.
In an optional implementation manner, the central service node is further configured to send an inquiry request to the source service node, where the inquiry request is used to inquire a migration progress of the snapshot data;
and the source service node is configured to send feedback information to the central service node, wherein the feedback information is used for indicating the migration progress of the target data.
In an optional embodiment, the system further comprises: a first database node;
the central service node is further configured to send an update instruction to a first database node of the distributed system, where the update instruction includes an identifier of a destination service node of the target data;
and the first database node is used for updating the mapping relation between the target data and the target service node according to the identification of the target service node.
In an optional embodiment, the system further comprises: a second database node;
the source service node is further configured to send snapshot data corresponding to the target data to a second database node of the distributed system;
and the second database node is used for sending the snapshot data corresponding to the target data to the target service node according to the address of the target service node.
In an optional embodiment, the system further comprises: a destination service node;
the central service node is further configured to send a reload instruction to the destination service node, where the reload instruction is used to instruct the destination service node to reload the target data;
and the target service node restores the working state of the target data in the target service node by loading the snapshot data corresponding to the target data.
A third aspect of the present invention provides an electronic device, including a memory, a processor, and a computer program, where the computer program is stored in the memory and the processor runs the computer program to perform the data migration method of the first aspect and its optional implementations.
A fourth aspect of the present invention provides a storage medium on which a computer program is stored, where the computer program is used to perform the data migration method of the first aspect and its optional implementations.
A fifth aspect of the invention provides a computer program product comprising computer instructions which, when executed by a processor, implement the method of the first aspect.
According to the data migration method and system provided by the embodiments of the present application, a central service node in a distributed system determines, from the source service node and the capacity expansion service node, the destination service node of target data in the source service node, according to the total storage capacity of the source service node in the distributed system and the total storage capacity of the capacity expansion service node; if the source service node where the target data is located differs from the destination service node of the target data, the central service node sends a migration request for the target data to the source service node; the source service node performs a forced snapshot on the target data to obtain snapshot data corresponding to the target data; and the source service node migrates the snapshot data corresponding to the target data to the destination service node. Compared with the prior art, the snapshot data corresponding to the target data is obtained immediately before migration, so the operation log of the target data does not need to be migrated, which improves data migration efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic view of an application scenario of a data migration method according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a data migration method according to an embodiment of the present application;
fig. 3 is a schematic flowchart of another data migration method according to an embodiment of the present application;
fig. 4 is a signaling interaction diagram of a data migration method according to an embodiment of the present application;
FIG. 5 is a system architecture diagram of a data migration system according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the prior art, the migration of data in a distributed system usually adopts a mode of migrating snapshot data and an operation log. The snapshot data is copy data of data to be migrated at a certain past time node, and the operation log is an operation record corresponding to the data to be migrated from the certain past time node to the current time node. After the snapshot data and the operation log are migrated from the source service node to the destination service node, the destination service node loads the snapshot data into a memory first, recovers the data of the node at a certain past time, then executes the write operation in the operation log, recovers the data to the node at the current time, and completes the migration of the data. However, in the existing data migration method, the operation records of the data in the service node are all stored in the same operation log. When data migration is needed, the operation records corresponding to the migrated data need to be split from the operation log of the service node, and the data migration efficiency is low due to the huge number of operation records in the operation log.
In order to solve the above problem, embodiments of the present application provide a data migration method and system, so as to address the low data migration efficiency of distributed systems in the prior art. The inventive concept of the present application is as follows: the target data is migrated by sending its latest snapshot data to the destination service node, so that the operation log of the target data does not need to be migrated during migration, which improves data migration efficiency.
The following explains an application scenario of the present application. Fig. 1 is a schematic view of an application scenario of a data migration method according to an embodiment of the present application. As shown in fig. 1, the distributed system includes a service node 101, a service node 102, a service node 103, a service node 104, and a service node 105. The service node 105 is the central service node in the distributed system and is configured to control the other service nodes. When the service node 101 is to be taken out of service, the service node 105 needs to redistribute the data in the distributed system, migrating the data on the service node 101 to the service nodes 102, 103, and 104. Alternatively, when the service node 101 is a newly added service node, the service node 105 also needs to redistribute data, migrating part of the data on the service nodes 102, 103, and 104 to the service node 101 to balance the storage load across the service nodes.
The distributed system may be a security monitoring system, an image analysis system, or a video analysis system, and the like, which is not limited in the embodiment of the present application. The service node may be a server.
It should be noted that the application scenario in the technical solution of the present application may be the scenario in fig. 1, but is not limited to this, and may also be applied to other scenarios requiring data migration.
The following takes each node in the distributed system integrated or installed with the relevant execution code as an example, and details the technical solution of the embodiment of the present application with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 2 is a schematic flow chart of a data migration method provided in an embodiment of the present application, where the embodiment relates to a specific process of how a distributed system migrates target data from a source service node to a destination service node. As shown in fig. 2, the method includes:
s201, a central service node in the distributed system determines a target service node to be migrated by target data from a source service node and a capacity expansion service node according to the total storage capacity of the source service node in the distributed system and the total storage capacity of the capacity expansion service node in the distributed system.
In the embodiment of the application, the central service node is a control node in the distributed system and is used for performing migration control on data in the distributed system; the source service node is an original service node in the distributed system and is used for storing target data in the distributed system; the expansion service node is a service node after the expansion of the distributed system and is also used for storing target data in the distributed system; the destination service node is a service node after the target data is migrated, and the destination service node may be any source service node or capacity expansion service node.
The target data may be the operation log data corresponding to the feature data of at least one fragment stored on the source service node.
It should be understood that, in the distributed system of the embodiments of the present application, the source service nodes and any subsequently added capacity expansion service nodes store only the target data (such as the operation log data corresponding to the feature data); the first database node stores the mapping relationship between the target data and the node where it resides; and the second database node stores the feature data and the snapshot data corresponding to the target data.
The embodiments of the present application do not limit how the destination service node of the target data is determined from the source service node and the capacity expansion service node; for example, the determination may be made according to the total storage capacity of the source service node and the total storage capacity of the capacity expansion service node.
For example, if the distributed system includes two source service nodes, each storing the operation logs corresponding to the feature data of five fragments, and three capacity expansion service nodes are then added when the distributed system is expanded, the central service node may reallocate destination service nodes for the operation logs of all ten fragments, assigning the operation logs corresponding to the feature data of two fragments to each of the two source service nodes and the three capacity expansion service nodes.
In some embodiments, the central service node uses the total storage capacity of the source service node and the total storage capacity of the capacity expansion service node as weights and, through a smooth weighted polling (round-robin) scheduling algorithm, determines for the target data of each fragment in the source service node a destination service node from among the source service node and the capacity expansion service node.
For example, the central service node may first determine an initial value of the effective weight for all service nodes (both the source service node and the capacity expansion service nodes). It then takes the service node with the largest effective weight after the M-th update among all service nodes (referred to as the first service node) as the destination service node for the target data of the (M+1)-th fragment to be migrated from the source service node.
Here, the effective weight after the M-th update is determined from the effective weight after the (M-1)-th update and the configured weight of each service node; the initial value of each service node's effective weight is its configured weight, and the configured weight of each service node is related to its total storage capacity. M is an integer greater than or equal to 2.
Correspondingly, in smooth weighted polling, the effective weight of each service node must be updated after each migration of target data. For example, the sum of the configured weights of all service nodes may be subtracted from the effective weight of the first service node after the M-th update, giving an intermediate value of its effective weight for the (M+1)-th update; the first service node's configured weight is then added to this intermediate value to obtain its effective weight after the (M+1)-th update. Finally, for each service node other than the first, its configured weight is added to its effective weight after the M-th update to obtain its effective weight after the (M+1)-th update.
In the embodiments of the present application, through smooth weighted polling, the destination service node where the target data should be stored after capacity expansion can be determined based on the total storage capacity of the source service node and the total storage capacity of the capacity expansion service node, so that the amount of data stored on each service node after capacity expansion is more evenly balanced.
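The selection-and-update cycle described above can be sketched as follows (a minimal illustration, assuming each node's total storage capacity is used directly as its configured weight; the node names and capacity values are invented):

```python
# Smooth weighted round-robin: assign a destination node to each fragment,
# weighted by each node's total storage capacity (used here as the
# configured weight). Sketch only; names/values are illustrative.

def assign_shards(capacities, num_shards):
    """capacities: {node_name: configured weight, e.g. total storage in GB}.
    Returns a list of destination node names, one per fragment."""
    total = sum(capacities.values())
    effective = {node: 0 for node in capacities}  # effective (current) weights
    assignment = []
    for _ in range(num_shards):
        # 1. Raise every node's effective weight by its configured weight
        #    (so in the first round each equals its configured weight).
        for node, weight in capacities.items():
            effective[node] += weight
        # 2. Pick the node with the largest effective weight...
        chosen = max(effective, key=effective.get)
        # 3. ...and lower the chosen node's weight by the sum of all
        #    configured weights, deferring its next selection.
        effective[chosen] -= total
        assignment.append(chosen)
    return assignment

# Two source nodes and three expansion nodes of equal capacity: the ten
# fragments spread evenly, two per node.
nodes = {"src-1": 100, "src-2": 100, "exp-1": 100, "exp-2": 100, "exp-3": 100}
print(assign_shards(nodes, 10))
```

With unequal capacities, nodes receive fragments in proportion to their weights while avoiding long runs on any single node, which is the "smooth" property this scheme is known for.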
S202, if the source service node where the target data is located is different from the target service node where the target data is to be migrated, the central service node sends a migration request of the target data to the source service node.
In this step, after the central service node determines the destination service node of the target data, the central service node may compare the source service node where the target data is located with the destination service node to which the target data is to be migrated, and if the source service node where the target data is located is different from the destination service node to which the target data is to be migrated, it may be determined that the target data needs to be migrated from the source service node to the destination service node. At this time, the central service node may send a migration request of the target data to the source service node.
It should be understood that if the source service node where the target data is located and the target service node where the target data is to be migrated are the same service node, it may be determined that the target data is still stored in the source service node after the data migration, and the target data does not need to be migrated.
In some embodiments, the migration request of the target data may further include an identifier of the destination service node, so that the source service node may determine the destination service node to which the target data is to be migrated after receiving the migration request of the target data.
S203, the source service node executes forced snapshot on the target data to obtain snapshot data corresponding to the target data.
In this step, after receiving the migration request of the target data, the source service node may perform a forced snapshot on the target data to obtain snapshot data corresponding to the target data.
The snapshot is a state record of the target data at a certain moment. In the embodiment of the application, the obtained snapshot data corresponding to the target data is the snapshot data in the latest state of the current target data.
In some embodiments, the source service node may export the target data to a file to be migrated first. And then, the source service node generates a metafile of the file to be migrated, wherein the metafile is used for describing the file to be migrated. And finally, the source service node combines the file to be migrated and the metafile into snapshot data corresponding to the target data.
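The export-and-combine steps above can be sketched as follows (a minimal illustration; the file names, metafile fields, and tar packaging are assumptions, not specified by the patent):

```python
# Sketch of a forced snapshot: export the target data to a file to be
# migrated, generate a metafile describing it, and combine both into one
# snapshot artifact. All names and metafile fields are illustrative.
import hashlib
import json
import os
import tarfile

def make_snapshot(target_data: bytes, shard_id: str, out_dir: str) -> str:
    os.makedirs(out_dir, exist_ok=True)
    # 1. Export the target data as the file to be migrated.
    data_path = os.path.join(out_dir, f"{shard_id}.data")
    with open(data_path, "wb") as f:
        f.write(target_data)
    # 2. Generate a metafile describing the file to be migrated.
    meta = {
        "shard_id": shard_id,
        "size": len(target_data),
        "sha256": hashlib.sha256(target_data).hexdigest(),
    }
    meta_path = os.path.join(out_dir, f"{shard_id}.meta.json")
    with open(meta_path, "w") as f:
        json.dump(meta, f)
    # 3. Combine the file and the metafile into the snapshot to migrate.
    snap_path = os.path.join(out_dir, f"{shard_id}.snapshot.tar")
    with tarfile.open(snap_path, "w") as tar:
        tar.add(data_path, arcname=os.path.basename(data_path))
        tar.add(meta_path, arcname=os.path.basename(meta_path))
    return snap_path
```

The checksum in the metafile lets the destination node verify the transferred file before loading it, though the patent does not mandate any particular verification scheme.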
In the embodiments of the present application, the state of data such as images and videos in the distributed system is composed of the target data (the operation log) and snapshots of the target data. Snapshots are usually generated periodically and reflect the state at a certain point in time; write requests received after that point change the system state, and the operations causing the change are recorded in the target data. Because migrating a snapshot is relatively simple while migrating the target data is relatively complex, in some embodiments the source service node may stop responding to external access requests for the target data before performing the forced snapshot to obtain the snapshot data corresponding to the target data.
The embodiments of the present application do not limit how responses to external access requests are stopped; for example, external access to a distributed system such as a city-level image and video analysis system can be shielded at the request access layer.
In other embodiments, after the source service node begins migrating the snapshot data corresponding to the target data to the destination service node, the central service node may further send a query request to the source service node to query the migration progress of the snapshot data. The source service node then sends feedback information to the central service node to indicate the migration progress of the target data.
For example, the central service node may query the migration execution progress of the forced snapshot on each service node by sending a query request (grpc snapshotGet) to the source service node until the target data completes migration.
S204, the source service node sends the snapshot data corresponding to the target data to the destination service node.
In this step, after the source service node obtains the snapshot data corresponding to the target data, the snapshot data corresponding to the target data may be migrated from the source service node to the destination service node.
The embodiment of the application does not limit how to migrate the snapshot data of the target data from the source service node to the target service node, and a proper migration mode can be selected according to actual conditions.
In some optional embodiments, the source service node may send snapshot data corresponding to the target data to a second database node of the distributed system. And then, the second database node sends the snapshot data corresponding to the target data to the target service node according to the address of the target service node.
The address of the destination service node may be sent to the second database node by the central service node, or may be sent to the second database node by the source service node. In addition, the embodiment of the present application also does not limit the type of the second database node, and may be, for example, a Minio database.
In some embodiments, the second database node may also generate a metafile describing snapshot information for each service node after completing the migration of the target data.
It should be understood that the first database node in the distributed system holds the mapping relationship between the target data and the node where the target data resides. Therefore, in some embodiments, when performing data migration, the central service node may further send an update indication to the first database node of the distributed system, where the update indication includes an identifier of a destination service node of the target data. And then, the first database node updates the mapping relation between the target data and the target service node according to the identification of the target service node.
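The first database node's role described above can be modeled as a shard-to-node map that the central service node's update indication mutates. The class and method names below are illustrative assumptions, not terms from the application:

```python
class ShardDirectory:
    """Sketch of the first database node: it holds the mapping between
    target data (identified by a shard id) and the node where it resides."""

    def __init__(self):
        self._location = {}

    def register(self, shard_id, node_id):
        # initial mapping: the shard lives on its source service node
        self._location[shard_id] = node_id

    def apply_update(self, shard_id, dest_node_id):
        # the update indication carries the destination node's identifier;
        # routing for this shard switches to the destination node
        self._location[shard_id] = dest_node_id

    def lookup(self, shard_id):
        return self._location[shard_id]
```

Once the update is applied, any request access layer consulting the directory resolves the target data to the destination service node.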
In addition, in some embodiments, after the migration of the target data is completed, the central service node may send a reload instruction to the destination service node, where the reload instruction is used to instruct the destination service node to cache snapshot data corresponding to the target data in the memory and establish a mapping relationship between the target data and the destination service node.
Illustratively, after receiving the reload instruction, the destination service node may move the snapshot data of the fragmented data to be migrated to a corresponding position of the destination service node, and after the snapshot data of all the fragmented data is migrated, regenerate the metafile describing the snapshot information of each service node.
In some embodiments, after the migration is completed, the central service node sends a reload indication to the destination service node, where the reload indication is used to instruct the destination service node to reload the target data. And subsequently, the target service node restores the working state of the target data in the target service node by loading the snapshot data corresponding to the target data.
It should be understood that, since the source service node stops responding to external access requests for the target data before the forced snapshot is performed, the generated snapshot data can completely reflect the state of the service node. The target data therefore has no new write operations after the time point of the snapshot, so the destination service node can restore its state directly from the snapshot when restarting, without replaying target data such as the operation log.
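The restart argument can be illustrated with a generic snapshot-plus-log restore routine (a sketch under invented names, not the application's implementation): because writes were blocked before the forced snapshot, the operation log after the snapshot point is empty, and the restored state equals the snapshot state with no replay work.

```python
def restore_state(snapshot_state, op_log):
    """Generic restore: start from the snapshot, then replay any write
    operations logged after the snapshot point."""
    state = dict(snapshot_state)
    for key, value in op_log:   # replay step; a no-op when the log is empty
        state[key] = value
    return state

# With external access stopped before the forced snapshot, op_log is empty,
# so the destination node restores directly from the snapshot:
restored = restore_state({"img_42": "indexed"}, [])
```

This is why the migration only needs to ship snapshot data and not the operation log.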
According to the data migration method provided by the embodiment of the application, a central service node in a distributed system determines, from the source service node and the capacity expansion service node, the destination service node to which target data is to be migrated, according to the total storage capacity of the source service node in the distributed system and the total storage capacity of the capacity expansion service node in the distributed system; if the source service node where the target data is located is different from the destination service node to which the target data is to be migrated, the central service node sends a migration request for the target data to the source service node; the source service node performs a forced snapshot on the target data to obtain snapshot data corresponding to the target data; and the source service node migrates the snapshot data corresponding to the target data to the destination service node. Compared with the prior art, because the snapshot data corresponding to the target data is obtained before migration, the operation log of the target data does not need to be migrated during migration, which improves data migration efficiency.
On the basis of the above embodiments, in order to better balance the amount of data stored by each service node, the target data to be migrated and the destination service node of the target data need to be determined before data migration is performed. The embodiment of the application can determine the destination service node of the target data to be migrated in the source service node by means of smooth weighted polling. Fig. 3 is a schematic flow diagram of another data migration method provided in an embodiment of the present application; the execution subject of this embodiment is the central service node. As shown in fig. 3, the data migration method includes:
S301, determine the initial values of the effective weights of all service nodes.
S302, among all service nodes, the first service node with the largest effective weight after the Mth update is used as the destination service node of the (M+1)th fragment of target data to be migrated in the source service node.
The first service node may be a source service node or a capacity expansion service node, which is not limited in this embodiment of the application.
In this embodiment of the present application, the Mth updated effective weight is determined according to the (M-1)th updated effective weight and the configuration weight of each service node; the initial value of each service node's effective weight is its configuration weight, and the configuration weight of each service node is related to its total storage capacity. M is an integer greater than or equal to 2.
S303, subtract the sum of the configuration weights of all service nodes from the Mth updated effective weight of the first service node to obtain the intermediate value of the first service node's effective weight at the (M+1)th update.
S304, add the configuration weight of the first service node to this intermediate value to obtain the (M+1)th updated effective weight of the first service node.
S305, add the configuration weights of the service nodes other than the first service node to their Mth updated effective weights to obtain their (M+1)th updated effective weights.
In the embodiment of the present application, steps S301-S305 are processes of selecting a destination service node and updating effective weights in the smooth weighted polling. Smooth weighted polling is illustrated below by way of example.
Illustratively, suppose there are n service nodes S = {S1, S2, …, Sn}. Each service node i has, in addition to a configuration weight Wi, a current effective weight CWi, and CWi is initialized to Wi. The configuration weights corresponding to the service nodes are W = {W1, W2, …, Wn}, and the effective weights corresponding to the service nodes are CW = {CW1, CW2, …, CWn}. An indicator variable (CurrentPos) identifies the currently selected service node, and the sum of the configuration weights of all instances is WeightSum.
When the destination service node of the target data is determined for the first time, the current effective weight CWi of each service node i may be initialized to its configuration weight Wi, and the configuration weight sum WeightSum is calculated.
Then, each time a destination service node of the target data is determined, the effective weight of each service node is updated. Based on the effective weights from the previous update, the central service node may determine the service node with the largest effective weight and take that service node as the destination service node of the current target data, with CurrentPos pointing to it.
Subsequently, the central service node may subtract the configuration weight sum WeightSum of all service nodes from the current effective weight CWi of the first service node to obtain the intermediate value of the first service node's effective weight at the (M+1)th update, and CurrentPos may point to this position. The configuration weight of the first service node is then added to this intermediate value to obtain the (M+1)th updated effective weight of the first service node. The Mth updated effective weights of the service nodes other than the first service node are each increased by their configuration weights to obtain their (M+1)th updated effective weights, thereby completing the (M+1)th update of the effective weights.
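Steps S301-S305 can be sketched in a few lines. This is a minimal illustration of smooth weighted polling with invented node names and weights; ties are broken by insertion order here, which the application does not specify:

```python
def assign_shards(config_weights, num_shards):
    """Smooth weighted polling: effective weights start at the configured
    weights (S301); each round the node with the largest effective weight
    is chosen (S302), its effective weight is reduced by the sum of all
    configured weights (S303), and every node's effective weight is then
    increased by its own configured weight (S304/S305)."""
    effective = dict(config_weights)             # S301: initial effective weights
    weight_sum = sum(config_weights.values())
    destinations = []
    for _ in range(num_shards):
        chosen = max(effective, key=effective.get)  # S302: largest effective weight
        destinations.append(chosen)
        effective[chosen] -= weight_sum             # S303: intermediate value
        for node, w in config_weights.items():      # S304/S305: add configured weights
            effective[node] += w
    return destinations

# Three nodes with capacities in the ratio 5:1:1 receive 7 shards in the
# same ratio, interleaved rather than bunched together:
order = assign_shards({"src": 5, "new1": 1, "new2": 1}, 7)
```

Over 7 assignments, "src" receives 5 shards and "new1" and "new2" one shard each, and the effective weights return to their initial values, so the pattern repeats with period WeightSum.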
In the application, the migration scheme determines the destination service node of the target data based on smooth weighted polling, taking the capacity of each service node as its weight, and redistributes the fragments among the service nodes so that the data is sufficiently balanced; for newly added service nodes, this allows their storage resources and computing resources to be fully utilized.
On the basis of the above embodiment, how to inform the destination service node to reload the target data after the target data is migrated is described below. Fig. 4 is a signaling interaction diagram of a data migration method according to an embodiment of the present application, and as shown in fig. 4, the data migration method includes:
S401, the central service node in the distributed system determines a destination service node to which the target data is to be migrated from the source service node and the capacity expansion service node according to the total storage capacity of the source service node in the distributed system and the total storage capacity of the capacity expansion service node in the distributed system.
S402, if the source service node where the target data is located is different from the target service node where the target data is to be migrated, the central service node sends a migration request of the target data to the source service node.
S403, the source service node executes forced snapshot on the target data to obtain snapshot data corresponding to the target data.
S404, the source service node sends the snapshot data corresponding to the target data to a second database node of the distributed system.
S405, the second database node sends the snapshot data corresponding to the target data to the target service node according to the address of the target service node.
The technical terms, technical effects, technical features, and alternative embodiments of S401 to S405 can be understood with reference to S201 to S204 shown in fig. 2; repeated descriptions are omitted here.
S406, the central service node sends a reloading instruction to the destination service node, wherein the reloading instruction is used for instructing the destination service node to reload the target data.
S407, the target service node restores the working state of the target data at the target service node by loading the snapshot data corresponding to the target data.
In the application, after the target data is migrated, each service node needs to reload the target data into memory to finally complete the data migration and provide service normally. Therefore, after receiving the migration completion message sent by the destination service node, the central service node may instruct the destination service node to restart.
It should be understood that, since the source service node stops responding to external access requests for the target data before the forced snapshot is performed, the generated snapshot data can completely reflect the state of the service node. The target data therefore has no new write operations after the time point of the snapshot, so the destination service node can restore its state directly from the snapshot when restarting, without replaying target data such as the operation log.
According to the data migration method provided by the embodiment of the application, a central service node in a distributed system determines, from the source service node and the capacity expansion service node, the destination service node to which target data is to be migrated, according to the total storage capacity of the source service node in the distributed system and the total storage capacity of the capacity expansion service node in the distributed system; if the source service node where the target data is located is different from the destination service node to which the target data is to be migrated, the central service node sends a migration request for the target data to the source service node; the source service node performs a forced snapshot on the target data to obtain snapshot data corresponding to the target data; and the source service node migrates the snapshot data corresponding to the target data to the destination service node. Compared with the prior art, because the snapshot data corresponding to the target data is obtained before migration, the operation log of the target data does not need to be migrated during migration, which improves data migration efficiency.
S408, the central service node sends an updating instruction to a first database node of the distributed system, where the updating instruction includes the identifier of the destination service node of the target data.
S409, the first database node updates the mapping relationship between the target data and the destination service node according to the identifier of the destination service node.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Fig. 5 is a system architecture diagram of a data migration system according to an embodiment of the present application. The data migration system includes: the system comprises a central service node 501, a source service node 502, a destination service node 503, a first database node 504, a second database node 505 and a capacity expansion service node 506.
The central service node 501 is configured to determine a destination service node 503 to which target data is to be migrated from the source service node 502 and the capacity expansion service node 506 according to the total storage capacity of the source service node 502 in the distributed system and the total storage capacity of the capacity expansion service node 506 in the distributed system; if the source service node 502 where the target data is located is different from the target service node 503 where the target data is to be migrated, sending a migration request of the target data to the source service node 502;
the source service node 502 is configured to perform forced snapshot on the target data to obtain snapshot data corresponding to the target data; and migrating the snapshot data corresponding to the target data to the destination service node 503.
In an optional implementation manner, the central service node 501 is specifically configured to calculate target data of each segment in the source service node 502 by using a scheduling algorithm of smooth weighted polling with total storage capacity of the source service node 502 and total storage capacity of the capacity expansion service node 506 as weights, and determine the target service node 503 of the target data of each segment from the source service node 502 and the capacity expansion service node 506.
In an optional embodiment, the source service node 502 is specifically configured to export target data into a file to be migrated; generating a metafile of the file to be migrated, wherein the metafile is used for describing the file to be migrated; and combining the file to be migrated and the metafile into snapshot data corresponding to the target data.
In an alternative embodiment, the source service node 502 is further configured to stop responding to external access requests for the target data.
In an optional implementation manner, the central service node 501 is further configured to send a query request to the source service node 502, where the query request is used to query the migration progress of the snapshot data;
the source service node 502 is configured to send feedback information to the destination service node 503, where the feedback information is used to indicate a migration progress of the target data.
In an optional embodiment, the system further comprises: a first database node 504;
the central service node 501 is further configured to send an update instruction to the first database node 504 of the distributed system, where the update instruction includes an identifier of a destination service node 503 of the target data;
the first database node 504 is configured to update the mapping relationship between the target data and the destination service node 503 according to the identifier of the destination service node 503.
In an optional embodiment, the system further comprises: a second database node 505;
the source service node 502 is further configured to send snapshot data corresponding to the target data to a second database node 505 of the distributed system;
and the second database node 505 is configured to send the snapshot data corresponding to the target data to the destination service node 503 according to the address of the destination service node 503.
In an optional embodiment, the system further comprises: a destination service node 503;
the central service node 501 is further configured to send a reload instruction to the destination service node 503, where the reload instruction is used to instruct the destination service node 503 to reload the target data;
and the destination service node 503 restores the working state of the target data at the destination service node 503 by loading the snapshot data corresponding to the target data.
The data migration system provided in the embodiment of the present application may perform the actions of the data migration method in the foregoing method embodiments; the implementation principles and technical effects are similar and are not described again here.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 6, the electronic device may include: at least one processor 601 and a memory 602. Fig. 6 takes one processor as an example.
A memory 602 for storing programs. In particular, the program may include program code including computer operating instructions.
The memory 602 may comprise high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.
The processor 601 is configured to execute computer-executable instructions stored in the memory 602 to implement a data migration method on a source service node side, or to implement a data migration method on a destination service node side, or to implement a data migration method on a central service node side.
The processor 601 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
Alternatively, in a specific implementation, if the communication interface, the memory 602 and the processor 601 are implemented independently, the communication interface, the memory 602 and the processor 601 may be connected to each other through a bus and communicate with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. Buses may be classified as address buses, data buses, control buses, and so on, but this does not mean that there is only one bus or one type of bus.
Alternatively, in a specific implementation, if the communication interface, the memory 602 and the processor 601 are integrated into a chip, the communication interface, the memory 602 and the processor 601 may complete communication through an internal interface.
The present invention also provides a computer-readable storage medium, which may include: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and in particular, the computer-readable storage medium stores program instructions for the method in the above embodiments.
The present invention also provides a computer program product comprising computer instructions which, when executed by a processor, implement the data migration method in the above embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (12)

1. A data migration method is applied to a distributed system, and the method comprises the following steps:
the central service node in the distributed system determines a target service node to be migrated of target data from the source service node and the capacity expansion service node according to the total storage capacity of the source service node in the distributed system and the total storage capacity of the capacity expansion service node in the distributed system;
if the source service node where the target data is located is different from the target service node where the target data is to be migrated, the central service node sends a migration request of the target data to the source service node;
the source service node executes forced snapshot on the target data to obtain snapshot data corresponding to the target data;
and the source service node migrates the snapshot data corresponding to the target data to the target service node.
2. The method of claim 1, wherein the determining a destination service node of the target data in the source service node from the source service node and the capacity expansion service node comprises:
and the central service node takes the total storage capacity of the source service node and the total storage capacity of the capacity expansion service node as weights, calculates the target data of each fragment in the source service node through a smooth weighted polling scheduling algorithm, and determines the target service node of the target data of each fragment from the source service node and the capacity expansion service node.
3. The method of claim 1, wherein the performing, by the source service node, a forced snapshot on the target data to obtain snapshot data corresponding to the target data comprises:
the source service node exports the target data as a file to be migrated;
the source service node generates a metafile of the file to be migrated, wherein the metafile is used for describing the file to be migrated;
and the source service node combines the file to be migrated and the metafile into snapshot data corresponding to the target data.
4. The method according to any one of claims 1 to 3, wherein before the source service node performs the forced snapshot on the target data to obtain snapshot data corresponding to the target data, the method further comprises:
the source service node stops responding to the external access request of the target data.
5. The method according to any one of claims 1 to 3, wherein after the source service node migrates the snapshot data corresponding to the data to be migrated to the destination service node, the method further comprises:
the central service node sends a query request to the source service node, wherein the query request is used for querying the migration progress of the snapshot data;
and the source service node sends feedback information to the target service node, wherein the feedback information is used for indicating the migration progress of the target data.
6. The method of claim 1, wherein after the central service node sends the migration request of the target data to the source service node, the method further comprises:
the central service node sends an updating instruction to a first database node of the distributed system, wherein the updating instruction comprises an identifier of a target service node of the target data;
and the first database node updates the mapping relation between the target data and the target service node according to the identification of the target service node.
7. The method according to claim 6, wherein the migrating the snapshot data corresponding to the target data to the destination service node by the source service node comprises:
the source service node sends the snapshot data corresponding to the target data to a second database node of the distributed system;
and the second database node sends the snapshot data corresponding to the target data to a target service node according to the address of the target service node.
8. The method of claim 7, wherein after the second database node sends the snapshot data corresponding to the target data to the destination service node according to the address of the destination service node, the method further comprises:
the central service node sends a reloading instruction to the destination service node, wherein the reloading instruction is used for instructing the destination service node to reload the target data;
and the target service node restores the working state of the target data in the target service node by loading the snapshot data corresponding to the target data.
9. A data migration system, the system comprising:
the central service node is used for determining a target service node to be migrated of target data from the source service node and the capacity expansion service node according to the total storage capacity of the source service node in the distributed system and the total storage capacity of the capacity expansion service node in the distributed system; if the source service node where the target data is located is different from the target service node where the target data is to be migrated, sending a migration request of the target data to the source service node;
the source service node is used for executing forced snapshot on the target data to obtain snapshot data corresponding to the target data; and migrating the snapshot data corresponding to the target data to the target service node.
10. An electronic device, comprising: a memory, a processor, and a transceiver;
the processor is used for being coupled with the memory, reading and executing the instructions in the memory to realize the method of any one of claims 1 to 8;
the transceiver is coupled to the processor, and the processor controls the transceiver to transmit and receive messages.
11. A computer-readable storage medium having stored thereon computer instructions which, when executed by a computer, cause the computer to perform the method of any one of claims 1-8.
12. A computer program product comprising computer instructions, characterized in that the computer instructions, when executed by a processor, implement the method of any one of claims 1 to 8.
CN202011603179.1A 2020-12-29 2020-12-29 Data migration method and system Withdrawn CN112631994A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011603179.1A CN112631994A (en) 2020-12-29 2020-12-29 Data migration method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011603179.1A CN112631994A (en) 2020-12-29 2020-12-29 Data migration method and system

Publications (1)

Publication Number Publication Date
CN112631994A true CN112631994A (en) 2021-04-09

Family

ID=75287252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011603179.1A Withdrawn CN112631994A (en) 2020-12-29 2020-12-29 Data migration method and system

Country Status (1)

Country Link
CN (1) CN112631994A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342781A (en) * 2021-06-29 2021-09-03 深圳前海微众银行股份有限公司 Data migration method, device, equipment and storage medium
CN116431566A (en) * 2023-06-09 2023-07-14 北京随信云链科技有限公司 Data migration method, device, electronic equipment and medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120259810A1 * 2011-03-07 2012-10-11 Infinidat Ltd. Method of migrating stored data and system thereof
CN108628874A * 2017-03-17 2018-10-09 北京京东尚科信息技术有限公司 Data migration method, apparatus, electronic device and readable storage medium
CN108710686A * 2018-05-21 2018-10-26 北京五八信息技术有限公司 Data storage method, device, storage medium and terminal
CN109165210A * 2018-09-04 2019-01-08 山东浪潮云投信息科技有限公司 Method and device for cluster HBase data migration
CN110209736A * 2019-05-06 2019-09-06 深圳壹账通智能科技有限公司 Device, method and storage medium for blockchain data processing
JP2020013307A * 2018-07-18 2020-01-23 株式会社日立製作所 Method and system for file transfer
CN111324596A * 2020-03-06 2020-06-23 腾讯科技(深圳)有限公司 Data migration method and device for database cluster, and electronic device
US10754696B1 * 2017-07-20 2020-08-25 EMC IP Holding Company LLC Scale out capacity load-balancing for backup appliances
CN111752924A * 2020-06-28 2020-10-09 平安科技(深圳)有限公司 Database migration method, system and storage medium
CN111989681A * 2017-12-08 2020-11-24 雷网有限责任公司 Automatically deployed information technology (IT) system and method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MENG Lingkui; ZHANG Wen: "Dynamic Load Balancing Algorithm for Distributed Spatial Databases", Computer Engineering, no. 11 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342781A (en) * 2021-06-29 2021-09-03 深圳前海微众银行股份有限公司 Data migration method, device, equipment and storage medium
CN116431566A (en) * 2023-06-09 2023-07-14 北京随信云链科技有限公司 Data migration method, device, electronic equipment and medium
CN116431566B (en) * 2023-06-09 2023-09-22 北京随信云链科技有限公司 Data migration method, device, electronic equipment and medium

Similar Documents

Publication Publication Date Title
KR102031471B1 (en) Opportunity resource migration for resource placement optimization
CN110062924B (en) Capacity reservation for virtualized graphics processing
US9436516B2 (en) Virtual machines management apparatus, virtual machines management method, and computer readable storage medium
US11194569B2 (en) Method, electronic device and medium for upgrading a hyper-converged infrastructure node
CN110941481A (en) Resource scheduling method, device and system
CN109886693B (en) Consensus realization method, device, equipment and medium for block chain system
CN108475201B (en) Data acquisition method in virtual machine starting process and cloud computing system
CN108073423A Accelerator loading method and system, and accelerator loading device
CN112631994A (en) Data migration method and system
CN112256433A (en) Partition migration method and device based on Kafka cluster
CN112269661A (en) Partition migration method and device based on Kafka cluster
CN110377664B (en) Data synchronization method, device, server and storage medium
EP4006725A1 (en) Virtual machine migration processing and strategy generation method, apparatus and device, and storage medium
US10594620B1 (en) Bit vector analysis for resource placement in a distributed system
US20230019037A1 (en) Reactive non-blocking input and output for target device communication
US20220229689A1 (en) Virtualization platform control device, virtualization platform control method, and virtualization platform control program
CN115033337A (en) Virtual machine memory migration method, device, equipment and storage medium
WO2018188959A1 (en) Method and apparatus for managing events in a network that adopts event-driven programming framework
US11121981B1 (en) Optimistically granting permission to host computing resources
CN111327663A (en) Bastion machine distribution method and equipment
CN109542588B (en) Method and device for managing virtual equipment in cloud environment
CN117332881B (en) Distributed training method and electronic equipment
CN114489465A (en) Method for processing data by using network card, network equipment and computer system
CN115499314A (en) Cluster node IP modification method and device
CN117785458A (en) Host resource management method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 2021-04-09