CN114024956B

CN114024956B - Data migration method, device, server and storage medium

Info

Publication number: CN114024956B
Application number: CN202010693729.7A
Authority: CN
Inventors: 洪亮; 陈春斌; 陈林; 王金龙; 赵博; 胡德祺
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2020-07-17
Filing date: 2020-07-17
Publication date: 2024-03-12
Anticipated expiration: 2040-07-17
Also published as: CN114024956A

Abstract

The disclosure relates to a data migration method, a device, a server and a storage medium, applied to a control center outside a cluster. The method comprises: generating a data migration instruction in response to a data migration request, wherein the data migration request carries a source cluster identifier and a target cluster identifier; sending the data migration instruction to a target cluster corresponding to the target cluster identifier, wherein the data migration instruction is used for instructing the target cluster to acquire data to be migrated from a source cluster corresponding to the source cluster identifier; and receiving metadata information of the source cluster and the target cluster when the data migration is complete. By introducing the control center, the metadata information corresponding to the sub-clusters in the cluster is stored centrally in the control center after data migration is completed, so that the accuracy of the metadata information can be ensured. Because the control center handles client metadata requests centrally, the data migration flow can be simplified, and the labor, time and other costs consumed by data migration can be saved.

Description

Data migration method, device, server and storage medium

Technical Field

The disclosure relates to the technical field of data processing, and in particular relates to a data migration method, a data migration device, a server and a storage medium.

Background

The message queue service (Kafka) of a typical enterprise is distributed across a plurality of machine rooms, each containing a plurality of clusters. As the data volume in a cluster increases, the cluster needs to be expanded or its data migrated to relieve the pressure on the cluster.

In some scenarios, the machine room in which a cluster is located often cannot accommodate cluster expansion because of insufficient rack space, so the problem of excessive data volume in the cluster is addressed by migrating topics instead. However, in the related art, after a topic is migrated from a source cluster (the cluster that data is moved out of) to a target cluster (the cluster that data is moved into), the metadata information in the source cluster and the target cluster changes accordingly. The source cluster address configured in each client subscribing to the topic must therefore be changed individually to the target cluster address so that the client can obtain correct metadata information, which is troublesome to operate and costly in labor and time.

Disclosure of Invention

The disclosure provides a data migration method, a data migration device, a server and a storage medium, so as to at least solve the problems that in the related art, the migration operation of data in a cluster is troublesome and the manpower and time cost is high. The technical scheme of the present disclosure is as follows:

According to a first aspect of an embodiment of the present disclosure, there is provided a data migration method, applied to a control center outside a cluster, the method including:

generating a data migration instruction in response to a data migration request, wherein the data migration request carries a source cluster identifier and a target cluster identifier;

transmitting a data migration instruction to a target cluster corresponding to the target cluster identifier, wherein the data migration instruction is used for instructing the target cluster to acquire data to be migrated from a source cluster corresponding to the source cluster identifier;

and receiving metadata information of the source cluster and the target cluster when the data migration is complete.

In one embodiment, the data migration request further carries a theme identifier to be migrated; sending the data migration instruction to the target cluster corresponding to the target cluster identifier, including:

creating a target theme for storing the data to be migrated in the target cluster according to the theme identification to be migrated;

and sending a data migration instruction to the target cluster, wherein the data migration instruction is used for instructing the target cluster to acquire data to be migrated corresponding to the theme identification to be migrated from the source cluster, and storing the data to be migrated in the target theme.

In one embodiment, during the data migration process, the state of the subject to be migrated is a read-write state, and the state of the target subject is a read-only state.

In one embodiment, the method further comprises:

and when the data migration is completed, updating the state of the target subject in the target cluster to be a read-write state, and updating the state of the subject to be migrated in the source cluster to be a forbidden state.

In one embodiment, receiving metadata information for a source cluster and a target cluster when data migration is complete includes:

in the data migration process, receiving a data synchronization result of a to-be-migrated theme and a target theme, which are sent by a source cluster;

and when the message offset of the theme to be migrated is consistent with that of the target theme according to the data synchronization result, receiving metadata information of the source cluster and the target cluster.

In one embodiment, the method further comprises:

in the data migration process, when the message offset of the theme to be migrated and the target theme meets the preset requirement, updating the theme to be migrated to be in a read-write forbidden state.

In one embodiment, after receiving metadata information of the source cluster and the target cluster when the data migration is completed, the method further comprises:

periodically acquiring metadata information of the sub-clusters in the cluster;

and verifying the stored metadata information according to the acquired metadata information.

In one embodiment, after receiving metadata information of the source cluster and the target cluster when the data migration is completed, the method further comprises:

updating consumption groups corresponding to the source cluster and the target cluster;

and storing the mapping relation among the consumption group, the source cluster and the target cluster.

In one embodiment, the method further comprises:

receiving a client metadata request sent by a sub-cluster in the cluster, wherein the client metadata request carries an identifier of a topic to be accessed;

acquiring metadata information of the topic to be accessed according to the identifier of the topic to be accessed;

and returning the acquired metadata information to the sub-cluster.

According to a second aspect of the embodiments of the present disclosure, there is provided a data migration apparatus, applied to a control center outside a cluster, the apparatus including:

an instruction generation module configured to generate a data migration instruction in response to a data migration request, wherein the data migration request carries a source cluster identifier and a target cluster identifier;

the data migration module is configured to send a data migration instruction to a target cluster corresponding to the target cluster identifier, wherein the data migration instruction is used for instructing the target cluster to acquire data to be migrated from a source cluster corresponding to the source cluster identifier;

and a receiving module configured to receive metadata information of the source cluster and the target cluster when the data migration is completed.

In one embodiment, the data migration request further carries a theme identifier to be migrated; a data migration module, comprising:

a theme creation unit configured to execute creating a target theme for storing data to be migrated in the target cluster according to the theme identification to be migrated;

the data migration unit is configured to send a data migration instruction to the target cluster, and the data migration instruction is used for instructing the target cluster to acquire data to be migrated corresponding to the theme identification to be migrated from the source cluster and store the data to be migrated in the target theme.

In one embodiment, the apparatus further comprises:

the first state updating module is configured to execute the steps of updating the state of the target theme to be a read-write state and updating the state of the theme to be migrated to be a disabled state when data migration is completed.

In one embodiment, the receiving module is configured to perform:

In the data migration process, receiving a data synchronization result of a to-be-migrated theme and a target theme, which are sent by a source cluster; and when the message offset of the theme to be migrated is consistent with that of the target theme according to the data synchronization result, receiving metadata information of the source cluster and the target cluster.

In one embodiment, the apparatus further comprises:

and the second state updating module is configured to update the theme to be migrated to a read-write forbidden state when the message offset of the theme to be migrated and the target theme meets the preset requirement in the data migration process.

In one embodiment, the apparatus further comprises:

an acquisition module configured to periodically acquire metadata information of the sub-clusters in the cluster;

and the verification module is configured to perform verification on the stored metadata information according to the acquired metadata information.

In one embodiment, the apparatus further comprises:

the consumption group updating module is configured to update consumption groups corresponding to the source cluster and the target cluster;

and a storage module configured to store the mapping relation among the consumption group, the source cluster and the target cluster.

In one embodiment, the receiving module is further configured to receive a client metadata request sent by a sub-cluster in the cluster, where the client metadata request carries an identifier of a topic to be accessed;

The apparatus further comprises: the metadata query module is configured to acquire metadata information of the theme to be accessed according to the theme identification to be accessed;

the sending module is configured to return the acquired metadata information to the sub-cluster.

According to a third aspect of embodiments of the present disclosure, there is provided a computer program product comprising a computer program stored in a readable storage medium, from which at least one processor of a device reads and executes the computer program, causing the device to perform the data migration method as described in any one of the embodiments of the first aspect.

According to a fourth aspect of embodiments of the present disclosure, there is provided a server comprising:

a processor; a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the data migration method as described in any one of the embodiments of the first aspect.

According to a fifth aspect of embodiments of the present disclosure, there is provided a storage medium storing instructions which, when executed by a processor of a server, enable the server to perform the data migration method described in any one of the embodiments of the first aspect.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

The method is applied to a control center. The control center generates a data migration instruction in response to a data migration request, instructs the target cluster to acquire data to be migrated from the source cluster corresponding to the source cluster identifier, and receives metadata information of the source cluster and the target cluster when the data migration is complete. By introducing a control center that controls data migration and manages metadata information, the metadata information corresponding to the sub-clusters in the cluster is saved in the control center after data migration is completed, so that the accuracy of the metadata information can be ensured. When a client sends a client metadata request to the proxy server of a sub-cluster, the request forwarded by the proxy server is handled centrally by the control center, and the cluster address does not need to be changed individually in each client, so that the data migration flow can be simplified and the time, labor and other costs consumed by data migration are saved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.

FIG. 1 is an application environment diagram illustrating a data migration method according to an example embodiment.

FIG. 2 is a flow chart illustrating a method of data migration according to an exemplary embodiment.

FIG. 3 is a flowchart illustrating a data migration step according to an exemplary embodiment.

FIG. 4 is a flowchart illustrating a step of receiving metadata information according to an exemplary embodiment.

FIG. 5 is a flowchart illustrating the steps of processing a metadata request for a client according to an exemplary embodiment.

FIG. 6 is an application environment diagram illustrating a data migration method according to an example embodiment.

FIG. 7 is a flowchart illustrating a method of data migration, according to an example embodiment.

FIG. 8 is a block diagram illustrating a data migration apparatus according to an example embodiment.

FIG. 9 is an internal structural diagram of a server shown according to an exemplary embodiment.

Detailed Description

In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.

The data migration method provided by the disclosure can be applied to an application environment as shown in FIG. 1, in which the control center 110 interacts with the cluster through a network. The cluster includes a plurality of sub-clusters. The source cluster 120 and the target cluster 130 are any two sub-clusters of the cluster between which data is migrated. The control center 110 may be implemented by a stand-alone server or by a server cluster composed of a plurality of servers. At least one component service may be deployed in the control center 110, through which data migration of the cluster is controlled, metadata information of the cluster is managed, and so on. When the control center 110 obtains a data migration request triggered by a user, it generates a data migration instruction in response to the data migration request. The control center 110 sends the data migration instruction to the target cluster 130 corresponding to the target cluster identifier, instructing the target cluster 130 to acquire data to be migrated from the source cluster 120 corresponding to the source cluster identifier. When the data migration is completed, the source cluster 120 and the target cluster 130 send metadata update requests to the control center 110, where the metadata update requests carry the metadata information after the data migration. The control center 110 receives the metadata information of the source cluster 120 and the target cluster 130.

Implementations of the present disclosure may be applied to data migration between Kafka clusters. The following briefly introduces some Kafka concepts. Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all of the activity stream data of a consumer-scale website.

The kafka core comprises:

producer (message producer): the generated message will be sent to a certain topic;

consumer (message consumer): a client that reads messages from a Kafka broker. The consumer obtains data from the broker and processes it. The content of the messages consumed by the consumer comes from a certain topic;

topic: messages are categorized according to topic, which is essentially a directory, i.e., messages of the same topic are categorized into the same directory;

broker (proxy): each Kafka server node is a broker, and one broker can hold a plurality of topics;

consumer group: each consumer belongs to a specific consumer group, a group name can be designated for each consumer, and if the group name is not designated, the consumer belongs to a default group;

If Kafka is regarded as a database, a topic can be understood as a table in that database, and the name of the topic is the name of the table. The data in a topic is divided into one or more partitions, and each topic has at least one partition. Each partition has multiple replicas, one of which is the leader and the others of which are followers. The leader is responsible for reading and writing data, and the followers actively fetch data from the leader.

Each Kafka replica has two important attributes: LEO and HW. LEO (log end offset) records the offset of the next message to be written to the replica's underlying log. That is, if LEO = 10, the replica holds 10 messages, with offsets in the range [0, 9]. HW is the high watermark. For the same replica, the HW value is never greater than the LEO value. All messages less than or equal to the HW value are considered "backed up".
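The relationship between LEO and HW described above can be illustrated with a minimal sketch. The `Replica` class and its method names are hypothetical and only model the two attributes in memory; this is not Kafka's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    """Toy model of one replica's log (hypothetical, for illustration only)."""
    log: List[str] = field(default_factory=list)
    high_watermark: int = 0

    @property
    def log_end_offset(self) -> int:
        # LEO is the offset that the next appended message would receive.
        return len(self.log)

    def append(self, message: str) -> int:
        self.log.append(message)
        return len(self.log) - 1  # offset assigned to this message

    def advance_high_watermark(self, new_hw: int) -> None:
        # HW never moves backwards and never exceeds LEO on the same replica.
        capped = min(new_hw, self.log_end_offset)
        self.high_watermark = max(self.high_watermark, capped)


replica = Replica()
offsets = [replica.append(f"m{i}") for i in range(10)]
replica.advance_high_watermark(12)  # request beyond LEO is capped at LEO
```

With ten messages appended, the offsets run from 0 to 9 and the LEO is 10, matching the example in the text.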

Fig. 2 is a flowchart illustrating a data migration method, as shown in fig. 2, for use in the control center 110, according to an exemplary embodiment, including the following steps.

In step S210, a data migration instruction is generated in response to a data migration request, where the data migration request carries a source cluster identifier and a target cluster identifier.

The data migration request may be triggered by the terminal. The source cluster identification is used to uniquely distinguish the source clusters and the target cluster identification is used to uniquely distinguish the target clusters. The control center can find a source cluster and a target cluster which need to carry out data migration according to the source cluster identifier and the target cluster identifier carried in the data migration request, so that related instructions are sent to the source cluster and the target cluster, and related data are received from the source cluster or the target cluster. Specifically, when data migration is required to be performed on the source cluster, a data migration request can be triggered by inputting a source cluster identifier and a target cluster identifier through the terminal. The control center obtains the data migration request and generates a data migration instruction according to the data migration request.

In step S220, a data migration instruction is sent to a target cluster corresponding to the target cluster identifier, where the data migration instruction is used to instruct the target cluster to acquire data to be migrated from a source cluster corresponding to the source cluster identifier.

Specifically, the control center sends the generated data migration instruction to the target cluster corresponding to the target cluster identifier. The data migration instruction may also carry the source cluster identifier. After receiving the data migration instruction, the target cluster can pull data from the source cluster corresponding to the source cluster identifier according to the data migration instruction.

In step S230, metadata information of the source cluster and the target cluster is received when data migration is completed.

Specifically, after the target cluster finishes pulling data from the source cluster, the target cluster or the source cluster may send a data migration completion notification to the control center. The target cluster and the source cluster respectively send metadata update requests (metadata update request) to the control center, wherein the metadata update requests carry metadata information after data migration. And the control center receives the metadata update request and stores metadata information of the source cluster and the target cluster after data migration.
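Steps S210 to S230 can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the class names (`ControlCenter`, `FakeCluster`, `MigrationRequest`, `MigrationInstruction`), the method names, and the dictionary used for metadata are all hypothetical, and the clusters only record the instructions they receive instead of moving data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MigrationRequest:
    source_cluster_id: str
    target_cluster_id: str


@dataclass(frozen=True)
class MigrationInstruction:
    source_cluster_id: str  # where the target cluster should pull data from


class FakeCluster:
    """Stand-in for a sub-cluster; it only records what it receives."""

    def __init__(self, cluster_id: str) -> None:
        self.cluster_id = cluster_id
        self.instructions: List[MigrationInstruction] = []

    def receive_instruction(self, instruction: MigrationInstruction) -> None:
        self.instructions.append(instruction)


class ControlCenter:
    def __init__(self, clusters: Dict[str, FakeCluster]) -> None:
        self.clusters = clusters
        self.metadata: Dict[str, dict] = {}

    def handle_migration_request(self, request: MigrationRequest) -> MigrationInstruction:
        # S210: generate the instruction from the identifiers in the request.
        instruction = MigrationInstruction(request.source_cluster_id)
        # S220: send it to the cluster named by the target cluster identifier.
        self.clusters[request.target_cluster_id].receive_instruction(instruction)
        return instruction

    def handle_metadata_update(self, cluster_id: str, metadata: dict) -> None:
        # S230: store the post-migration metadata reported by a cluster.
        self.metadata[cluster_id] = dict(metadata)


clusters = {"src": FakeCluster("src"), "dst": FakeCluster("dst")}
center = ControlCenter(clusters)
center.handle_migration_request(MigrationRequest("src", "dst"))
center.handle_metadata_update("src", {"topics": []})
center.handle_metadata_update("dst", {"topics": ["orders"]})
```

Only the target cluster receives the instruction; the source cluster is named inside it so that the target knows where to pull from.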

In the data migration method described above, a control center that controls data migration and manages metadata information is introduced, and after data migration is completed the metadata information corresponding to the sub-clusters in the cluster is saved in the control center, so that the accuracy of the metadata information can be guaranteed. When a client sends a client metadata request to the proxy server of a sub-cluster, the request forwarded by the proxy server is handled centrally by the control center, and the cluster address does not need to be changed individually in each client, so that the data migration flow is simplified and the time, labor and other costs consumed by data migration are saved.
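Why the client configuration can stay unchanged may be easier to see in a small sketch. The following assumes, purely for illustration, that the metadata information is reduced to a mapping from topic name to the address of the cluster hosting it; `ControlCenterMetadata`, `ProxyServer`, and the addresses are hypothetical.

```python
from typing import Dict


class ControlCenterMetadata:
    """Hypothetical central store: topic name -> address of the hosting cluster."""

    def __init__(self) -> None:
        self._topic_location: Dict[str, str] = {}

    def update(self, topic: str, cluster_address: str) -> None:
        self._topic_location[topic] = cluster_address

    def lookup(self, topic: str) -> str:
        return self._topic_location[topic]


class ProxyServer:
    """Proxy in a sub-cluster; forwards metadata requests to the control center."""

    def __init__(self, control_center: ControlCenterMetadata) -> None:
        self._control_center = control_center

    def handle_client_metadata_request(self, topic: str) -> str:
        return self._control_center.lookup(topic)


store = ControlCenterMetadata()
proxy = ProxyServer(store)

store.update("orders", "source-cluster:9092")
before = proxy.handle_client_metadata_request("orders")

# After migration the control center records the new location; the client
# keeps calling the same proxy with the same topic name.
store.update("orders", "target-cluster:9092")
after = proxy.handle_client_metadata_request("orders")
```

The client-side call is identical before and after the migration; only the entry held by the control center changes.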

In an exemplary embodiment, the data migration request further carries a theme identifier to be migrated; as shown in fig. 3, in step S220, a data migration instruction is sent to a target cluster corresponding to a target cluster identifier, and the target cluster is instructed to acquire data to be migrated from a source cluster corresponding to a source cluster identifier, which may be specifically implemented by the following steps:

in step S221, a target topic for storing data to be migrated is created in the target cluster according to the topic identification to be migrated.

In step S222, a data migration instruction is sent to the target cluster, where the data migration instruction is used to instruct the target cluster to obtain data to be migrated corresponding to the identifier of the topic to be migrated from the source cluster, and store the data to be migrated in the target topic.

The topic to be migrated refers to a topic in the source cluster that is to be migrated. In this embodiment, data migration may be performed on a per-topic basis in the source cluster. The data migration request acquired by the control center can also carry an identifier of the topic to be migrated, which may be the name of the topic to be migrated. Specifically, the control center serves as the sole entry point for topic operations, such as creating a topic, deleting a topic, expanding the partitions of a topic, and adding or removing replicas. After receiving the data migration request, the control center determines the source cluster, the target cluster, and the topic in the source cluster that needs to be migrated according to the source cluster identifier, the target cluster identifier and the identifier of the topic to be migrated carried in the data migration request. The control center then creates, in the target cluster, a mirror topic corresponding to the topic to be migrated, and takes the created mirror topic as the target topic. The mirror topic can be created with a cross-cluster synchronization tool such as Kafka MirrorMaker, which is not described in detail herein. After creating the target topic in the target cluster, the control center sends a data migration instruction to the target cluster, instructing the target cluster to acquire the data to be migrated from the leader of the topic to be migrated in real time according to the data volume of the topic to be migrated. The data volume of the topic to be migrated may be determined by the HW, which identifies a particular message offset; a consumer can only fetch messages before this offset.

In this embodiment, by introducing the control center to manage data migration among clusters, when the data volume of the clusters is too large, the problem of cluster data migration caused by incapacity of expanding a machine room can be effectively handled, so that the data migration process can be simplified, and the time and labor costs required by data migration can be reduced.
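Steps S221 and S222 can be sketched with in-memory stand-ins. The `SubCluster` class, its single-partition topics stored as lists, and the `migrate_topic` helper are hypothetical; a real deployment would rely on a cross-cluster synchronization tool rather than code like this.

```python
from typing import Dict, List


class SubCluster:
    """Toy sub-cluster holding single-partition topics as Python lists."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.topics: Dict[str, List[str]] = {}
        self.high_watermark: Dict[str, int] = {}

    def create_topic(self, topic: str) -> None:
        self.topics.setdefault(topic, [])
        self.high_watermark.setdefault(topic, 0)

    def pull_from(self, source: "SubCluster", topic: str) -> int:
        """Copy messages below the source HW that are not mirrored yet."""
        mirrored = self.topics[topic]
        hw = source.high_watermark[topic]
        new_messages = source.topics[topic][len(mirrored):hw]
        mirrored.extend(new_messages)
        return len(new_messages)


def migrate_topic(source: SubCluster, target: SubCluster, topic: str) -> int:
    target.create_topic(topic)              # S221: create the target (mirror) topic
    return target.pull_from(source, topic)  # S222: instruct the target to pull


source = SubCluster("source")
source.create_topic("orders")
source.topics["orders"].extend(["m0", "m1", "m2", "m3", "m4"])
source.high_watermark["orders"] = 3

target = SubCluster("target")
first_pull = migrate_topic(source, target, "orders")

# The source keeps serving writes; once its HW advances, the target pulls again.
source.high_watermark["orders"] = 5
second_pull = target.pull_from(source, "orders")
```

Bounding each pull by the source HW mirrors the statement that the data volume of the topic to be migrated is determined by the HW.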

In an exemplary embodiment, in the data migration process, the state of the theme to be migrated is a read-write state, and the state of the target theme is a read-only state.

Specifically, in the process that the target cluster acquires data from the theme to be migrated of the source cluster, the state of the theme to be migrated can be kept in a read-write state, and the state of the target theme in the target cluster is in a read-only state. In the data migration process, the data can still be read and written into the theme to be migrated of the source cluster, so that the usability of the theme to be migrated is maintained in the data migration process, and the usability guarantee is provided for the theme to be migrated.

In this embodiment, the topic to be migrated remains in the read-write state during the data migration process, so that the availability of the topic to be migrated (especially a high-priority, core topic) can be ensured, the topic to be migrated can continue to provide services such as reading and writing of data, and the data migration process is imperceptible to clients. In addition, by keeping the target topic in the read-only state, the target topic does not provide data read-write services before the data migration is completed, which ensures that only one topic provides data read-write services at any time.

In an exemplary embodiment, when data migration is completed, the state of the target subject in the target cluster is updated to be a read-write state, and the state of the subject to be migrated in the source cluster is updated to be a disabled state.

Specifically, after the control center determines that the target cluster completes data migration, a first state update instruction may be sent to the target cluster, where the first state update instruction is used to instruct the target cluster to update the state of the target topic into a read-write state; meanwhile, the control center can also send a second state update instruction to the source cluster, wherein the second state update instruction is used for indicating the source cluster to update the state of the theme to be migrated to a disabled state, namely, the theme to be migrated is marked as unavailable, so that the cluster switching is completed.

In this embodiment, the states of the topic to be migrated in the source cluster and of the target topic in the target cluster are updated after the data migration is completed, so that the target topic can provide services such as reading and writing of data and the data migration process is imperceptible to clients. In addition, after the data migration is completed, the topic to be migrated is marked as unavailable, which prevents it from continuing to provide services such as reading and writing of data.
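The state handling in the two embodiments above can be summarized in a short sketch. The `TopicState` enumeration and the `MigrationPair` class are hypothetical names; the sketch only tracks which of the two topics accepts writes.

```python
from enum import Enum
from typing import List


class TopicState(Enum):
    READ_WRITE = "read-write"
    READ_ONLY = "read-only"
    DISABLED = "disabled"


class MigrationPair:
    """Tracks the states of the topic to be migrated and the target topic."""

    def __init__(self) -> None:
        # During migration: source stays read-write, target is read-only.
        self.source_state = TopicState.READ_WRITE
        self.target_state = TopicState.READ_ONLY

    def complete(self) -> None:
        # On completion: target becomes read-write, source is disabled.
        self.target_state = TopicState.READ_WRITE
        self.source_state = TopicState.DISABLED

    def writable(self) -> List[str]:
        states = {"source": self.source_state, "target": self.target_state}
        return [name for name, state in states.items()
                if state is TopicState.READ_WRITE]


pair = MigrationPair()
writable_during = pair.writable()
pair.complete()
writable_after = pair.writable()
```

At every point exactly one of the two topics is writable, which is the uniqueness property the embodiment relies on.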

In an exemplary embodiment, as shown in FIG. 4, in step S230, receiving metadata information of a source cluster and a target cluster when data migration is completed may be specifically implemented by the following steps:

in step S231, in the data migration process, a data synchronization result of the subject to be migrated and the target subject sent by the source cluster is received.

In step S232, when it is determined that the message offset of the subject to be migrated is consistent with that of the target subject according to the data synchronization result, metadata information of the source cluster and the target cluster is received.

Specifically, Kafka uses the ISR mechanism to ensure data consistency. The control center may receive an ISR (in-sync replicas, the set of replicas that are in sync) report for the topic to be migrated sent by the source cluster. During the data migration process, the source cluster reports ISR progress (i.e., the data synchronization result) to the control center. When the control center determines, according to the ISR progress report of the source cluster, that the message offset of the topic to be migrated is consistent with that of the target topic, it determines that data synchronization is completed. The control center can then receive the metadata update requests sent by the source cluster and the target cluster, where the metadata update requests carry the metadata information after data migration.

In this embodiment, through receiving the ISR progress report of the source cluster, it is finally determined that data migration is completed under the condition that the data consistency of the to-be-migrated theme and the target theme is maintained, so that the client can be ensured to obtain the access result promised by the system, and the availability of the theme is ensured.
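The consistency check of step S232 can be sketched as a comparison of offsets. The shape of the data synchronization result is not specified in the text; the per-partition pair of offsets used here, and the name `offsets_consistent`, are assumptions made for illustration.

```python
from typing import Dict, Tuple

# Assumed report shape: partition id -> (offset in the topic to be migrated,
# offset in the target topic).
SyncResult = Dict[int, Tuple[int, int]]


def offsets_consistent(sync_result: SyncResult) -> bool:
    """True when every reported partition of the target has caught up."""
    return bool(sync_result) and all(
        source_offset == target_offset
        for source_offset, target_offset in sync_result.values()
    )


in_progress = {0: (120, 120), 1: (98, 95)}
finished = {0: (120, 120), 1: (98, 98)}

still_migrating = offsets_consistent(in_progress)
migration_done = offsets_consistent(finished)
```

An empty report is treated as not consistent, so that a missing report is never mistaken for a completed migration.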

In an exemplary embodiment, in the data migration process, when the message offset between the to-be-migrated theme and the target theme meets the preset requirement, updating the to-be-migrated theme to be in a read-write forbidden state.

The preset requirement may mean that the amount of data acquired by the target cluster from the topic to be migrated of the source cluster reaches a certain threshold (that is, the two topics are nearly consistent). Specifically, when the control center receives an ISR progress report from the source cluster controller and determines that the amount of data acquired by the target cluster from the topic to be migrated reaches the threshold, the control center sends an instruction to the source cluster, instructing the source cluster to change the state of the topic to be migrated to a read-write forbidden state and thereby block reads and writes to the topic to be migrated.

In this embodiment, when the amount of data in the target topic reaches a certain threshold, the control center blocks reads and writes to the topic to be migrated in the source cluster and prepares to switch clusters, so that the target topic can take over services such as reading and writing of data in time and the data migration process is imperceptible to clients.
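One way to express the preset requirement is as a bound on the remaining lag between the two topics. The text does not fix how the threshold is defined, so the `max_lag` parameter and the function name below are assumptions.

```python
def should_block_source(source_offset: int, target_offset: int, max_lag: int) -> bool:
    """Decide whether to put the topic to be migrated into the read-write forbidden state.

    Assumes the preset requirement is "the target trails the source by at
    most max_lag messages"; the patent text only says the threshold is
    close to full consistency.
    """
    lag = source_offset - target_offset
    return 0 <= lag <= max_lag


far_behind = should_block_source(source_offset=1000, target_offset=900, max_lag=10)
nearly_synced = should_block_source(source_offset=1000, target_offset=995, max_lag=10)
```

Blocking only once the lag is small keeps the window in which neither topic accepts writes short.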

In an exemplary embodiment, after receiving metadata information of the source cluster and the target cluster when the data migration is completed, the method further comprises the steps of: periodically acquiring metadata information of the sub-clusters in the cluster; and verifying the stored metadata information according to the acquired metadata information.

Specifically, the control center can periodically and actively acquire the metadata information of each sub-cluster in the corresponding cluster, and then verify the stored metadata information against it. When the stored metadata information is inconsistent with the periodically acquired metadata information, the stored metadata information is checked and corrected. Meanwhile, when the control center determines that the stored metadata information is inconsistent with the actively acquired metadata information, it can also return an alarm message to notify the user of the inconsistency.

In this embodiment, by actively and periodically verifying the metadata information, the accuracy of the stored metadata information can be ensured, and inconsistency between the metadata information and the clusters after data migration is avoided.
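A single verification pass of the kind described above might look like the following sketch. It assumes, purely for illustration, that metadata is a mapping from topic name to the name of the sub-cluster hosting it; the function name and the alarm message format are hypothetical.

```python
from typing import Dict, List


def verify_and_correct(stored: Dict[str, str], fetched: Dict[str, str]) -> List[str]:
    """Correct stored metadata in place and return one alarm per mismatch.

    Both arguments map a topic name to the sub-cluster hosting it
    (an assumed, simplified shape for the metadata information).
    """
    alarms = []
    for topic, actual_cluster in fetched.items():
        recorded_cluster = stored.get(topic)
        if recorded_cluster != actual_cluster:
            alarms.append(
                f"metadata mismatch for topic {topic!r}: "
                f"stored={recorded_cluster!r}, actual={actual_cluster!r}"
            )
            stored[topic] = actual_cluster  # trust the freshly fetched value
    return alarms
```

The control center would run such a pass on a timer and surface the returned alarm messages to the user.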

In an exemplary embodiment, after receiving metadata information of the source cluster and the target cluster when the data migration is completed, the method further comprises the steps of: updating the consumer groups corresponding to the source cluster and the target cluster; and storing the mapping relationship among the consumer group, the source cluster and the target cluster.

Specifically, the control center may count the consumer groups of all sub-clusters under the cluster. When the control center detects that duplicate consumer groups exist in the source cluster and the target cluster, the duplicate consumer groups can be updated, for example by migrating the duplicate consumer groups in the source cluster to the target cluster, or by migrating the duplicate consumer groups to a predefined fixed cluster. Further, the control center may save the mapping relationship between consumer groups and sub-clusters, and pin the Coordinator through this mapping relationship.

In this embodiment, the coordinator service is used to unify the ConsumerCoordinator (a member variable of the Kafka consumer), so that the ConsumerCoordinator does not drift between different clusters after data migration.
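The duplicate handling described above can be sketched as a function that builds the consumer-group-to-cluster mapping. The function name and the input shape (cluster name mapped to a set of group names) are assumptions made for illustration; the sketch implements only the first option mentioned in the text, pinning duplicates to the target cluster.

```python
from typing import Dict, Set


def unify_consumer_groups(
    groups_by_cluster: Dict[str, Set[str]],
    source: str,
    target: str,
) -> Dict[str, str]:
    """Map every consumer group to exactly one cluster.

    Groups found in both the source and the target cluster are pinned to
    the target cluster; all other groups keep the first cluster in which
    they are seen.
    """
    mapping: Dict[str, str] = {}
    for cluster, groups in groups_by_cluster.items():
        for group in groups:
            mapping.setdefault(group, cluster)
    duplicates = groups_by_cluster.get(source, set()) & groups_by_cluster.get(target, set())
    for group in duplicates:
        mapping[group] = target
    return mapping
```

Because each group resolves to a single cluster, a coordinator lookup that consults this mapping always returns the same cluster for a given group.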

In an exemplary embodiment, as shown in fig. 5, after receiving metadata information of the source cluster and the target cluster when data migration is completed, the following steps are further included:

In step S510, a client metadata request sent by a sub-cluster in the cluster is received, where the client metadata request carries an identifier of the topic to be accessed.

In step S520, metadata information of the topic to be accessed is acquired according to the identifier of the topic to be accessed.

In step S530, the acquired metadata information is returned to the sub-cluster.

Specifically, after the data migration is completed, when a broker of a certain sub-cluster receives a client metadata request sent by a client, the broker may forward the client metadata request to the control center. The client metadata requests forwarded by sub-cluster brokers are served by the control center. That is, according to the identifier of the topic to be accessed carried in the client metadata request, the control center obtains the metadata information corresponding to that identifier and returns the metadata information to the broker of the sub-cluster. The broker of the sub-cluster then sends the metadata information to the client.

In this embodiment, the control center is introduced to uniformly process the metadata request of the client, so that after data migration, the cluster address in the client does not need to be independently changed, the data migration process can be simplified, and the time and labor costs consumed by data migration are reduced.
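Steps S510 to S530 can be sketched with two small classes. The class and method names below are hypothetical, and the metadata is reduced to the name of the cluster hosting each topic; the sketch models only the forwarding path, not the Kafka wire protocol.

```python
from typing import Dict, Optional


class ControlCenterMetadata:
    """Hypothetical central store of topic metadata (topic id -> cluster)."""

    def __init__(self) -> None:
        self._cluster_by_topic: Dict[str, str] = {}

    def update(self, topic_id: str, cluster: str) -> None:
        # Called when a metadata update request arrives after migration.
        self._cluster_by_topic[topic_id] = cluster

    def handle_request(self, topic_id: str) -> Optional[str]:
        # S520: look up metadata by the identifier of the topic to be accessed.
        return self._cluster_by_topic.get(topic_id)


class SubClusterBroker:
    """Hypothetical broker that forwards client metadata requests."""

    def __init__(self, control_center: ControlCenterMetadata) -> None:
        self._control_center = control_center

    def on_client_metadata_request(self, topic_id: str) -> Optional[str]:
        # S510 and S530: forward to the control center and relay the answer.
        return self._control_center.handle_request(topic_id)
```

In this model a client can keep talking to any sub-cluster broker: after `update("orders", "target-cluster")`, every broker answers with the new location without any client-side address change.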

In an exemplary embodiment, the data migration method is described by way of a specific embodiment, and may be applied to the application environment shown in fig. 6. Three component services may be deployed in the control center: a super controller, a metadata service, and a find-coordinator service (coordinator service). The super controller can serve as the sole entry point for operating on topics, such as creating a topic, deleting a topic, expanding the partitions of a topic, and adding or removing replicas. The metadata service may be used for, but is not limited to, managing the metadata information of the cluster, periodically acquiring metadata verification information from the master node of the super controller, and serving client metadata requests forwarded by sub-cluster brokers. The coordinator service may be used for, but is not limited to, unifying the mapping relationship between consumer groups and sub-clusters. As shown in fig. 6, the source cluster and the target cluster correspond to a source cluster zk (ZooKeeper) and a target cluster zk, respectively; the ZooKeeper cluster is not a component of Kafka itself, but Kafka relies on the ZooKeeper cluster to store information. As shown in fig. 7, the data migration method may be specifically implemented by the following steps:

In step S701, the super controller generates a data migration instruction in response to the data migration request. The data migration request carries a source cluster identifier, a target cluster identifier and an identifier of the topic to be migrated.

In step S702, the super controller creates, in the target cluster and according to the identifier of the topic to be migrated, a target topic for storing the data to be migrated.

In step S703, the super controller sends the data migration instruction to the target cluster. The data migration instruction is used to instruct the target cluster to acquire, from the source cluster, the data to be migrated corresponding to the identifier of the topic to be migrated, and to store the data to be migrated in the target topic.

During the data migration process, the topic to be migrated is in a read-write state and the target topic is in a read-only state. Also during the data migration process, the super controller can receive ISR progress reports for the topic to be migrated and the target topic, sent by the source cluster.

In step S704, the super controller receives the ISR progress report and, when it determines from the ISR progress report that the message offsets of the topic to be migrated and the target topic meet the preset requirement, issues an instruction to the source cluster controller, instructing the source cluster controller to update the topic to be migrated to a read-write disabled state, thereby blocking reads and writes of the topic to be migrated. The cluster controller is a special role assumed by a particular broker in the cluster. The state maintained by the cluster controller falls into two categories: the partition replicas on each broker, and the state of each topic partition.

In step S705, the super controller receives the ISR progress report, and determines that the message offset of the topic to be migrated is consistent with the message offset of the target topic.

In step S706, the super controller issues a first state update instruction to the target cluster, where the first state update instruction is used to instruct the target cluster to update the state of the target topic to a read-write state; meanwhile, the super controller sends a second state update instruction to the source cluster, where the second state update instruction is used to instruct the source cluster to update the state of the topic to be migrated to a disabled state, that is, the topic to be migrated is marked as unavailable.

In step S707, the super controller receives the metadata update requests transmitted by the source cluster controller and the target cluster controller, and transmits the metadata update requests to the metadata service.

The super controller may include a master node and a plurality of standby nodes; the master node receives the metadata update request and sends it to the metadata service and to the standby nodes.

In step S708, the metadata service stores metadata information of the source cluster and the target cluster after data migration according to the received metadata update request.

In step S709, the super controller periodically acquires metadata information of the sub-clusters in the cluster, and sends the acquired metadata information of the sub-clusters to the metadata service, to instruct the metadata service to verify the stored metadata information.

In step S710, the metadata service receives a client metadata request sent by a broker of a sub-cluster in the cluster. The client metadata request carries an identifier of the topic to be accessed.

In step S711, the metadata service acquires metadata information of the topic to be accessed according to the identifier of the topic to be accessed, and returns the acquired metadata information to the broker of the sub-cluster.

In step S712, the coordinator service counts the consumer groups of the sub-clusters under the cluster, and stores the mapping relationship between the consumer groups and the sub-clusters.

Specifically, the coordinator service may count the consumer groups of all sub-clusters under the cluster. When the coordinator service detects that duplicate consumer groups exist in the source cluster and the target cluster, the duplicate consumer groups may be updated, for example by migrating the duplicate consumer groups in the source cluster to the target cluster, or by migrating the duplicate consumer groups to a predefined fixed cluster. Further, the control center may save the mapping relationship between consumer groups and sub-clusters, and pin the Coordinator through this mapping relationship. In addition, after a cluster is selected for a newly added consumer group through a pre-deployed policy, the mapping relationship between the consumer group and the cluster is updated through the coordinator service.
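The topic state changes in steps S703 to S706 above can be summarized as a small state machine. The sketch below is illustrative only: the `TopicState` and `TopicMigration` names are hypothetical, and raising an error when the switch is attempted before the source topic has been blocked is an assumption of this sketch rather than a behavior stated in the disclosure.

```python
from enum import Enum


class TopicState(Enum):
    READ_WRITE = "read-write"
    READ_ONLY = "read-only"
    READ_WRITE_DISABLED = "read-write disabled"
    DISABLED = "disabled"


class TopicMigration:
    """Tracks the states of the topic to be migrated and the target topic."""

    def __init__(self) -> None:
        # During migration: source is read-write, target is read-only.
        self.source_state = TopicState.READ_WRITE
        self.target_state = TopicState.READ_ONLY

    def on_preset_requirement_met(self) -> None:
        # S704: block reads and writes of the topic to be migrated.
        if self.source_state is TopicState.READ_WRITE:
            self.source_state = TopicState.READ_WRITE_DISABLED

    def on_offsets_consistent(self) -> None:
        # S705 and S706: switch service over to the target topic.
        if self.source_state is not TopicState.READ_WRITE_DISABLED:
            raise RuntimeError("source topic must be blocked before switching")
        self.target_state = TopicState.READ_WRITE
        self.source_state = TopicState.DISABLED
```

A super controller built along these lines would call `on_preset_requirement_met()` when an ISR progress report satisfies the preset requirement, and `on_offsets_consistent()` once a later report shows identical offsets.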

It should be understood that, although the steps in the flowcharts of FIGS. 1-7 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of the steps is not strictly limited to this order, and the steps may be executed in other orders. Moreover, at least some of the steps in FIGS. 1-7 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; nor are these sub-steps or stages necessarily performed in sequence, but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.

Fig. 8 is a block diagram illustrating a data migration apparatus 800 according to an example embodiment. Referring to fig. 8, the apparatus includes an instruction generation module 801, a data migration module 802, and a reception module 803.

The instruction generation module 801 is configured to generate a data migration instruction in response to a data migration request, where the data migration request carries a source cluster identifier and a target cluster identifier;

The data migration module 802 is configured to send a data migration instruction to a target cluster corresponding to the target cluster identifier, where the data migration instruction is used to instruct the target cluster to acquire data to be migrated from a source cluster corresponding to the source cluster identifier;

The receiving module 803 is configured to receive metadata information of the source cluster and the target cluster when the data migration is completed.

In an exemplary embodiment, the data migration request further carries an identifier of the topic to be migrated; the data migration module 802 includes: a topic creation unit configured to create, in the target cluster and according to the identifier of the topic to be migrated, a target topic for storing the data to be migrated; and a data migration unit configured to send the data migration instruction to the target cluster, where the data migration instruction is used to instruct the target cluster to acquire, from the source cluster, the data to be migrated corresponding to the identifier of the topic to be migrated and to store the data to be migrated in the target topic.

In an exemplary embodiment, the data migration apparatus 800 further includes: a first state updating module configured to, when data migration is completed, update the state of the target topic to a read-write state and update the state of the topic to be migrated to a disabled state.

In an exemplary embodiment, the receiving module 803 is configured to receive, during the data migration process, a data synchronization result for the topic to be migrated and the target topic, sent by the source cluster; and, when it is determined according to the data synchronization result that the message offset of the topic to be migrated is consistent with that of the target topic, to receive metadata information of the source cluster and the target cluster.

In an exemplary embodiment, the data migration apparatus 800 further includes: a second state updating module configured to update the topic to be migrated to a read-write disabled state when, during the data migration process, the message offsets of the topic to be migrated and the target topic meet the preset requirement.

In an exemplary embodiment, the data migration apparatus 800 further includes: an acquisition module configured to periodically acquire metadata information of the sub-clusters in the cluster; and a verification module configured to verify the stored metadata information according to the acquired metadata information.

In an exemplary embodiment, the data migration apparatus 800 further includes: a consumer group updating module configured to update the consumer groups corresponding to the source cluster and the target cluster; and a storage module configured to store the mapping relationship among the consumer group, the source cluster and the target cluster.

In an exemplary embodiment, the receiving module 803 is further configured to receive a client metadata request sent by a sub-cluster in the cluster, where the client metadata request carries an identifier of the topic to be accessed; the data migration apparatus 800 further includes: a metadata query module configured to acquire metadata information of the topic to be accessed according to the identifier of the topic to be accessed; and a sending module configured to return the acquired metadata information to the sub-cluster.

The specific manner in which the modules of the apparatus in the above embodiments perform their operations has been described in detail in connection with the method embodiments, and is not repeated here.

FIG. 9 is a block diagram illustrating an apparatus 900 for data migration according to an example embodiment. For example, device 900 may be a server. Referring to FIG. 9, device 900 includes a processing component 920 that further includes one or more processors, and memory resources represented by memory 922, for storing instructions, such as applications, executable by processing component 920. The application programs stored in memory 922 may include one or more modules each corresponding to a set of instructions. In addition, the processing component 920 is configured to execute instructions to perform the methods of data migration described above.

The device 900 may also include a power supply component 924 configured to perform power management of the device 900, a wired or wireless network interface 926 configured to connect the device 900 to a network, and an input/output (I/O) interface 928. The device 900 may operate based on an operating system stored in memory 922, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, or the like.

In an exemplary embodiment, a storage medium is also provided, such as memory 922 including instructions executable by a processor of device 900 to perform the above-described method. The storage medium may be a non-transitory computer readable storage medium, which may be, for example, ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A data migration method, applied to a control center outside a cluster, the method comprising:

sending the data migration instruction to a target cluster corresponding to the target cluster identifier, wherein the data migration instruction is used for instructing the target cluster to acquire data to be migrated from a source cluster corresponding to the source cluster identifier;

receiving metadata information of the source cluster and the target cluster when data migration is completed;

receiving a client metadata request sent by a sub-cluster in the cluster, wherein the client metadata request carries an identifier of a topic to be accessed, and the client metadata request is sent to the sub-cluster by a client;

acquiring metadata information of the topic to be accessed according to the identifier of the topic to be accessed;

and returning the acquired metadata information to the sub-cluster, wherein the sub-cluster is used for sending the metadata information to the client.

2. The data migration method according to claim 1, wherein the data migration request further carries an identifier of a topic to be migrated; the sending the data migration instruction to the target cluster corresponding to the target cluster identifier includes:

and sending the data migration instruction to the target cluster, wherein the data migration instruction is used for instructing the target cluster to acquire, from the source cluster, data to be migrated corresponding to the identifier of the topic to be migrated, and to store the data to be migrated in the target topic.

3. The data migration method according to claim 2, wherein in the data migration process, a state of the topic to be migrated is a read-write state, and a state of the target topic is a read-only state.

4. A data migration method according to claim 3, wherein the method further comprises:

and when the data migration is completed, updating the state of the target topic to a read-write state, and updating the state of the topic to be migrated to a disabled state.

5. The data migration method of claim 2, wherein the receiving metadata information of the source cluster and the target cluster when data migration is completed comprises:

in the data migration process, receiving a data synchronization result of the topic to be migrated and the target topic, sent by the source cluster;

6. The data migration method of claim 5, further comprising:

in the data migration process, when the message offsets of the topic to be migrated and the target topic meet the preset requirement, updating the topic to be migrated to a read-write disabled state.

7. The data migration method of claim 1, further comprising, after receiving metadata information for the source cluster and the target cluster when the data migration is completed:

periodically acquiring metadata information of the sub-clusters in the cluster;

8. The data migration method of claim 1, further comprising, after receiving metadata information for the source cluster and the target cluster when the data migration is completed:

9. A data migration apparatus for use in a control center outside a cluster, the apparatus comprising:

the data migration module is configured to send the data migration instruction to a target cluster corresponding to the target cluster identifier, and the data migration instruction is used for indicating the target cluster to acquire data to be migrated from a source cluster corresponding to the source cluster identifier;

a receiving module configured to receive metadata information of the source cluster and the target cluster when data migration is completed, and to receive a client metadata request sent by a sub-cluster in the cluster, wherein the client metadata request carries an identifier of a topic to be accessed, and the client metadata request is sent to the sub-cluster by a client;

a metadata query module configured to acquire metadata information of the topic to be accessed according to the identifier of the topic to be accessed;

and a sending module configured to return the acquired metadata information to the sub-cluster, wherein the sub-cluster is used for sending the metadata information to the client.

10. The data migration apparatus of claim 9, wherein the data migration request further carries an identifier of a topic to be migrated; the data migration module comprises:

a topic creation unit configured to create, in the target cluster and according to the identifier of the topic to be migrated, a target topic for storing the data to be migrated;

a data migration unit configured to send the data migration instruction to the target cluster, wherein the data migration instruction is used for instructing the target cluster to acquire, from the source cluster, data to be migrated corresponding to the identifier of the topic to be migrated, and to store the data to be migrated in the target topic.

11. The data migration apparatus according to claim 10, wherein in the data migration process, the state of the topic to be migrated is a read-write state, and the state of the target topic is a read-only state.

12. The data migration apparatus of claim 11, wherein the apparatus further comprises:

a first state updating module configured to update the state of the target topic to a read-write state and update the state of the topic to be migrated to a disabled state when data migration is completed.

13. The data migration apparatus of claim 10, wherein the receiving module is configured to perform:

in the data migration process, receiving a data synchronization result of the topic to be migrated and the target topic, sent by the source cluster; and when it is determined according to the data synchronization result that the message offset of the topic to be migrated is consistent with that of the target topic, receiving metadata information of the source cluster and the target cluster.

14. The data migration apparatus of claim 13, wherein the apparatus further comprises:

15. The data migration apparatus of claim 9, wherein the apparatus further comprises:

16. The data migration apparatus of claim 9, wherein the apparatus further comprises:

a consumer group updating module configured to update the consumer groups corresponding to the source cluster and the target cluster;

and a storage module configured to store the mapping relationship among the consumer group, the source cluster and the target cluster.

17. A server, comprising:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the data migration method of any one of claims 1 to 9.

18. A storage medium, wherein instructions in the storage medium, when executed by a processor of a server, enable the server to perform the data migration method of any one of claims 1 to 9.