WO2012051845A1 - Data transfer method and system - Google Patents

Data transfer method and system

Info

Publication number
WO2012051845A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
migration
node
redo log
removal
Prior art date
Application number
PCT/CN2011/073485
Other languages
French (fr)
Chinese (zh)
Inventor
陈典强
郭斌
韩银俊
Original Assignee
刘建
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 刘建
Publication of WO2012051845A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1095: Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Definitions

  • The present invention relates to the field of distributed data caching, and more particularly to a method and system for data migration.
  • Cloud computing is the product of the convergence of traditional computer and network technologies such as grid computing, distributed computing, parallel computing, utility computing, network storage, virtualization, and load balancing. It aims to integrate multiple relatively low-cost computing entities over a network into one system with powerful computing capability. Distributed caching is one area within cloud computing; its role is to provide distributed storage for the massive data in cloud computing, together with high-speed read and write access.
  • A distributed cache system that provides distributed caching consists of a number of interconnected server nodes (hereinafter, nodes) and clients.
  • The data consists of a key (Key) and a value (Value).
  • The Key serves as the index of the data.
  • The Value is the data content that the Key represents.
  • Logically, Key and Value have a one-to-one relationship.
  • Over long periods of operation, a distributed cache system may suffer from uneven node load, unstable nodes, or excessive system load. In such cases, part of the data on heavily loaded or unstable nodes needs to be migrated to nodes with a lower load, or new nodes need to be added and part of the system's data migrated to them for maintenance. In a distributed cache system, migrating data without affecting normal operation is a key problem that is difficult to solve.
  • Solution 1: stop the normal operation of the distributed cache system, inform clients that the distributed cache system is being updated, and then migrate the data of the distributed cache system.
  • Solution 2: while the distributed cache system is operating normally, a client reads the data on the node to be migrated from by traversal, and then writes it to the node to be migrated to.
  • Solution 1 has to stop the distributed cache system, which disrupts its normal operation. Solution 2 reads the data by traversal, which makes migration slow; in addition, because the distributed cache system keeps working, newly generated data may be missed, leaving the migrated data inconsistent between the node it is migrated from and the node it is migrated to.
  • The main purpose of the present invention is to provide a data migration method and system that increase migration speed and ensure data consistency during migration while keeping the system operating normally.
  • The present invention provides a method for data migration, the method comprising:
  • the data move-out node is notified, according to the user's selection, to perform data migration; the data move-out node sends the migration data and the redo log directly to the data move-in node; the data move-in node receives and saves the migration data, and verifies and updates it according to the redo log.
  • The notification includes the address of the data move-in node and the address of the virtual node where the migration data is located.
  • The data move-out node sending the migration data and the redo log directly to the data move-in node includes: according to the address of the data move-in node, the data move-out node sends the migration data at the virtual node address and the redo log to the data move-in node in the form of data packets; after the data packets of the current redo log have been sent, it notifies the management and control platform, and the platform suspends the service of the data move-out node.
  • Verifying and updating the migration data according to the redo log includes: on determining from the received redo log that the data corresponding to an operation recorded in the redo log differs from the migration data, the data move-in node re-executes that operation.
  • After the verification and update, the method further includes: the data move-in node notifies the management and control platform that the redo operations are complete; the platform notifies all nodes and clients that the data move-in node replaces the data move-out node in providing clients with the services related to the migration data, and the migration data is cleared from the data move-out node.
  • The present invention also provides a system for data migration, the system comprising: a data migration management and control module, a data move-out processing module, and a data move-in processing module.
  • The data migration management and control module is configured to notify the data move-out processing module, according to the user's selection, to migrate data.
  • The data move-out processing module is configured to send the migration data and the redo log to the data move-in processing module.
  • The data move-in processing module is configured to receive and save the migration data, and to verify and update the received migration data according to the redo log.
  • The data migration management and control module is configured to send, according to the user's selection, a notification containing the address of the data move-in node and the address of the virtual node where the migration data is located to the data move-out processing module.
  • The data move-out processing module is configured to send, according to the address of the data move-in node, the migration data in the virtual node and the redo log to the data move-in processing module in the form of data packets, and to notify the data migration management and control module once it determines that the data packets of the current redo log have been sent. The data migration management and control module is configured to suspend the service of the data move-out node upon that notification.
  • The data move-in processing module is configured to notify the data migration management and control module after the redo operations are complete. Correspondingly, the data migration management and control module is configured to notify all nodes and clients that the data move-in node replaces the data move-out node in providing clients with the services related to the migration data, and to clear the migration data from the data move-out node.
  • The migration data is sent directly from the data move-out node to the data move-in node, which increases migration speed; the migration data is copied before being migrated, which effectively preserves the normal functioning of the system.
  • The data move-in node re-executes operations according to the redo log, which ensures that the migration data stays consistent between the data move-out node and the data move-in node when it changes because a client sends a request to the data move-in node during the migration.
  • FIG. 1 is a schematic flowchart of the data migration method according to the present invention.
  • FIG. 2 is a schematic diagram of the composition of the data migration system according to the present invention.
  • The basic idea of the present invention is: the data move-out node is notified, according to the user's selection, to perform data migration; the data move-out node sends the data and the redo (Redo) log to the data move-in node; the data move-in node receives and saves the migration data, and verifies and updates the received migration data according to the redo log.
  • A method of data migration includes the following steps:
  • Step 101: When the system is overloaded or unevenly loaded, the data move-out node is notified, according to the user's selection, to perform data migration.
  • Through the management and control platform, the user selects the node whose data needs to be migrated (the data move-out node) and the node that will receive the migration data (the data move-in node), and the data move-out node is notified to perform data migration. Here, the system refers to a distributed cache system.
  • The notification is as follows: the management and control platform sends data migration information to the data move-out node.
  • The data migration information includes the virtual node to be migrated, the address of the data move-in node, and the like. As for virtual nodes, a node groups its data and stores the groups in different virtual nodes.
  • The data move-in node may be a lightly loaded node or a newly added node. A newly added node joins the system and sends its own information to the management and control platform.
  • Based on that information, the management and control platform either requires the other nodes to establish connections with the newly added node or requires the newly added node to establish connections with the other nodes. The node's own information includes its address and the like.
  • Step 102: The data move-out node sends the data and the Redo log directly to the data move-in node. Following the notification from the management and control platform, the data move-out node selects the data in the virtual node and sends it, in the form of data packets, to the data move-in node at the given address.
  • If a client sends a request to the data move-out node at this point, the node can still serve the client because its data is still present.
  • Such requests change the data on the data move-out node. During the migration, the data move-out node therefore writes its response to each client request, that is, the operation it performs, into the Redo log; after it has sent the migration data to the data move-in node, it sends the Redo log to the data move-in node in the form of data packets.
  • The Redo log records the operations of the data move-out node. Each operation comprises the action performed and the corresponding data, for example: delete and the data to be deleted, add and the data to be added, or modify and the modified data.
  • Whether the Redo log function is enabled can be selected through the management and control module.
  • When the data move-out node finishes sending the data packets of the current Redo log, it notifies the management and control platform.
  • The management and control platform requires the data move-out node to stop service. The data move-out node stops service and sends the Redo log generated between notifying the platform and stopping service to the data move-in node in the form of data packets. Stopping service means ceasing to support the system's services.
  • Data is stored in the form of Key and Value.
  • A Key and its corresponding Value are stored on multiple nodes. One of them is called the coordinating server (coordinating node); the others are called replica servers (replica nodes).
  • The mapping between Keys and node addresses is stored in a routing table.
  • One Key corresponds to multiple node addresses. By default, the first node address is selected as the address of the coordinating node that handles the Key, and the other node addresses serve as replica node addresses.
  • When a client sends a request, it looks up the Key in its locally saved routing table, obtains the coordinating node's address, and sends the request to that address. If the coordinating node has stopped service, the client looks up the next node address for the Key in the locally saved routing table and sends the request to that node, which then handles requests for the Key as the coordinating server. Suspending the service of the data move-out node therefore affects neither the normal operation of the system nor normal use by clients.
  • Step 103: The data move-in node receives the migration data, and verifies and updates it according to the Redo log to ensure data consistency.
  • The data move-in node receives and saves the data sent by the data move-out node, receives the Redo log sent by the data move-out node, and verifies and updates the migration data according to the log records. Verifying and updating means that the data move-in node obtains each operation from the received Redo log and compares the data corresponding to the operation with the migrated data: if they are the same, it does nothing; if they differ, it re-executes the operation, which ensures data consistency.
  • After the operations recorded in the Redo log have been executed, the management and control platform is notified, and it sends a node replacement notification message to all nodes and clients in the network. The message indicates that the services for the migration data previously provided by the data move-out node are now provided by the data move-in node. In the routing table, among the node addresses corresponding to the Keys in the migrated data, the address of the data move-out node is replaced with the address of the data move-in node, and the data move-in node provides the service.
  • The management and control platform clears the migration data from the data move-out node. At this point, the data migration is complete.
  • The present invention also provides a system for data migration. As shown in FIG. 2, the system includes a data migration management and control module 201, a data move-out processing module 202, and a data move-in processing module 203. The data migration management and control module 201 is located in the management and control module and is configured to notify, according to the user's selection, the data move-out processing module 202 to perform data migration; the notification includes the address of the data move-in node and the virtual node where the migration data is located.
  • The data move-out processing module 202 is located at the data move-out node and is configured to send the migration data and the Redo log to the data move-in processing module 203 in the form of data packets.
  • The data move-in processing module 203 is located at the data move-in node and is configured to receive and save the data packets, receive the Redo log, and verify and update the received migration data according to the Redo log.
  • The data move-out processing module 202 is further configured to receive requests from clients and to record the data move-out node's response to each request, that is, the operation performed by the data move-out node, in the Redo log.
  • The data move-out processing module 202 is further configured to notify the data migration management and control module 201 after determining that the data packets of the current Redo log have been sent, and to send the Redo log generated between the sending of that log and the stopping of service to the data move-in processing module 203.
  • Correspondingly, the data migration management and control module 201 is configured to suspend the service of the data move-out node.
  • The data move-in processing module 203 is further configured to obtain each operation recorded in the received Redo log and compare the data in the operation with the migrated data: if they are the same, it does nothing; if they differ, it re-executes the operation, which ensures data consistency. After the operations recorded in the Redo log have been executed, it notifies the data migration management and control module 201. Upon that notification, the data migration management and control module 201 notifies all nodes and clients that the services for the migration data originally provided by the data move-out node are now provided by the data move-in node, and replaces the address of the data move-out node with the address of the data move-in node among the node addresses corresponding to the Keys in the routing table.
  • The data migration management and control module 201 is further configured to clear the migrated data from the data move-out node.
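The verify-and-update step (Step 103) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function name `apply_redo_log`, the `(action, key, value)` tuple layout, and the use of a plain dict as the move-in node's store are all hypothetical choices made for the example.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical record layout: (action, key, value); value is None for "delete".
RedoRecord = Tuple[str, str, Optional[str]]


def apply_redo_log(store: dict, redo_log: Iterable[RedoRecord]) -> int:
    """Verify migrated data against the redo log and re-execute where it differs.

    Returns the number of operations that had to be re-executed.
    """
    replayed = 0
    for action, key, value in redo_log:
        if action in ("add", "modify"):
            # Same data already present: nothing to do.
            if key in store and store[key] == value:
                continue
            store[key] = value
            replayed += 1
        elif action == "delete":
            # Key already absent: the migrated data matches the log.
            if key not in store:
                continue
            del store[key]
            replayed += 1
        else:
            raise ValueError(f"unknown redo action: {action!r}")
    return replayed


# Usage: the store holds the migrated snapshot; the log holds later client operations.
migrated = {"a": "1", "b": "2"}
log = [("modify", "a", "1"), ("modify", "b", "3"), ("delete", "a", None), ("add", "c", "4")]
count = apply_redo_log(migrated, log)
```

Records are applied in log order, so a later operation on the same Key wins, which mirrors the order in which the move-out node handled the client requests.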

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a data transfer method, which includes: according to the user's selection, a data transfer-out node is notified to perform data transfer; the data transfer-out node sends the transferred data and a redo log to a data transfer-in node; the data transfer-in node receives and stores the transferred data, receives the redo log, and verifies and updates the received transferred data according to the redo log. The present invention also discloses a data transfer system. The method and system can increase data transfer speed and ensure data consistency during the transfer while keeping the system operating normally.

Description

Method and system for data migration

Technical field
The present invention relates to the field of distributed data caching, and in particular to a method and system for data migration.

Background
Cloud computing is the product of the convergence of traditional computer and network technologies such as grid computing, distributed computing, parallel computing, utility computing, network storage, virtualization, and load balancing. It aims to integrate multiple relatively low-cost computing entities over a network into one system with powerful computing capability. Distributed caching is one area within cloud computing; its role is to provide distributed storage for the massive data in cloud computing, together with high-speed read and write access.
A distributed cache system that provides distributed caching consists of a number of interconnected server nodes (hereinafter, nodes) and clients. In general, to keep data safe, data written to a node is not stored on that single node only; copies of the same data are kept on multiple nodes as backups of one another. The data consists of a key (Key) and a value (Value): the Key serves as the index of the data, and the Value is the data content that the Key represents. Logically, Key and Value have a one-to-one relationship.
Over long periods of operation, a distributed cache system may suffer from uneven node load, unstable nodes, excessive system load, and similar problems. In such cases, part of the data on heavily loaded or unstable nodes needs to be migrated to nodes with a lower load, or new nodes need to be added and part of the system's data migrated to them for maintenance. In a distributed cache system, migrating data without affecting normal operation is a key problem that is difficult to solve.
In the prior art, there are generally two approaches to data migration:
Solution 1: stop the normal operation of the distributed cache system, inform clients that the distributed cache system is being updated, and then migrate the data of the distributed cache system.
Solution 2: while the distributed cache system is operating normally, a client reads the data on the node to be migrated from by traversal, and then writes it to the node to be migrated to.
Solution 1 has to stop the distributed cache system, which disrupts its normal operation. Solution 2 reads the data by traversal, which makes migration slow; in addition, because the distributed cache system keeps working, newly generated data may be missed, leaving the migrated data inconsistent between the node it is migrated from and the node it is migrated to.

Summary of the invention
In view of this, the main purpose of the present invention is to provide a data migration method and system that increase migration speed and ensure data consistency during migration while keeping the system operating normally.
To achieve this purpose, the technical solution of the present invention is implemented as follows.
The present invention provides a method for data migration, the method comprising:
notifying the data move-out node, according to the user's selection, to perform data migration;
the data move-out node sending the migration data and the redo log directly to the data move-in node;
the data move-in node receiving and saving the migration data, and verifying and updating the migration data according to the redo log.
In the above solution, the notification includes the address of the data move-in node and the address of the virtual node where the migration data is located.
In the above solution, the data move-out node sending the migration data and the redo log directly to the data move-in node includes: according to the address of the data move-in node, the data move-out node sends the migration data at the virtual node address and the redo log to the data move-in node in the form of data packets; after the data packets of the current redo log have been sent, it notifies the management and control platform, and the platform suspends the service of the data move-out node.
In the above solution, verifying and updating the migration data according to the redo log includes: on determining from the received redo log that the data corresponding to an operation recorded in the redo log differs from the migration data, the data move-in node re-executes that operation.
In the above solution, after the migration data has been verified and updated according to the redo log, the method further includes: the data move-in node notifies the management and control platform that the redo operations are complete; the platform notifies all nodes and clients that the data move-in node replaces the data move-out node in providing clients with the services related to the migration data, and the migration data is cleared from the data move-out node.
The present invention also provides a system for data migration, the system comprising: a data migration management and control module, a data move-out processing module, and a data move-in processing module;
the data migration management and control module is configured to notify the data move-out processing module, according to the user's selection, to migrate data;
the data move-out processing module is configured to send the migration data and the redo log to the data move-in processing module;
the data move-in processing module is configured to receive and save the migration data, and to verify and update the received migration data according to the redo log.
In the above solution, the data migration management and control module is configured to send, according to the user's selection, a notification containing the address of the data move-in node and the address of the virtual node where the migration data is located to the data move-out processing module.
In the above solution, the data move-out processing module is configured to send, according to the address of the data move-in node, the migration data in the virtual node and the redo log to the data move-in processing module in the form of data packets, and to notify the data migration management and control module once it determines that the data packets of the current redo log have been sent; the data migration management and control module is configured to suspend the service of the data move-out node upon that notification.
In the above solution, the data move-in processing module is configured to notify the data migration management and control module after the redo operations are complete; correspondingly, the data migration management and control module is configured to notify all nodes and clients that the data move-in node replaces the data move-out node in providing clients with the services related to the migration data, and to clear the migration data from the data move-out node.
It can be seen that, with the method and system of the present invention, the migration data is sent directly from the data move-out node to the data move-in node, which increases migration speed; the migration data is copied before being migrated, which effectively preserves the normal functioning of the system; and the data move-in node re-executes operations according to the redo log, which ensures that the migration data stays consistent between the data move-out node and the data move-in node when it changes because a client sends a request to the data move-in node during the migration.

Brief description of the drawings
FIG. 1 is a schematic flowchart of the data migration method according to the present invention;
FIG. 2 is a schematic diagram of the composition of the data migration system according to the present invention.

Detailed description
The basic idea of the present invention is: the data move-out node is notified, according to the user's selection, to perform data migration; the data move-out node sends the data and the redo (Redo) log to the data move-in node; the data move-in node receives and saves the migration data, and verifies and updates the received migration data according to the redo log.
The present invention is described in detail below through specific embodiments and the accompanying drawings.
A method of data migration, as shown in FIG. 1, includes the following steps:
Step 101: When the system is overloaded or unevenly loaded, the data move-out node is notified, according to the user's selection, to perform data migration.
When the user finds that the system is overloaded or unevenly loaded, data migration needs to be started. Through the management and control platform, the user selects the node whose data needs to be migrated (the data move-out node) and the node that will receive the migration data (the data move-in node), and the data move-out node is notified to perform data migration. Here, the system refers to a distributed cache system.
The notification is as follows: the management and control platform sends data migration information to the data move-out node. The data migration information includes the virtual node to be migrated, the address of the data move-in node, and the like. As for virtual nodes, a node groups its data and stores the groups in different virtual nodes. The data move-in node may be a lightly loaded node or a newly added node. A newly added node joins the system and sends its own information to the management and control platform; based on that information, the platform either requires the other nodes to establish connections with the newly added node or requires the newly added node to establish connections with the other nodes. The node's own information includes its address and the like.
Step 102: The data move-out node sends the data and the Redo log directly to the data move-in node.
Following the notification from the management and control platform, the data move-out node selects the data in the virtual node and sends it, in the form of data packets, to the data move-in node at the given address.
If a client sends a request to the data move-out node at this point, the node can still serve the client because its data is still present. However, such requests change the data on the data move-out node. To keep the data on the data move-out node and the data move-in node consistent, during the migration the data move-out node writes its response to each client request, that is, the operation it performs, into the Redo log; after it has sent the migration data to the data move-in node, it sends the Redo log to the data move-in node in the form of data packets. The Redo log records the operations of the data move-out node. Each operation comprises the action performed and the corresponding data, for example: delete and the data to be deleted, add and the data to be added, or modify and the modified data. Whether the Redo log function is enabled can be selected through the management and control module.
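How a move-out node might record operations in a Redo log while it keeps serving clients can be sketched as follows. The sketch is illustrative only: the class names `RedoEntry` and `MoveOutNode`, the `handle` method, and the `redo_enabled` switch are assumptions introduced for this example and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class RedoEntry:
    """One logged operation: the action plus the data it applies to (hypothetical layout)."""
    action: str                  # "add", "modify", or "delete"
    key: str
    value: Optional[str] = None  # None for "delete"


@dataclass
class MoveOutNode:
    """Toy stand-in for a data move-out node that keeps serving during migration."""
    store: dict = field(default_factory=dict)
    redo_log: List[RedoEntry] = field(default_factory=list)
    redo_enabled: bool = False   # assumed switch, set when migration starts

    def handle(self, action: str, key: str, value: Optional[str] = None) -> None:
        # Apply the client's request to the local data first.
        if action in ("add", "modify"):
            self.store[key] = value
        elif action == "delete":
            self.store.pop(key, None)
        else:
            raise ValueError(f"unknown action: {action!r}")
        # While migration is in progress, also record the operation for later replay.
        if self.redo_enabled:
            self.redo_log.append(RedoEntry(action, key, value))


# Usage: three client requests arrive while the migration is running.
node = MoveOutNode(redo_enabled=True)
node.handle("add", "k1", "v1")
node.handle("modify", "k1", "v2")
node.handle("delete", "k1")
```

Keeping the action and its data together in each entry is what lets the receiving side replay the log without consulting the move-out node again.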
Because clients send requests to the data move-out node from time to time, records keep accumulating in the Redo log, and the migration could never finish. Therefore, when the data move-out node has finished sending the data packets of the current Redo log, it notifies the control platform, and the control platform requires the data move-out node to stop serving. The data move-out node stops serving and sends the Redo log records generated between notifying the control platform and stopping the service to the data move-in node in the form of data packets. Stopping the service means that the node stops supporting the system's services.
Data is stored as Key and Value pairs. A Key and its corresponding Value are stored on multiple nodes: one of them is called the coordinating server (the coordinating node) and the others are called replica servers (replica nodes). The coordinating server is connected to the replica servers, and coordinating nodes and replica nodes provide the same functions. The mapping between Keys and node addresses is kept in a routing table. One Key corresponds to multiple node addresses; by default the first address is taken as the coordinating node that handles the Key, and the remaining addresses are replica nodes. To send a request, a client looks up the Key in its locally stored routing table, obtains the coordinating node address, and sends the request to that address. If the coordinating node has stopped serving, the next node address for the Key is looked up in the locally stored routing table and the client's request is sent to that node, which then handles requests for the Key as the coordinating server. Consequently, suspending the service of the data move-out node affects neither the normal operation of the system nor normal use by clients.
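The routing-table lookup with fallback to the next node can be sketched as below. The function name `pick_node`, the `stopped` set used to model nodes that have suspended service, and the node names are assumptions for illustration; the specification does not say how a requester learns that a coordinating node has stopped serving.

```python
def pick_node(routing_table, key, stopped=frozenset()):
    """Return the first node address for the Key that is still in service."""
    # The first address is the coordinating node; later ones are replicas.
    for address in routing_table[key]:
        if address not in stopped:
            return address
    raise LookupError(f"no node in service for key {key!r}")


routing_table = {"user:1": ["node-a", "node-b", "node-c"]}

primary = pick_node(routing_table, "user:1")
fallback = pick_node(routing_table, "user:1", stopped={"node-a"})
```

With `node-a` suspended as the data move-out node, requests for `user:1` go to `node-b`, which then acts as the coordinating server for that Key.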
Step 103: the data move-in node receives the migration data and verifies and updates it against the Redo log to guarantee data consistency.
Here, the data move-in node receives and stores the data sent by the data move-out node, receives the Redo log sent by the data move-out node, and verifies and updates the migration data according to the log records. Verifying and updating means that the data move-in node reads each operation from the received Redo log and compares the data of that operation with the migration data: if they are the same, it does nothing; if they differ, it re-executes the operation, which guarantees data consistency. When verification and updating are complete, the data move-in node notifies the control platform, and the control platform sends a node replacement notification message to all nodes and clients in the network. The message states that the service for the migrated data, previously provided by the data move-out node, is now provided by the data move-in node. In addition, among the node addresses that the routing table holds for the Keys in the migrated data, the address of the data move-out node is replaced with the address of the data move-in node, and the service of the data move-in node is started. At the same time, the control platform clears the migrated data from the data move-out node. At this point the data migration is complete.
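The verify-and-update step on the data move-in node can be sketched as follows. This is one possible reading of "compare the data of the operation with the migration data": a deletion is treated as already reflected when the Key is absent, and an addition or modification when the stored Value equals the logged Value. The function name `verify_and_update` and the `(action, key, value)` record layout are assumptions for the example.

```python
def verify_and_update(migrated, redo_log):
    """Replay Redo log records onto the migrated data; return how many were re-executed."""
    reapplied = 0
    for action, key, value in redo_log:
        # Verification: is the effect of this operation already in the migrated data?
        if action == "delete":
            consistent = key not in migrated
        else:  # "add" or "modify"
            consistent = key in migrated and migrated[key] == value
        if consistent:
            continue  # same as the migrated data: do nothing
        # Update: the data differs, so re-execute the operation.
        if action == "delete":
            del migrated[key]
        else:
            migrated[key] = value
        reapplied += 1
    return reapplied


migrated = {"a": 1, "b": 2, "c": 3}
redo_log = [("modify", "a", 10), ("delete", "b", None), ("add", "c", 3)]
count = verify_and_update(migrated, redo_log)
```

In this run the first two records are re-executed and the third is skipped, because the migrated data already holds `c` with the logged value.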
The present invention also provides a data migration system. As shown in FIG. 2, the system includes a data migration control module 201, a data move-out processing module 202, and a data move-in processing module 203.
The data migration control module 201 is located in the control module and is configured to notify the data move-out processing module 202, according to the user's selection, to perform data migration. The notification includes the address of the data move-in node and the virtual nodes where the migration data resides.
The data move-out processing module 202 is located on the data move-out node and is configured to send, according to the notification, the data and the Redo log to the data move-in processing module 203 in the form of data packets.
The data move-in processing module 203 is located on the data move-in node and is configured to receive and store the data packets, receive the Redo log, and verify and update the received migration data according to the Redo log.
The data move-out processing module 202 is configured to receive client requests and to record the response of the data move-out node, that is, the operation it performs, in the Redo log.
The data move-out processing module 202 is further configured to notify the data migration control module 201 once it determines that the data packets of the current Redo log have been sent, and to send the Redo log data generated between finishing that transmission and stopping the service to the data move-in processing module 203.
Correspondingly, the data migration control module 201 is configured to suspend the service of the data move-out node.
The data move-in processing module 203 is further configured to read the operations recorded in the received Redo log and compare the data of each operation with the migration data: if they are the same, it does nothing; if they differ, it re-executes the operation, which guarantees data consistency. After executing the operations recorded in the Redo log, it notifies the data migration control module 201.
The data migration control module 201 is configured to notify all nodes and clients, upon the notification from the data move-in processing module 203, that the services related to the migrated data, previously provided by the data move-out node, are now provided by the data move-in node, and to replace the address of the data move-out node with the address of the data move-in node among the node addresses that the routing table holds for each Key.
The data migration control module 201 is further configured to clear the migrated data from the data move-out node.
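The final switch-over performed by the data migration control module 201 can be sketched as below. This is a simplified, single-process illustration: the function name `cut_over`, the dictionary standing in for the data move-out node's store, and the node names are assumptions, and the notification of nodes and clients is omitted.

```python
def cut_over(routing_table, migrated_keys, source, target, source_store):
    """Point migrated Keys at the move-in node, then clear them from the move-out node."""
    for key in migrated_keys:
        # Replace the move-out node's address in place, so the order of the
        # coordinating node and replica nodes for the Key is preserved.
        routing_table[key] = [
            target if address == source else address
            for address in routing_table[key]
        ]
        # Clear the migrated entry from the data move-out node.
        source_store.pop(key, None)


routing_table = {"k1": ["node-a", "node-b"], "k2": ["node-c", "node-a"]}
source_store = {"k1": "v1", "k2": "v2", "k3": "v3"}

cut_over(routing_table, ["k1", "k2"], "node-a", "node-d", source_store)
```

Entries that were not part of the migration, such as `k3` here, remain on the data move-out node.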
The above are merely preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A data migration method, comprising:
notifying a data move-out node, according to a user's selection, to perform data migration;
the data move-out node sending migration data and a redo log directly to a data move-in node; and
the data move-in node receiving and storing the migration data, and verifying and updating the migration data according to the redo log.
2. The method according to claim 1, wherein the notification comprises the address of the data move-in node and the address of the virtual node where the migration data resides.
3. The method according to claim 2, wherein the data move-out node sending the migration data and the redo log directly to the data move-in node comprises:
the data move-out node sending, according to the address of the data move-in node, the migration data at the virtual node address and the redo log to the data move-in node in the form of data packets, and notifying a control platform after the data packets of the current redo log have been sent, whereupon the control platform suspends the service of the data move-out node.
4. The method according to claim 1, wherein verifying and updating the migration data according to the redo log comprises:
the data move-in node determining, according to the received redo log, that the data corresponding to an operation recorded in the redo log differs from the migration data, and re-executing that operation.
5. The method according to any one of claims 1 to 4, wherein, after the migration data is verified and updated according to the redo log, the method further comprises:
the data move-in node notifying the control platform that the redo operations are complete, and the control platform notifying all nodes and clients that the data move-in node replaces the data move-out node in providing clients with services related to the migration data, and clearing the migration data from the data move-out node.
6. A data migration system, comprising a data migration control module, a data move-out processing module, and a data move-in processing module, wherein:
the data migration control module is configured to notify the data move-out processing module, according to a user's selection, to migrate data;
the data move-out processing module is configured to send migration data and a redo log to the data move-in processing module; and
the data move-in processing module is configured to receive and store the migration data, and to verify and update the received migration data according to the redo log.
7. The system according to claim 6, wherein
the data migration control module is configured to send to the data move-out processing module, according to the user's selection, a notification containing the address of the data move-in node and the address of the virtual node where the migration data resides.
8. The system according to claim 7, wherein
the data move-out processing module is configured to send, according to the address of the data move-in node, the migration data in the virtual node and the redo log to the data move-in processing module in the form of data packets, and to notify the data migration control module upon determining that the data packets of the current redo log have been sent; and
the data migration control module is configured to suspend the service of the data move-out node according to the notification from the data move-out processing module.
9. The system according to claim 6, wherein
the data move-in processing module is configured to read the operations recorded in the received redo log, determine that the data corresponding to an operation differs from the migration data, and re-execute that operation.
10. The system according to any one of claims 6 to 9, wherein
the data move-in processing module is configured to notify the data migration control module after the redo operations are complete; and
correspondingly, the data migration control module is configured to notify all nodes and clients that the data move-in node replaces the data move-out node in providing clients with services related to the migration data, and to clear the migration data from the data move-out node.
PCT/CN2011/073485 2010-10-21 2011-04-28 Data transfer method and system WO2012051845A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010514650.XA CN101997911B (en) 2010-10-21 2010-10-21 Data migration method and system
CN201010514650.X 2010-10-21

Publications (1)

Publication Number Publication Date
WO2012051845A1 (en) 2012-04-26

Family

ID=43787482

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/073485 WO2012051845A1 (en) 2010-10-21 2011-04-28 Data transfer method and system

Country Status (2)

Country Link
CN (1) CN101997911B (en)
WO (1) WO2012051845A1 (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101997911B (en) * 2010-10-21 2015-07-22 中兴通讯股份有限公司 Data migration method and system
CN103067433B (en) * 2011-10-24 2016-04-20 阿里巴巴集团控股有限公司 A kind of data migration method of distributed memory system, equipment and system
CN103166860A (en) * 2011-12-19 2013-06-19 中兴通讯股份有限公司 Method and device for peer-to-peer (P2P) overlay network data migration
CN103491124B (en) * 2012-06-14 2018-08-14 南京中兴软件有限责任公司 A kind of method that multimedia message data is handled and distributed cache system
CN102780777B (en) * 2012-07-19 2015-04-08 北京蓝汛通信技术有限责任公司 Log collecting method and log collecting system
CN102855299A (en) * 2012-08-16 2013-01-02 上海引跑信息科技有限公司 Method for realizing iterative migration of distributed database without interrupting service
US9378068B2 (en) 2013-03-13 2016-06-28 International Business Machines Corporation Load balancing for a virtual networking system
US9438670B2 (en) 2013-03-13 2016-09-06 International Business Machines Corporation Data replication for a virtual networking system
CN104468674B (en) * 2013-09-25 2020-01-14 南京中兴新软件有限责任公司 Data migration method and device
CN105528368B (en) * 2014-09-30 2019-03-12 北京金山云网络技术有限公司 A kind of database migration method and device
CN104486373A (en) * 2014-11-21 2015-04-01 华为技术有限公司 Lock resource migration method, nodes and distributed system
CN106034080A (en) * 2015-03-10 2016-10-19 中兴通讯股份有限公司 Metadata migration method and metadata migration device in distributed system
CN105227366B (en) * 2015-10-15 2018-08-31 深圳市金证科技股份有限公司 Safeguard the method and system of the consistency of distributed data
CN106843745A (en) * 2015-12-03 2017-06-13 南京中兴新软件有限责任公司 Capacity expansion method and device
CN105574217B (en) * 2016-03-16 2019-04-30 中国联合网络通信集团有限公司 The method of data synchronization and device of distributed relation database
CN106210041B (en) * 2016-07-05 2019-09-20 杭州华为数字技术有限公司 A kind of method for writing data and server end network interface card
CN108121719B (en) * 2016-11-28 2020-06-30 北京国双科技有限公司 Method and device for realizing data extraction conversion loading ETL
CN107153512B (en) * 2017-04-01 2020-05-08 华为技术有限公司 Data migration method and device
CN107153680B (en) * 2017-04-18 2021-02-02 北京思特奇信息技术股份有限公司 Method and system for on-line node expansion of distributed memory database
CN111338806B (en) * 2020-05-20 2020-09-04 腾讯科技(深圳)有限公司 Service control method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101079902A (en) * 2007-06-29 2007-11-28 清华大学 A great magnitude of data hierarchical storage method
CN101442435A (en) * 2008-12-25 2009-05-27 华为技术有限公司 Method and apparatus for managing business data of distributed system and distributed system
CN101727504A (en) * 2010-01-29 2010-06-09 成都市华为赛门铁克科技有限公司 Method and device for migrating data of file system
CN101997911A (en) * 2010-10-21 2011-03-30 中兴通讯股份有限公司 Data migration method and system

Also Published As

Publication number Publication date
CN101997911A (en) 2011-03-30
CN101997911B (en) 2015-07-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11833751

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11833751

Country of ref document: EP

Kind code of ref document: A1