WO2020238858A1 - Data migration method, apparatus, and computer-readable storage medium - Google Patents

Data migration method, apparatus, and computer-readable storage medium

Info

Publication number
WO2020238858A1
Authority
WO
WIPO (PCT)
Prior art keywords
migration
partition
data
cluster server
preset
Prior art date
Application number
PCT/CN2020/092128
Other languages
English (en)
French (fr)
Inventor
周伟
曾岩
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2020238858A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning

Definitions

  • This application relates to the technical field of Fintech, and in particular to a data migration method, device, and computer-readable storage medium.
  • Hadoop, an open-source project with distributed storage and computing capabilities, uses a parallel computing framework for efficient distributed computing and has its own distributed file system, HDFS, which provides scalable and robust data storage. As a result, it quickly gained attention across industries and is widely used in finance, commerce, education, and other fields.
  • An important problem that a distributed system must solve is determining the data distribution strategy within the cluster.
  • When a cluster reaches the upper limit of its capacity, data migration is required to relieve the storage and load pressure on the original servers.
  • hdfs data files and hive (a data warehouse tool based on Hadoop) metadata
  • distcp (distributed copy)
  • The main purpose of this application is to provide a data migration method, device, and computer-readable storage medium that address the low efficiency of related data migration approaches and their inability to migrate in batches on demand.
  • This application provides a data migration method, which includes:
  • executing the executable table creation statement to create the corresponding hive table, and migrating the corresponding data files to the hive table based on the migration information.
  • the migration information includes partition information and data file storage locations, and the step of migrating the corresponding data files to the hive table based on the migration information includes:
  • the partition information indicates that no partition exists
  • calling a preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files to the hive table.
  • the method further includes:
  • the partition information indicates that partitions exist
  • calling the first preset framework to log in remotely to the migrate-out cluster server, querying the partition values of the database table on the migrate-out cluster server, and outputting the partition values to the designated directory under the pre-designated file of the migrate-out cluster server;
  • calling the preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files to the corresponding partitions of the hive table.
  • the data migration method further includes:
  • the first preset framework is an expect+ssh framework
  • the second preset framework is an expect+scp framework
  • the preset interface is a distributed copy distcp interface.
  • This application also provides a data migration device, which includes a memory, a processor, and a data migration program stored in the memory and runnable on the processor. When executed by the processor, the data migration program implements the following steps:
  • executing the executable table creation statement to create the corresponding hive table, and migrating the corresponding data files to the hive table based on the migration information.
  • the migration information includes partition information and data file storage locations
  • the data migration program implements the following steps when executed by the processor:
  • the partition information indicates that no partition exists
  • calling a preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files to the hive table.
  • the partition information indicates that partitions exist
  • calling the first preset framework to log in remotely to the migrate-out cluster server, querying the partition values of the database table on the migrate-out cluster server, and outputting the partition values to the designated directory under the pre-designated file of the migrate-out cluster server;
  • calling the preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files to the corresponding partitions of the hive table.
  • This application also provides a computer-readable storage medium on which a data migration program is stored. When executed by a processor, the data migration program implements the steps of the data migration method described above.
  • This application provides a data migration method, device, and computer-readable storage medium that cyclically read a preset configuration file and, when a migrate-in message is read, obtain the migrate-out cluster server in the migrate-in message;
  • the first preset framework is called to log in remotely to the migrate-out cluster server, a query for the table creation statement of the database table in the migrate-in message is executed on the migrate-out cluster server, and the table creation statement is output to the designated directory under the pre-designated file of the migrate-out cluster server; the second preset framework is called to synchronize the table creation statement from the migrate-out cluster server, and table-format parsing is performed on the synchronized table creation statement to obtain migration information and an executable table creation statement;
  • the executable table creation statement is executed to create the corresponding hive table, and the corresponding data files are migrated to the created hive table based on the migration information.
  • In this way, this application achieves complete cross-cluster migration of hive tables and, by reading the migrate-in messages directly from the configuration file, performs on-demand batch migration automatically. Compared with related data migration processes, in which the clusters and database tables to be migrated must be entered manually in several passes, this application reduces human intervention and thereby improves the overall efficiency of migration.
  • FIG. 1 is a schematic diagram of a device structure of a hardware operating environment involved in a solution of an embodiment of the application
  • FIG. 2 is a schematic flowchart of the first embodiment of the data migration method of this application.
  • FIG. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solution of the embodiment of the application.
  • the data migration device in the embodiment of the application may be a smartphone, or a terminal device such as a PC (personal computer), a tablet computer, or a portable computer.
  • the data migration device may include a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • the communication bus 1002 is used to implement connection and communication between these components.
  • the user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface.
  • the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface).
  • the memory 1005 may be high-speed RAM or non-volatile memory, such as disk storage.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
  • the structure shown in FIG. 1 does not limit the data migration device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • a memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a data migration program.
  • the network interface 1004 is mainly used to connect to a back-end server and communicate with the back-end server;
  • the user interface 1003 is mainly used to connect to a client and communicate with the client;
  • the processor 1001 can be used to call the data migration program stored in the memory 1005 and execute each step of the following data migration method.
  • This application provides a data migration method.
  • FIG. 2 is a schematic flowchart of a first embodiment of a data migration method according to this application.
  • the data migration method includes:
  • Step S10: cyclically read a preset configuration file, read a migrate-in message from the preset configuration file, and obtain the migrate-out cluster server in the migrate-in message;
  • the data migration method of this embodiment is implemented by a data migration device.
  • the device is described taking the migrate-in cluster server as an example, where a cluster server is a server that can interact with a hadoop (a distributed system infrastructure) cluster
  • the migrate-in cluster server is the server corresponding to the target hadoop cluster to which the data is migrated.
  • the server corresponding to hadoop cluster A is called the migrate-in cluster server
  • the server corresponding to hadoop cluster B is called the migrate-out cluster server.
  • each cluster server reads the preset configuration file cyclically.
  • when a migrate-in message is read from the preset configuration file, that is, a message indicating that data of another cluster is to be migrated into this migrate-in cluster server,
  • the migrate-out cluster server in the migrate-in message is obtained.
  • the preset configuration file records the cluster information and database table information for the data to be migrated. It can be set manually in batches, or set automatically from a data migration request when staff initiate such a request through software.
  • when tables a, b, and c under database B need to be migrated from hadoop cluster B to tables a, b, and c of database E in hadoop cluster A, the corresponding entries can be configured in the preset configuration file:
  • after starting its main program, cluster server A can cyclically read the preset configuration file, and when a migrate-in message is read from it, the migrate-out cluster server in that message can be obtained.
  • the migrate-out cluster server is cluster server B.
  • Step S20: call the first preset framework to log in remotely to the migrate-out cluster server, execute on the migrate-out cluster server a query for the table creation statement of the database table in the migrate-in message, and output the table creation statement to the designated directory under the pre-designated file of the migrate-out cluster server;
  • After the migrate-out cluster server is obtained, the first preset framework is called to log in remotely to it, so as to execute there a query for the table creation statement of the database table in the migrate-in message and output the table creation statement to the designated directory under the pre-designated file of the migrate-out cluster server.
  • the first preset framework is the expect+ssh framework, where expect is software for automating interactive sessions
  • ssh is short for secure shell, an encrypted network protocol
  • its common applications include remote command-line login and remote command execution, but any network service can be secured through ssh.
  • under Linux (an operating system)
  • expect can be used to log in over ssh automatically and execute scripts.
  • Step S30: call the second preset framework to synchronize the table creation statement from the migrate-out cluster server, and perform table-format parsing on the synchronized table creation statement to obtain migration information and an executable table creation statement;
  • the second preset framework is then called to synchronize the table creation statement from the migrate-out cluster server, and the synchronized table creation statement is parsed to obtain migration information and an executable table creation statement.
  • the second preset framework is the expect+scp framework, where expect is software for automating interactive sessions and scp, short for secure copy, is a command for encrypted transfer between servers; expect+scp can be used to transfer files across machines and synchronize files automatically.
  • Linux's sed (a Linux command)
  • awk (a text processing tool)
  • the migration information can also be obtained from the parsing; it includes partition information and data file storage locations.
  • Step S40 Execute the executable table building statement, create a corresponding hive table, and migrate the corresponding data file to the hive table based on the migration information.
  • the migration information includes partition information and data file storage locations, and the step of "migrating corresponding data files to the hive table based on the migration information" includes:
  • Step a1: determine whether the partition information indicates that no partition exists or that partitions exist
  • Step a2: if the partition information indicates that no partition exists, call a preset interface to obtain the data files corresponding to the data file storage location, and migrate the obtained data files to the hive table.
  • the preset interface is called to obtain the data files corresponding to the data file storage location (that is, the data files stored at that location), and the obtained data files are then migrated to the hive table.
  • the preset interface is a distcp (distributed copy) interface.
  • the data files can also be migrated by disk copy.
  • after step a2, the method further includes:
  • Step a3: if the partition information indicates that partitions exist, call the first preset framework to log in remotely to the migrate-out cluster server, query the partition values of the database table on the migrate-out cluster server, and output the partition values to the designated directory under the pre-designated file of the migrate-out cluster server;
  • the first preset framework is the expect+ssh framework
  • expect is software for automating interactive sessions
  • ssh is short for secure shell, an encrypted network protocol
  • its common applications include remote command-line login and remote command execution, but any network service can be secured through ssh.
  • under Linux (an operating system)
  • expect can be used to log in over ssh automatically and execute scripts.
  • Step a4: call the second preset framework to synchronize the partition values from the migrate-out cluster server, and perform partition-format parsing on the synchronized partition values to obtain an executable command for adding partitions;
  • the second preset framework is then called to synchronize the partition values from the migrate-out cluster server, and the synchronized partition values are parsed to obtain an executable command for adding partitions.
  • the second preset framework is the expect+scp framework.
  • the partition values queried for table a in cluster B are:
  • after the expect+scp framework has been called and the contents of the partition file have been processed, the partitions can be converted into:
  • Step a5: execute the command for adding partitions, and add the corresponding partitions to the hive table
  • Step a6: call the preset interface to obtain the data files corresponding to the data file storage location, and migrate the obtained data files to the corresponding partitions of the hive table.
  • After the executable command for adding partitions is obtained, it is executed to add the corresponding partitions to the created hive table. The preset interface is then called to obtain the data files corresponding to the data file storage location, and the obtained data files are migrated to the corresponding partitions of the hive table. The preset interface is the distcp interface.
  • The embodiment of this application provides a data migration method that cyclically reads a preset configuration file and, when a migrate-in message is read, obtains the migrate-out cluster server in the migrate-in message; calls the first preset framework to log in remotely to the migrate-out cluster server, executes there a query for the table creation statement of the database table in the migrate-in message, and outputs the table creation statement to the designated directory under the pre-designated file of the migrate-out cluster server; calls the second preset framework to synchronize the table creation statement from the migrate-out cluster server and parses the synchronized statement to obtain migration information and an executable table creation statement; and executes the executable table creation statement to create the corresponding hive table and migrates the corresponding data files to the created hive table based on the migration information.
  • In this way, the embodiment of this application achieves complete cross-cluster migration of hive tables and, by reading the migrate-in messages directly from the configuration file, performs on-demand batch migration automatically, in contrast to related data migration processes, in which the clusters and database tables to be migrated must be entered manually in several passes.
  • The embodiment of this application reduces human intervention and thereby improves the overall efficiency of migration.
  • the data migration method further includes:
  • Step A: obtain, through the preset interface, the size of the data files before migration and the size of the data files after migration
  • after the data migration is complete, the sizes of the data files before and after migration can be compared to detect whether the data was migrated completely.
  • the size of the data files before migration and the size of the data files after migration can be obtained through the preset interface, where the preset interface is the distcp interface.
  • Step B: upon determining that the size of the data files before migration and the size of the data files after migration are inconsistent, generate a corresponding error message.
  • This application can be applied to data migration in financial institutions. For example, when a bank relocates a computer room, it needs to migrate data from the old clusters in the original computer room to the new clusters in the new computer room. The migration information can be set in the preset configuration file, and the main program of the cluster server corresponding to each new cluster in the new computer room is then started, so that each cluster server cyclically reads the preset configuration file and, when a migrate-in message is read from it, obtains the migrate-out cluster server in the migrate-in message; calls the first preset framework to log in remotely to the migrate-out cluster server, executes there a query for the table creation statement of the database table in the migrate-in message, and outputs the table creation statement to the designated directory under the pre-designated file of the migrate-out cluster server; calls the second preset framework to synchronize the table creation statement from the migrate-out cluster server and parses the synchronized statement to obtain migration information and executable table creation statements; execute the executable table creation statements
  • the application also provides a data migration device.
  • the data migration device includes: a memory, a processor, and a data migration program stored on the memory and running on the processor, and the data migration program implements the following steps when executed by the processor:
  • the executable table building statement is executed, the corresponding hive table is created, and the corresponding data file is migrated to the hive table based on the migration information.
  • the migration information includes partition information and data file storage locations
  • the data migration program implements the following steps when executed by the processor:
  • the partition information indicates that no partition exists
  • calling a preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files to the hive table.
  • the partition information indicates that partitions exist
  • calling the first preset framework to log in remotely to the migrate-out cluster server, querying the partition values of the database table on the migrate-out cluster server, and outputting the partition values to the designated directory under the pre-designated file of the migrate-out cluster server;
  • calling the preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files to the corresponding partitions of the hive table.
  • the first preset framework is an expect+ssh framework
  • the second preset framework is an expect+scp framework
  • the preset interface is a distributed copy (distcp) interface.
  • the method implemented when the data migration program is executed corresponds to the steps in the embodiments of the data migration method, and its functions and implementation processes are not repeated here.
  • This application also provides a computer-readable storage medium on which a data migration program is stored.
  • When the data migration program is executed by a processor, the steps of the data migration method described in any of the above embodiments are implemented.
  • the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform. They can also be implemented in hardware, but in many cases the former is the better choice.
  • the technical solution of this application, in essence or in the part that contributes over the related technology, can be embodied as a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk) and including a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

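This application discloses a data migration method, apparatus, and computer-readable storage medium. The method includes: cyclically reading a preset configuration file and, upon reading a migrate-in message from the preset configuration file, obtaining the migrate-out cluster server; calling a first preset framework to log in remotely to the migrate-out cluster server, executing on the migrate-out cluster server a query for the table creation statement of the database table named in the migrate-in message, and outputting the table creation statement to a designated directory under a pre-designated file of the migrate-out cluster server; calling a second preset framework to synchronize the table creation statement from the migrate-out cluster server, and performing table-format parsing on the synchronized table creation statement to obtain migration information and an executable table creation statement; and executing the executable table creation statement to create the corresponding hive table, and migrating the corresponding data files to the hive table based on the migration information.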

Description

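Data migration method, apparatus, and computer-readable storage medium
This application claims priority to Chinese patent application No. 201910461350.0, filed on May 30, 2019 and entitled "Data migration method, apparatus, device, and computer-readable storage medium", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the technical field of financial technology (Fintech), and in particular to a data migration method, apparatus, and computer-readable storage medium.
Background
With the development of computer technology, more and more technologies (big data, distributed systems, blockchain, artificial intelligence, and so on) are being applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). However, the security and real-time requirements of the financial industry also place higher demands on these technologies.
As information technology continues to develop, the volume of data keeps growing, and enterprise data of all kinds has seen explosive growth, so the ability to process large-scale data is urgently needed. Hadoop, an open-source project with distributed storage and computing capabilities, uses a parallel computing framework for efficient distributed computing and has its own distributed file system, HDFS, which provides scalable and robust data storage. As a result, it quickly gained attention across industries and is widely used in finance, commerce, education, and other fields.
An important problem that a distributed system, as a mass data storage system, must solve is deciding how data is distributed within the cluster. When the storage and processing capacity of a database cluster reaches the cluster's upper limit, data migration is needed to relieve the storage and load pressure on the original servers. At present, data migration usually handles hdfs data files and hive (a data warehouse tool based on Hadoop) metadata as two separate parts. Specifically, the hdfs data files are first migrated across big data clusters by disk copy, distcp (distributed copy), or similar means, and then the hive table structures and partition values (that is, the hive metadata) are exported in bulk from the hive metastore (a service that stores hive metadata) and migrated. This process requires the clusters and database tables to be migrated to be entered manually in several passes, which makes data migration inefficient and rules out on-demand batch migration.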
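Summary of the Invention
Technical Problem
Solution to the Problem
Technical Solution
The main purpose of this application is to provide a data migration method, apparatus, and computer-readable storage medium that address the low efficiency of related data migration approaches and their inability to migrate in batches on demand.
To achieve the above purpose, this application provides a data migration method, which includes:
cyclically reading a preset configuration file, reading a migrate-in message from the preset configuration file, and obtaining the migrate-out cluster server in the migrate-in message;
calling a first preset framework to log in remotely to the migrate-out cluster server, executing on the migrate-out cluster server a query for the table creation statement of the database table in the migrate-in message, and outputting the table creation statement to a designated directory under a pre-designated file of the migrate-out cluster server;
calling a second preset framework to synchronize the table creation statement from the migrate-out cluster server, and performing table-format parsing on the synchronized table creation statement to obtain migration information and an executable table creation statement;
executing the executable table creation statement to create the corresponding hive table, and migrating the corresponding data files to the hive table based on the migration information.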
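In an embodiment, the migration information includes partition information and a data file storage location, and the step of migrating the corresponding data files to the hive table based on the migration information includes:
determining whether the partition information indicates that no partition exists or that partitions exist;
if the partition information indicates that no partition exists, calling a preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files to the hive table.
In an embodiment, after the step of determining whether the partition information indicates that no partition exists or that partitions exist, the method further includes:
if the partition information indicates that partitions exist, calling the first preset framework to log in remotely to the migrate-out cluster server, querying the partition values of the database table on the migrate-out cluster server, and outputting the partition values to the designated directory under the pre-designated file of the migrate-out cluster server;
calling the second preset framework to synchronize the partition values from the migrate-out cluster server, and performing partition-format parsing on the synchronized partition values to obtain an executable command for adding partitions;
executing the command for adding partitions to add the corresponding partitions to the hive table;
calling the preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files to the corresponding partitions of the hive table.
In an embodiment, the data migration method further includes:
obtaining, through the preset interface, the size of the data files before migration and the size of the data files after migration;
upon determining that the size of the data files before migration and the size of the data files after migration are inconsistent, generating a corresponding error message.
In an embodiment, the first preset framework is an expect+ssh framework, the second preset framework is an expect+scp framework, and the preset interface is a distributed copy (distcp) interface.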
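In addition, to achieve the above purpose, this application also provides a data migration apparatus, which includes a memory, a processor, and a data migration program stored in the memory and runnable on the processor. When executed by the processor, the data migration program implements the following steps:
cyclically reading a preset configuration file, reading a migrate-in message from the preset configuration file, and obtaining the migrate-out cluster server in the migrate-in message;
calling a first preset framework to log in remotely to the migrate-out cluster server, executing on the migrate-out cluster server a query for the table creation statement of the database table in the migrate-in message, and outputting the table creation statement to a designated directory under a pre-designated file of the migrate-out cluster server;
calling a second preset framework to synchronize the table creation statement from the migrate-out cluster server, and performing table-format parsing on the synchronized table creation statement to obtain migration information and an executable table creation statement;
executing the executable table creation statement to create the corresponding hive table, and migrating the corresponding data files to the hive table based on the migration information.
In an embodiment, the migration information includes partition information and a data file storage location, and the data migration program, when executed by the processor, implements the following steps:
determining whether the partition information indicates that no partition exists or that partitions exist;
if the partition information indicates that no partition exists, calling a preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files to the hive table.
In an embodiment, the data migration program, when executed by the processor, implements the following steps:
if the partition information indicates that partitions exist, calling the first preset framework to log in remotely to the migrate-out cluster server, querying the partition values of the database table on the migrate-out cluster server, and outputting the partition values to the designated directory under the pre-designated file of the migrate-out cluster server;
calling the second preset framework to synchronize the partition values from the migrate-out cluster server, and performing partition-format parsing on the synchronized partition values to obtain an executable command for adding partitions;
executing the command for adding partitions to add the corresponding partitions to the hive table;
calling the preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files to the corresponding partitions of the hive table.
In addition, to achieve the above purpose, this application also provides a computer-readable storage medium on which a data migration program is stored. When executed by a processor, the data migration program implements the steps of the data migration method described above.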
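This application provides a data migration method, apparatus, and computer-readable storage medium. A preset configuration file is read cyclically and, when a migrate-in message is read, the migrate-out cluster server in the migrate-in message is obtained. The first preset framework is called to log in remotely to the migrate-out cluster server, a query for the table creation statement of the database table in the migrate-in message is executed there, and the table creation statement is output to the designated directory under the pre-designated file of the migrate-out cluster server. The second preset framework is called to synchronize the table creation statement from the migrate-out cluster server, and table-format parsing is performed on the synchronized statement to obtain migration information and an executable table creation statement. The executable table creation statement is executed to create the corresponding hive table, and the corresponding data files are migrated to the created hive table based on the migration information. In this way, this application achieves complete cross-cluster migration of hive tables and, by reading the migrate-in messages directly from the configuration file, performs on-demand batch migration automatically. Compared with related data migration processes, in which the clusters and database tables to be migrated must be entered manually in several passes, this application reduces human intervention and thereby improves the overall efficiency of migration.
Advantageous Effects of the Invention
Brief Description of the Drawings
Description of the Drawings
FIG. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solutions of the embodiments of this application;
FIG. 2 is a schematic flowchart of the first embodiment of the data migration method of this application.
The realization of the purpose, the functional characteristics, and the advantages of this application are further described below in conjunction with the embodiments and with reference to the drawings.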
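Embodiments of the Invention
Mode for Carrying Out the Invention
It should be understood that the specific embodiments described here serve only to explain this application and are not intended to limit it.
Referring to FIG. 1, FIG. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solutions of the embodiments of this application.
The data migration device of the embodiments of this application may be a smartphone, or a terminal device such as a PC (personal computer), a tablet computer, or a portable computer.
As shown in FIG. 1, the data migration device may include a processor 1001 (for example, a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 implements connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be high-speed RAM or non-volatile memory, such as disk storage. The memory 1005 may optionally also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the structure shown in FIG. 1 does not limit the data migration device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a data migration program.
In the terminal shown in FIG. 1, the network interface 1004 is mainly used to connect to a back-end server and exchange data with it; the user interface 1003 is mainly used to connect to a client and exchange data with it; and the processor 1001 can be used to call the data migration program stored in the memory 1005 and execute each step of the following data migration method.
Based on the above hardware structure, the embodiments of the data migration method of this application are proposed.
This application provides a data migration method.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of the first embodiment of the data migration method of this application.
In this embodiment, the data migration method includes: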
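Step S10: cyclically read a preset configuration file, read a migrate-in message from the preset configuration file, and obtain the migrate-out cluster server in the migrate-in message;
With the development of computer technology and the arrival of big data, computer technology is being applied ever more widely in financial institutions (such as banks, insurers, securities firms, and wealth management institutions), and these institutions now place higher demands on the efficiency and accuracy of data migration. As data volumes grow, financial institutions currently migrate hdfs data files and hive (a data warehouse tool based on Hadoop) metadata as two separate parts. Such a migration scheme is fragmented and incomplete, and it requires the clusters and database tables to be migrated to be entered manually in several passes, which makes data migration inefficient and rules out on-demand batch migration. This falls well short of what financial institutions require.
The data migration method of this embodiment is implemented by a data migration device, described here taking the migrate-in cluster server as an example. A cluster server is a server that can interact with a hadoop (a distributed system infrastructure) cluster, and the migrate-in cluster server is the server corresponding to the target hadoop cluster to which data is migrated. For example, when tables a, b, and c under database B are migrated from hadoop cluster B to tables a, b, and c of database E in hadoop cluster A, the server corresponding to hadoop cluster A is called the migrate-in cluster server and the server corresponding to hadoop cluster B is called the migrate-out cluster server. In this embodiment, each cluster server cyclically reads the preset configuration file. When a migrate-in message is read from the preset configuration file, that is, a message indicating that data of another cluster is to be migrated into this migrate-in cluster server, the migrate-out cluster server in the migrate-in message is obtained. The preset configuration file records the cluster information and database table information for the data to be migrated. It can be set manually in batches, or set automatically from a data migration request when staff initiate such a request through software. In the example above, when tables a, b, and c under database B need to be migrated from hadoop cluster B to tables a, b, and c of database E in hadoop cluster A, the preset configuration file can be configured as follows: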
B a E a
B b E b
B c E c
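Accordingly, after starting its main program, cluster server A can cyclically read the preset configuration file. When a migrate-in message is read from the preset configuration file, the migrate-out cluster server in that message is obtained, which here is cluster server B.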
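As an illustration of how such a configuration could be read, the following minimal Python sketch parses the three lines above. The field names (source, source_table, target_db, target_table), the interpretation of the four columns, and the skipping of "#" comment lines are assumptions made for this example; the application does not specify the file format beyond the sample lines.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MigrationEntry:
    # Hypothetical field names; the source only shows lines such as "B a E a".
    source: str        # first column: identifies the migrate-out side ("B")
    source_table: str  # second column: table to migrate ("a")
    target_db: str     # third column: database on the migrate-in side ("E")
    target_table: str  # fourth column: table on the migrate-in side ("a")


def parse_config(text: str) -> List[MigrationEntry]:
    """Parse whitespace-separated entries; blank lines and '#' comments are skipped."""
    entries = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"line {line_no}: expected 4 fields, got {len(fields)}")
        entries.append(MigrationEntry(*fields))
    return entries


entries = parse_config("B a E a\nB b E b\nB c E c\n")
# The set of migrate-out sides named by the migrate-in messages.
sources = {entry.source for entry in entries}
```

A cluster server's main program could call parse_config on each pass of its read loop and act on any entries it has not yet processed.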
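Step S20: call the first preset framework to log in remotely to the migrate-out cluster server, execute on the migrate-out cluster server a query for the table creation statement of the database table in the migrate-in message, and output the table creation statement to the designated directory under the pre-designated file of the migrate-out cluster server;
After the migrate-out cluster server is obtained, the first preset framework is called to log in remotely to it, so as to execute there a query for the table creation statement of the database table in the migrate-in message and output the table creation statement to the designated directory under the pre-designated file of the migrate-out cluster server. The first preset framework is the expect+ssh framework. expect is software for automating interactive sessions. ssh, short for secure shell, is an encrypted network protocol whose common applications include remote command-line login and remote command execution, although any network service can be secured through ssh. Under Linux (an operating system), expect can be used to log in over ssh automatically and execute scripts.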
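The remote query in step S20 can be pictured with the sketch below, which only builds the command line and does not run it. It assumes key-based ssh authentication in place of the expect-driven login described above, and the host name, user name, and output directory are hypothetical. The hive -e option and the "show create table" statement are standard Hive features.

```python
import shlex
from typing import List


def build_show_create_command(host: str, user: str, db: str, table: str,
                              out_dir: str) -> List[str]:
    """Return an ssh argv that dumps a table's DDL to a file on the remote host."""
    query = f"show create table {db}.{table}"
    out_file = f"{out_dir}/{db}.{table}.sql"
    # Quote both pieces so the remote shell sees the query as one argument.
    remote = f"hive -e {shlex.quote(query)} > {shlex.quote(out_file)}"
    return ["ssh", f"{user}@{host}", remote]


# Hypothetical host, user, and directory, for illustration only.
cmd = build_show_create_command("cluster-b.example", "hadoop", "B", "a", "/tmp/ddl")
```

The resulting list could be passed to subprocess.run; an expect script would wrap the same remote command when password prompts have to be answered.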
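Step S30: call the second preset framework to synchronize the table creation statement from the migrate-out cluster server, and perform table-format parsing on the synchronized table creation statement to obtain migration information and an executable table creation statement;
The second preset framework is then called to synchronize the table creation statement from the migrate-out cluster server, and table-format parsing is performed on the synchronized statement to obtain migration information and an executable table creation statement. The second preset framework is the expect+scp framework. expect is software for automating interactive sessions, and scp, short for secure copy, is a command for encrypted transfer between servers; expect+scp can be used to transfer files across machines and synchronize files automatically. The parsing of the synchronized table creation statement can be done with Linux text processing tools such as sed (a Linux command) or awk (a text processing tool). Specifically, to avoid an error when the table already exists in the target cluster (cluster A in the example above), sed is first used to replace the string "create table" with "create table if not exists". It is then determined whether a storage path is to be specified for table a. If one is specified, the path after "location" in the table creation statement is replaced with the specified path. If none is specified, everything from the "location" line onward is deleted from the table creation statement, so that the default path and default values of the other table properties are used. The result is the executable table creation statement. The parsing also yields the migration information, which includes partition information and the data file storage location.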
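The rewriting rules just described can be sketched in Python instead of sed or awk. The function below is an illustration under assumptions: it expects DDL in the layout that "show create table" typically prints (LOCATION on its own line, followed by a quoted path), and the name rewrite_ddl and its return shape are choices made for this example, not part of the application.

```python
import re
from typing import Optional, Tuple

_CREATE = re.compile(r"\bcreate\s+(external\s+)?table\b(?!\s+if\s+not\s+exists)", re.I)
_LOCATION = re.compile(r"^[ \t]*location\s+'([^']*)'", re.I | re.M)
_PARTITIONED = re.compile(r"\bpartitioned\s+by\b", re.I)


def rewrite_ddl(ddl: str, new_location: Optional[str] = None
                ) -> Tuple[str, Optional[str], bool]:
    """Return (executable DDL, original data location, whether the table is partitioned)."""
    # Rule 1: make the statement safe to run when the table already exists.
    out = _CREATE.sub(
        lambda m: f"CREATE {m.group(1) or ''}TABLE IF NOT EXISTS", ddl, count=1)
    loc = _LOCATION.search(out)
    old_location = loc.group(1) if loc else None
    if loc:
        if new_location is not None:
            # Rule 2: a target path was specified, so swap it in.
            out = out[:loc.start(1)] + new_location + out[loc.end(1):]
        else:
            # Rule 3: no path specified, so drop LOCATION and everything after it.
            out = out[:loc.start()].rstrip()
    return out, old_location, bool(_PARTITIONED.search(ddl))


sample = """CREATE TABLE `a`(
  `id` int)
PARTITIONED BY (
  `ds` string,
  `city` string)
LOCATION
  'hdfs://cluster-b/warehouse/b.db/a'
TBLPROPERTIES (
  'transient_lastDdlTime'='1559200000')"""

default_ddl, source_location, partitioned = rewrite_ddl(sample)
moved_ddl, _, _ = rewrite_ddl(sample, "hdfs://cluster-a/warehouse/e.db/a")
```

The second and third return values correspond to the migration information mentioned above: the data file storage location on the migrate-out side and whether partitions exist.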
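Step S40: execute the executable table creation statement to create the corresponding hive table, and migrate the corresponding data files to the hive table based on the migration information.
Finally, the executable table creation statement is executed to create the corresponding hive table, and the corresponding data files are migrated to the created hive table based on the migration information.
The migration information includes partition information and a data file storage location, and the step of "migrating the corresponding data files to the hive table based on the migration information" includes:
Step a1: determine whether the partition information indicates that no partition exists or that partitions exist;
Step a2: if the partition information indicates that no partition exists, call a preset interface to obtain the data files corresponding to the data file storage location, and migrate the obtained data files to the hive table.
If the partition information indicates that no partition exists, the preset interface is called to obtain the data files corresponding to the data file storage location (that is, the data files stored at that location), and the obtained data files are then migrated to the hive table. The preset interface is the distcp (distributed copy) interface. In specific embodiments, the data files can also be migrated by disk copy.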
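For the unpartitioned case, the copy in step a2 amounts to a single distcp invocation. The sketch below only assembles the command; the HDFS URIs are hypothetical, and the use of the -update option (which copies only files that differ and places the contents of the source directory into the target directory) is a choice made for this example, not something the application prescribes.

```python
from typing import List


def build_distcp_command(src: str, dst: str, update: bool = True) -> List[str]:
    """Return the argv for copying a table's data directory between clusters."""
    cmd = ["hadoop", "distcp"]
    if update:
        # Skip files already present and identical on the target.
        cmd.append("-update")
    cmd += [src, dst]
    return cmd


# Hypothetical source and target locations, for illustration only.
distcp_cmd = build_distcp_command(
    "hdfs://cluster-b/warehouse/b.db/a",
    "hdfs://cluster-a/warehouse/e.db/a",
)
```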
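In addition, after step a2, the method further comprises:
Step a3: if the partition information indicates that partitions exist, invoke the first preset framework to log in remotely to the migrate-out cluster server, execute on the migrate-out cluster server a query for the partition values of the database table, and output the partition values to the specified directory under the pre-specified file on the migrate-out cluster server;
If the partition information indicates that partitions exist, the first preset framework is invoked to log in remotely to the migrate-out cluster server, so that the query for the partition values of the database table is executed on that server and the partition values are output to the specified directory under the pre-specified file on that server. The first preset framework is the expect+ssh framework. expect is software for automating interactive sessions; ssh, short for secure shell, is an encrypted network protocol whose common applications include remote command-line login and remote command execution, although any network service can be secured with ssh. On Linux (an operating system), expect can be used to log in over ssh automatically and run scripts.
Step a4: invoke the second preset framework to synchronize the partition values on the migrate-out cluster server, and perform partition formatting and parsing on the synchronized partition values to obtain an executable command for adding partitions;
Next, the second preset framework is invoked to synchronize the partition values on the migrate-out cluster server, and partition formatting and parsing is performed on the synchronized partition values to obtain an executable command for adding partitions. The second preset framework is the expect+scp framework. The partition formatting and parsing can be done with the Linux text-processing tools sed or awk. Specifically, the content of the partition file is processed first: 1) delete the first line of the partition file; 2) replace / with ',; 3) replace = with ='; 4) prepend partition( to each line; 5) append ') to each line. The string use E;alter table A add if not exists is then prepended, which yields an executable hive command for adding partitions. In the example above, the partition values queried on table a in cluster B are: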
partition
ds=20150529/city=sz
ds=20150529/city=sh
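After the partition file has been synchronized by the expect+scp framework and its content processed as described, the partitions are converted to:
partition(ds='20150529',city='sz')
partition(ds='20150529',city='sh')
Finally, the add-partition statement executable in cluster A has the form:
use E;alter table A add if not exists partition(ds='20150529',city='sz')
partition(ds='20150529',city='sh')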
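The five text substitutions and the final concatenation can be sketched in Python in place of sed or awk. The function name `build_add_partitions` is hypothetical, and the sketch assumes partition values contain neither `/` nor `=`.

```python
def build_add_partitions(show_partitions_output: str, database: str, table: str) -> str:
    """Turn the partition listing into one ALTER TABLE ... ADD PARTITION command."""
    lines = [line.strip() for line in show_partitions_output.splitlines() if line.strip()]
    specs = lines[1:]  # 1) drop the header line ("partition")
    clauses = []
    for spec in specs:
        body = spec.replace("/", "',")    # 2) close the previous value, add a comma
        body = body.replace("=", "='")    # 3) open a quote after every key
        clauses.append(f"partition({body}')")  # 4) and 5) wrap and close the last quote
    return f"use {database};alter table {table} add if not exists " + " ".join(clauses)
```

For the listing in the example, `build_add_partitions(listing, "E", "A")` produces the same text as the statement above, with the two partition clauses separated by a single space.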
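Step a5: execute the command for adding partitions to add the corresponding partitions to the hive table;
Step a6: invoke the preset interface to obtain the data files corresponding to the data file storage location, and migrate the obtained data files into the corresponding partitions of the hive table.
Once the executable command for adding partitions has been obtained, it is executed to add the corresponding partitions to the newly created hive table. The preset interface is then invoked to obtain the data files corresponding to the data file storage location, and the obtained data files are migrated into the corresponding partitions of the hive table. The preset interface is the distcp interface.
This embodiment of the present application provides a data migration method. A preset configuration file is read in a loop and, when a migrate-in message is read from the preset configuration file, the migrate-out cluster server named in the message is obtained. The first preset framework is invoked to log in remotely to the migrate-out cluster server, so that the query for the table creation statement of the database table named in the migrate-in message is executed there and the table creation statement is output to the specified directory under the pre-specified file on that server. The second preset framework is invoked to synchronize the table creation statement on the migrate-out cluster server, and table formatting and parsing is performed on the synchronized statement to obtain the migration information and an executable table creation statement. The executable table creation statement is executed to create the corresponding hive table, and the corresponding data files are migrated into the newly created hive table based on the migration information. In this way, the embodiment achieves complete cross-cluster migration of hive tables and performs on-demand batch migration automatically, simply by reading the migrate-in messages in the configuration file. Compared with related data migration processes, in which the clusters and database tables to be migrated must be entered by hand one batch at a time, the embodiment reduces manual intervention and thereby improves the overall efficiency of migration.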
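For a partitioned table, the copy of step a6 can be sketched as one distcp command per partition. This is only an illustration under stated assumptions: partition data is assumed to sit in the default Hive directory layout, where each partition directory is named after its `key=value/key=value` listing entry under the table location. The function name and URIs are hypothetical, and the real `-update` option is added on the assumption that the target partition directories already exist after step a5.

```python
from typing import Iterable, List


def partition_distcp_commands(src_location: str, dst_location: str,
                              partition_specs: Iterable[str]) -> List[List[str]]:
    """Return one distcp argument list per partition directory."""
    commands = []
    for spec in partition_specs:
        src = f"{src_location.rstrip('/')}/{spec}"
        dst = f"{dst_location.rstrip('/')}/{spec}"
        # -update copies the directory contents into the existing partition directory.
        commands.append(["hadoop", "distcp", "-update", src, dst])
    return commands
```

Copying per partition keeps each transfer small and makes a failed partition easy to retry; copying the whole table directory in one command is an equally valid design when all partitions are migrated.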
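Further, based on the first embodiment shown in FIG. 2, a second embodiment of the data migration method of the present application is proposed.
In this embodiment, the data migration method further comprises:
Step A: obtain, through the preset interface, the size of the data files before migration and the size of the data files after migration;
In this embodiment, once data migration is complete, the sizes of the data files before and after migration can be checked to detect whether the data was migrated in full. Specifically, the size of the data files before migration and the size after migration can be obtained through the preset interface, which is the distcp interface.
Step B: upon determining that the size of the data files before migration and the size after migration are inconsistent, generate corresponding error information.
The two sizes are then compared. If the size of the data files before migration differs from the size after migration, a problem occurred during data migration, and corresponding error information is generated. Specifically, the error information can be written to a log file.
In the related art, a separate script has to be written to check the sizes of the data files before and after migration in order to detect whether the migration succeeded. In this embodiment, the sizes before and after migration are obtained directly when migrating through the distcp interface and are then checked automatically, so data verification is completed automatically and determines whether the migration succeeded.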
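The comparison in steps A and B can be sketched as follows. How the two sizes are read from distcp is not detailed in the text, so the sketch takes them as plain byte counts. The function name, logger name and message wording are hypothetical.

```python
import logging
from typing import Optional

logger = logging.getLogger("data_migration")  # hypothetical logger name


def verify_sizes(table: str, bytes_before: int, bytes_after: int) -> Optional[str]:
    """Return None when the sizes match, otherwise log and return an error message."""
    if bytes_before == bytes_after:
        return None
    message = (
        f"migration check failed for {table}: "
        f"{bytes_before} bytes before migration, {bytes_after} bytes after"
    )
    logger.error(message)  # ends up in the log file if a file handler is configured
    return message
```

A size match does not prove the contents are identical; it is the inexpensive completeness check the embodiment describes, and a checksum comparison could be layered on top if stronger guarantees are needed.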
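The present application can be applied to data migration in financial institutions. For example, when a bank relocates a machine room, the data on the old clusters in the original machine room must be migrated to the new clusters in the new machine room. The migration information can be set in the preset configuration file in advance, and the main program of the cluster server corresponding to each new cluster in the new machine room is then started. Each cluster server reads the preset configuration file in a loop and, when it reads a migrate-in message from the preset configuration file, obtains the migrate-out cluster server named in the message. The first preset framework is invoked to log in remotely to the migrate-out cluster server, so that the query for the table creation statement of the database table named in the migrate-in message is executed there and the table creation statement is output to the specified directory under the pre-specified file on that server. The second preset framework is invoked to synchronize the table creation statement on the migrate-out cluster server, and table formatting and parsing is performed on the synchronized statement to obtain the migration information and an executable table creation statement. The executable table creation statement is executed to create the corresponding hive table, and the corresponding data files are migrated into the newly created hive table based on the migration information. In this way, complete cross-cluster migration of hive tables is achieved, and on-demand batch migration is performed automatically, simply by reading the migrate-in messages in the configuration file, so that the data on the old clusters is migrated to the new clusters. Compared with related data migration processes, in which the clusters and database tables to be migrated must be entered by hand one batch at a time, the embodiments of the present application reduce manual intervention and thereby improve the overall efficiency of migration.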
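The present application further provides a data migration apparatus.
The data migration apparatus comprises a memory, a processor, and a data migration program stored in the memory and executable on the processor, wherein the data migration program, when executed by the processor, implements the following steps:
reading a preset configuration file in a loop, reading a migrate-in message from the preset configuration file, and obtaining the migrate-out cluster server named in the migrate-in message;
invoking a first preset framework to log in remotely to the migrate-out cluster server, executing on the migrate-out cluster server a query for the table creation statement of the database table named in the migrate-in message, and outputting the table creation statement to a specified directory under a pre-specified file on the migrate-out cluster server;
invoking a second preset framework to synchronize the table creation statement on the migrate-out cluster server, and performing table formatting and parsing on the synchronized table creation statement to obtain migration information and an executable table creation statement;
executing the executable table creation statement to create a corresponding hive table, and migrating the corresponding data files into the hive table based on the migration information.
Further, the migration information comprises partition information and a data file storage location, and the data migration program, when executed by the processor, implements the following steps:
determining whether the partition information indicates that no partition exists or that partitions exist;
if the partition information indicates that no partition exists, invoking a preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files into the hive table.
Further, the data migration program, when executed by the processor, implements the following steps:
if the partition information indicates that partitions exist, invoking the first preset framework to log in remotely to the migrate-out cluster server, executing on the migrate-out cluster server a query for the partition values of the database table, and outputting the partition values to the specified directory under the pre-specified file on the migrate-out cluster server;
invoking the second preset framework to synchronize the partition values on the migrate-out cluster server, and performing partition formatting and parsing on the synchronized partition values to obtain an executable command for adding partitions;
executing the command for adding partitions to add the corresponding partitions to the hive table;
invoking the preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files into the corresponding partitions of the hive table.
Further, the data migration program, when executed by the processor, implements the following steps:
obtaining, through the preset interface, the size of the data files before migration and the size of the data files after migration;
upon determining that the size of the data files before migration and the size after migration are inconsistent, generating corresponding error information.
Further, the first preset framework is the expect+ssh framework, the second preset framework is the expect+scp framework, and the preset interface is the distributed copy (distcp) interface.
The method implemented when the data migration program is executed corresponds to the steps of the data migration method embodiments above, and its functions and implementation are not repeated here.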
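The present application further provides a computer-readable storage medium on which a data migration program is stored, wherein the data migration program, when executed by a processor, implements the steps of the data migration method according to any one of the embodiments above.
The specific embodiments of the computer-readable storage medium of the present application are substantially the same as the embodiments of the data migration method above and are not repeated here.
It should be noted that, in this document, the terms "comprise", "include" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that comprises a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or system that comprises the element.
The serial numbers of the embodiments of the present application above are for description only and do not indicate the relative merits of the embodiments.
From the description of the embodiments above, those skilled in the art will clearly understand that the methods of the embodiments can be implemented by software plus the necessary general-purpose hardware platform, and can of course also be implemented by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of the present application, in essence or in the part that contributes to the related art, can be embodied as a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to perform the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structural or procedural transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, likewise falls within the scope of patent protection of the present application.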

Claims (9)

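  1. A data migration method, wherein the data migration method comprises:
    reading a preset configuration file in a loop, reading a migrate-in message from the preset configuration file, and obtaining the migrate-out cluster server named in the migrate-in message;
    invoking a first preset framework to log in remotely to the migrate-out cluster server, executing on the migrate-out cluster server a query for the table creation statement of the database table named in the migrate-in message, and outputting the table creation statement to a specified directory under a pre-specified file on the migrate-out cluster server;
    invoking a second preset framework to synchronize the table creation statement on the migrate-out cluster server, and performing table formatting and parsing on the synchronized table creation statement to obtain migration information and an executable table creation statement; and
    executing the executable table creation statement to create a corresponding hive table, and migrating the corresponding data files into the hive table based on the migration information.
  2. The data migration method according to claim 1, wherein the migration information comprises partition information and a data file storage location, and the step of migrating the corresponding data files into the hive table based on the migration information comprises:
    determining whether the partition information indicates that no partition exists or that partitions exist; and
    if the partition information indicates that no partition exists, invoking a preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files into the hive table.
  3. The data migration method according to claim 2, wherein, after the step of determining whether the partition information indicates that no partition exists or that partitions exist, the method further comprises:
    if the partition information indicates that partitions exist, invoking the first preset framework to log in remotely to the migrate-out cluster server, executing on the migrate-out cluster server a query for the partition values of the database table, and outputting the partition values to the specified directory under the pre-specified file on the migrate-out cluster server;
    invoking the second preset framework to synchronize the partition values on the migrate-out cluster server, and performing partition formatting and parsing on the synchronized partition values to obtain an executable command for adding partitions;
    executing the command for adding partitions to add the corresponding partitions to the hive table; and
    invoking the preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files into the corresponding partitions of the hive table.
  4. The data migration method according to claim 2, wherein the data migration method further comprises:
    obtaining, through the preset interface, the size of the data files before migration and the size of the data files after migration;
    upon determining that the size of the data files before migration and the size after migration are inconsistent, generating corresponding error information.
  5. The data migration method according to any one of claims 2 to 4, wherein the first preset framework is the expect+ssh framework, the second preset framework is the expect+scp framework, and the preset interface is the distributed copy (distcp) interface.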
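  6. A data migration apparatus, wherein the data migration apparatus comprises a memory, a processor, and a data migration program stored in the memory and executable on the processor, and the data migration program, when executed by the processor, implements the following steps:
    reading a preset configuration file in a loop, reading a migrate-in message from the preset configuration file, and obtaining the migrate-out cluster server named in the migrate-in message;
    invoking a first preset framework to log in remotely to the migrate-out cluster server, executing on the migrate-out cluster server a query for the table creation statement of the database table named in the migrate-in message, and outputting the table creation statement to a specified directory under a pre-specified file on the migrate-out cluster server;
    invoking a second preset framework to synchronize the table creation statement on the migrate-out cluster server, and performing table formatting and parsing on the synchronized table creation statement to obtain migration information and an executable table creation statement; and
    executing the executable table creation statement to create a corresponding hive table, and migrating the corresponding data files into the hive table based on the migration information.
  7. The data migration apparatus according to claim 6, wherein the migration information comprises partition information and a data file storage location, and the data migration program, when executed by the processor, implements the following steps:
    determining whether the partition information indicates that no partition exists or that partitions exist; and
    if the partition information indicates that no partition exists, invoking a preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files into the hive table.
  8. The data migration apparatus according to claim 7, wherein the data migration program, when executed by the processor, implements the following steps:
    if the partition information indicates that partitions exist, invoking the first preset framework to log in remotely to the migrate-out cluster server, executing on the migrate-out cluster server a query for the partition values of the database table, and outputting the partition values to the specified directory under the pre-specified file on the migrate-out cluster server;
    invoking the second preset framework to synchronize the partition values on the migrate-out cluster server, and performing partition formatting and parsing on the synchronized partition values to obtain an executable command for adding partitions;
    executing the command for adding partitions to add the corresponding partitions to the hive table; and
    invoking the preset interface to obtain the data files corresponding to the data file storage location, and migrating the obtained data files into the corresponding partitions of the hive table.
  9. A computer-readable storage medium, wherein a data migration program is stored on the computer-readable storage medium, and the data migration program, when executed by a processor, implements the steps of the data migration method according to any one of claims 1 to 5.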
PCT/CN2020/092128 2019-05-30 2020-05-25 数据迁移方法、装置、及计算机可读存储介质 WO2020238858A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910461350.0 2019-05-30
CN201910461350.0A CN110162517A (zh) 2019-05-30 2019-05-30 数据迁移方法、装置、设备及计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2020238858A1 true WO2020238858A1 (zh) 2020-12-03

Family

ID=67630112

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/092128 WO2020238858A1 (zh) 2019-05-30 2020-05-25 数据迁移方法、装置、及计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN110162517A (zh)
WO (1) WO2020238858A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115277840A (zh) * 2022-03-18 2022-11-01 中国建设银行股份有限公司 一种数据迁移方法、装置、电子设备及计算机可读介质

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110162517A (zh) * 2019-05-30 2019-08-23 深圳前海微众银行股份有限公司 数据迁移方法、装置、设备及计算机可读存储介质
CN110543520B (zh) * 2019-08-30 2022-02-01 京东科技控股股份有限公司 一种数据迁移的方法和装置
CN111241203B (zh) * 2020-02-10 2022-10-04 江苏满运软件科技有限公司 Hive数据仓库同步方法、系统、设备及存储介质
CN111274213B (zh) * 2020-02-13 2022-07-15 苏州浪潮智能科技有限公司 一种分布式文件系统HDFS跨Insight集群实时数据传输方法与系统
CN112597127A (zh) * 2020-12-15 2021-04-02 深圳市汉云科技有限公司 跨集群的访问方法、装置、设备及存储介质
CN115113798B (zh) * 2021-03-17 2024-03-19 中国移动通信集团山东有限公司 一种应用于分布式存储的数据迁移方法、系统及设备
CN113742319A (zh) * 2021-09-14 2021-12-03 中电福富信息科技有限公司 一种基于expect的不同数据库管理系统的数据迁移方法
CN115242538A (zh) * 2022-07-28 2022-10-25 天翼云科技有限公司 一种数据传输方法及装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103354923A (zh) * 2012-02-09 2013-10-16 华为技术有限公司 一种数据重建方法、装置和系统
CN105468473A (zh) * 2014-07-16 2016-04-06 北京奇虎科技有限公司 数据迁移方法及数据迁移装置
US20160191369A1 (en) * 2014-12-26 2016-06-30 Hitachi, Ltd. Monitoring support system, monitoring support method, and recording medium
CN106095940A (zh) * 2016-06-14 2016-11-09 齐鲁工业大学 一种基于任务负载的数据迁移方法
CN107992512A (zh) * 2017-10-20 2018-05-04 中国建设银行股份有限公司上海市分行 一种数据迁移的方法、系统及计算机可读存储介质
CN110162517A (zh) * 2019-05-30 2019-08-23 深圳前海微众银行股份有限公司 数据迁移方法、装置、设备及计算机可读存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115277840A (zh) * 2022-03-18 2022-11-01 中国建设银行股份有限公司 一种数据迁移方法、装置、电子设备及计算机可读介质
CN115277840B (zh) * 2022-03-18 2024-04-23 中国建设银行股份有限公司 一种数据迁移方法、装置、电子设备及计算机可读介质

Also Published As

Publication number Publication date
CN110162517A (zh) 2019-08-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20813776

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20813776

Country of ref document: EP

Kind code of ref document: A1