CN115277840B - Data migration method, device, electronic equipment and computer readable medium - Google Patents

Info

Publication number
CN115277840B
Authority
CN
China
Prior art keywords
data
attached storage
classified
data migration
storage file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210271343.6A
Other languages
Chinese (zh)
Other versions
CN115277840A (en
Inventor
刘宗国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp filed Critical China Construction Bank Corp
Priority to CN202210271343.6A priority Critical patent/CN115277840B/en
Publication of CN115277840A publication Critical patent/CN115277840A/en
Application granted granted Critical
Publication of CN115277840B publication Critical patent/CN115277840B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/104Peer-to-peer [P2P] networks
    • H04L67/1044Group management mechanisms 
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1097Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

The application discloses a data migration method, a device, electronic equipment and a computer readable medium, and relates to the technical field of data transmission. The method comprises: receiving a data migration request, and acquiring the corresponding data to be migrated and a classification identifier; dividing the data to be migrated based on the classification identifier to obtain items of classified data; migrating each item of classified data to a corresponding network attached storage file, and, in response to detecting a data migration end identifier, invoking a server cluster to read the corresponding classified data from the network attached storage file based on the data migration end identifier; and migrating the classified data to a target distributed database. The data migration flow is set up as a batch process in which jobs on different systems are triggered by waiting for files, which adds dependencies between jobs and automates the data migration. Data transcoding and data processing run on the server cluster, and the original IBM mainframe is responsible only for data generation and data transmission, so that data migration efficiency is maximized.

Description

Data migration method, device, electronic equipment and computer readable medium
Technical Field
The present application relates to the field of data transmission technologies, and in particular, to a data migration method, a data migration device, an electronic device, and a computer readable medium.
Background
Currently, more and more banks are migrating the VSAM/QSAM files of their core systems from conventional IBM mainframes to distributed databases. Because banking systems handle fund transactions, the requirements on customer asset security and data accuracy are extremely high. During testing of the distributed transformation, the new and old systems are run in parallel account-following tests, and staged verification ensures that their results are consistent; only then can the new system be put into production and the switchover take place.
In the process of implementing the present application, the inventor finds that at least the following problems exist in the prior art:
The tasks of data transcoding and data processing are performed on the IBM mainframe, which increases the load on the mainframe and degrades the batch execution efficiency of the existing system; the workload of data migration is large, and the timeliness and accuracy of data migration are low.
Disclosure of Invention
In view of this, the embodiments of the present application provide a data migration method, apparatus, electronic device, and computer readable medium, which can solve the existing problems that performing data transcoding and data processing on an IBM mainframe increases the load on the mainframe and degrades the batch execution efficiency of the existing system, that the workload of data migration is large, and that the timeliness and accuracy of data migration are low.
To achieve the above object, according to an aspect of an embodiment of the present application, there is provided a data migration method, including:
receiving a data migration request, and acquiring the corresponding data to be migrated and a classification identifier;
dividing the data to be migrated based on the classification identifier to obtain items of classified data;
migrating each item of classified data to a corresponding network attached storage file, and, in response to detecting a data migration end identifier, invoking a server cluster to read the corresponding classified data from the network attached storage file based on the data migration end identifier;
and migrating the classified data to a target distributed database.
Optionally, before migrating the classification data to the target distributed database, the method further comprises:
The server cluster is invoked to transcode and process the classified data.
Optionally, migrating the classification data to the target distributed database includes:
generating a loading task based on the classified data and the target distributed database;
and when the current time reaches the execution time corresponding to the loading task, executing that loading task to load the corresponding classified data into the target distributed database.
Optionally, migrating the classification data to the target distributed database includes:
naming the classified data by date, and then loading the named classified data into the target distributed database.
Optionally, after migrating each classified data to a corresponding network attached storage file, the method further comprises:
determining the retention time of each item of classified data in the network attached storage file;
and in response to the retention time exceeding a preset threshold, triggering a scheduled cleanup task to automatically clean up the classified data whose retention time exceeds the preset threshold.
Optionally, after migrating the classification data to the target distributed database, the method further comprises:
acquiring the script execution log generated while migrating the classified data to the target distributed database;
and automatically compressing and saving the script execution log.
Optionally, migrating each classified data to a corresponding network attached storage file, including:
creating soft links to a plurality of network attached storage files;
determining, from the plurality of network attached storage files for which soft links have been created, a target network attached storage file corresponding to each item of classified data;
and migrating each item of classified data to the corresponding target network attached storage file.
Optionally, the network attached storage file is mounted to a server cluster.
In addition, the application also provides a data migration device, which comprises:
the receiving unit is configured to receive the data migration request and acquire the corresponding data to be migrated and a classification identifier;
the dividing unit is configured to divide the data to be migrated based on the classification identifier to obtain items of classified data;
the data migration unit is configured to migrate each item of classified data to a corresponding network attached storage file, and, in response to detecting a data migration end identifier, invoke the server cluster to read the corresponding classified data from the network attached storage file based on the data migration end identifier;
and the data migration unit is configured to migrate the classified data to the target distributed database.
Optionally, the apparatus further comprises a data processing unit configured to:
The server cluster is invoked to transcode and process the classified data.
Optionally, the data migration unit is further configured to:
generate a loading task based on the classified data and the target distributed database;
and when the current time reaches the execution time corresponding to the loading task, execute that loading task to load the corresponding classified data into the target distributed database.
Optionally, the data migration unit is further configured to:
name the classified data by date, and then load the named classified data into the target distributed database.
Optionally, the data migration apparatus further comprises a scheduled cleanup unit configured to:
determine the retention time of each item of classified data in the network attached storage file;
and in response to the retention time exceeding a preset threshold, trigger a scheduled cleanup task to automatically clean up the classified data whose retention time exceeds the preset threshold.
Optionally, the data migration apparatus further includes a compression and saving unit configured to:
acquire the script execution log generated while migrating the classified data to the target distributed database;
and automatically compress and save the script execution log.
Optionally, the data migration unit is further configured to:
create soft links to a plurality of network attached storage files;
determine, from the plurality of network attached storage files for which soft links have been created, a target network attached storage file corresponding to each item of classified data;
and migrate each item of classified data to the corresponding target network attached storage file.
Optionally, the network attached storage file is mounted to a server cluster.
In addition, the application also provides a data migration electronic device, which comprises: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data migration method as described above.
In addition, the application also provides a computer readable medium, on which a computer program is stored, which when executed by a processor implements the data migration method as described above.
To achieve the above object, according to still another aspect of an embodiment of the present application, there is provided a computer program product.
The computer program product of the embodiment of the application comprises a computer program, and the data migration method provided by the embodiment of the application is realized when the program is executed by a processor.
One embodiment of the above application has the following advantages or benefits: a data migration request is received, and the corresponding data to be migrated and a classification identifier are acquired; the data to be migrated is divided based on the classification identifier to obtain items of classified data; each item of classified data is migrated to a corresponding network attached storage file, and, in response to detecting a data migration end identifier, a server cluster is invoked to read the corresponding classified data from the network attached storage file based on the data migration end identifier; and the classified data is migrated to a target distributed database. The data migration flow is set up as a batch process in which jobs on different systems are triggered by waiting for files, which adds dependencies between jobs and automates the data migration. Because each network attached storage (NAS) file is mounted to the server cluster, server cluster operation lets the local disks of multiple servers share resources, and files are read from the NAS directory to each server's local disk for processing. Data transcoding and data processing run on the servers in the server cluster, and the original IBM mainframe is responsible only for data generation and data transmission, so that data migration efficiency is maximized and the timeliness and accuracy of data migration are improved.
Further effects of the above optional implementations are described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the application and are not to be construed as unduly limiting the application. Wherein:
FIG. 1 is a schematic diagram of the main flow of a data migration method according to a first embodiment of the present application;
FIG. 2 is a schematic diagram of the main flow of a data migration method according to a second embodiment of the present application;
Fig. 3 is a schematic view of an application scenario of a data migration method according to a third embodiment of the present application;
FIG. 4 is a schematic diagram of the main units of a data migration apparatus according to an embodiment of the present application;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present application may be applied;
Fig. 6 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness. The acquisition, storage, use, and processing of data in the technical solution of the application all comply with the relevant national laws and regulations.
Fig. 1 is a schematic diagram of a main flow of a data migration method according to a first embodiment of the present application, and as shown in fig. 1, the data migration method includes:
step S101, receiving a data migration request, and obtaining corresponding data to be migrated and a classification identifier.
In this embodiment, the execution body (for example, may be a server) of the data migration method may receive the data migration request through a wired connection or a wireless connection. In particular, the data migration request may be a request to migrate data from an IBM mainframe to a distributed database. After receiving the data migration request, the execution body may obtain the data to be migrated corresponding to the request from the VSAM/QSAM file of the IBM mainframe. The data to be migrated may include business handling data of different regions, business handling data of different sub-companies, etc., and the embodiment of the present application does not specifically limit the data to be migrated. The execution body may further obtain a classification identifier in the request after receiving the data migration request, and specifically, the classification identifier may be a regional name or a branch company name. The classification identifier is used for classifying the data to be migrated.
Step S102, dividing the data to be migrated based on the classification identification to obtain each classification data.
After obtaining the classification identifier, the execution body can divide the data to be migrated into classes based on the classification identifier, so that the data to be migrated can then be migrated in batches, one batch per item of classified data.
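The division in step S102 can be sketched as a simple grouping operation. The following Python sketch is illustrative only: the record layout, the `branch` field used as the classification identifier, and the function name are assumptions that do not appear in the application.

```python
from collections import defaultdict


def divide_by_classification(records, key="branch"):
    """Group records by the value of their classification identifier.

    `key` names the (hypothetical) field that carries the identifier,
    for example a region name or a branch name.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


# Hypothetical records standing in for data unloaded from the mainframe.
records = [
    {"branch": "north", "account": "A1"},
    {"branch": "south", "account": "B7"},
    {"branch": "north", "account": "A2"},
]
classified = divide_by_classification(records)
```

Each key of `classified` then corresponds to one item of classified data that can be migrated as its own batch.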
Step S103, each classified data is migrated to the corresponding network-attached storage file, and the server cluster is called in response to the detection of the data migration end identifier, so as to read the corresponding classified data from the network-attached storage file based on the data migration end identifier.
Specifically, in the embodiment of the application, the network attached storage file is mounted to the server cluster.
Mounting is the process by which the operating system makes the files and directories on a storage device (such as a hard disk, CD-ROM, or shared resource) accessible to users through the computer's file system. NAS (Network Attached Storage) refers to a device with a data storage function that is connected to the network; it is data-centric, completely separates storage devices from servers, and manages data centrally.
In the embodiment of the application, the network attached storage file is mounted to the server cluster, so that each server in the server cluster can access the classified data and directories of the network attached storage file.
Specifically, migrating each item of classified data to a corresponding network attached storage file includes: creating soft links to a plurality of network attached storage files; determining, from the plurality of network attached storage files for which soft links have been created, a target network attached storage file corresponding to each item of classified data; and migrating each item of classified data to the corresponding target network attached storage file.
The execution body may store the classified data in a plurality of network attached storage files, with each item of classified data stored in one of them, and may create a soft link for each network attached storage file. A soft link, also called a symbolic link, is a file that contains the pathname of another file. The target may be any file or directory, and a soft link may link to files on a different file system.
For each item of classified data, the execution body determines, according to its classification identifier (e.g., the identifier of the region or branch to which the classified data belongs), the corresponding target network attached storage file from among the plurality of network attached storage files for which soft links have been created. The execution body may then migrate each item of classified data to its target network attached storage file. For example, classified data 1 is migrated to target network attached storage file 1 corresponding to classification identifier 1, classified data 2 is migrated to target network attached storage file 2 corresponding to classification identifier 2, and classified data 3 is migrated to target network attached storage file 3 corresponding to classification identifier 3.
In the embodiment of the application, because the read and write speed of a single network attached storage (NAS) file is limited, multiple NAS storages are used to avoid this speed bottleneck. By creating soft links, the data of different branches or regions is stored in different NAS directories, so that multiple branches or regions can be processed concurrently without affecting one another.
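The routing of classified data to separate NAS directories through soft (symbolic) links can be illustrated as follows. This is a minimal sketch under stated assumptions: temporary directories stand in for real NAS mount points, and the function names, identifiers, and file names are hypothetical. It also assumes a POSIX system on which creating symbolic links is permitted.

```python
import os
import tempfile


def link_nas_directories(link_root, nas_mounts):
    """Create one soft link per classification identifier under link_root."""
    os.makedirs(link_root, exist_ok=True)
    links = {}
    for class_id, mount_path in nas_mounts.items():
        link_path = os.path.join(link_root, class_id)
        if not os.path.islink(link_path):
            os.symlink(mount_path, link_path, target_is_directory=True)
        links[class_id] = link_path
    return links


def migrate_to_nas(links, class_id, file_name, payload):
    """Write one item of classified data through the link for its identifier."""
    target = os.path.join(links[class_id], file_name)
    with open(target, "wb") as handle:
        handle.write(payload)
    return target


# Temporary directories stand in for separately mounted NAS directories.
base = tempfile.mkdtemp()
mounts = {}
for class_id in ("north", "south"):
    mount = os.path.join(base, "nas_" + class_id)
    os.makedirs(mount)
    mounts[class_id] = mount

links = link_nas_directories(os.path.join(base, "links"), mounts)
written = migrate_to_nas(links, "north", "accounts.dat", b"demo")
```

Because every identifier resolves to its own directory, writes for different branches or regions land on different storage targets and do not contend for the same NAS.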
Specifically, after each item of classified data is migrated to the corresponding network attached storage file, the method further comprises: determining the retention time of each item of classified data in the network attached storage file; and, in response to the retention time exceeding a preset threshold, triggering a scheduled cleanup task to automatically clean up the classified data whose retention time exceeds the preset threshold.
By way of example, the server cluster in embodiments of the present application may consist of X86 servers. The execution body can invoke an automatic cleanup mechanism for stored data that uses the crontab scheduled task built into Linux on the X86 servers: the data to be migrated acquired from the IBM mainframe is retained for only N days, and data directories older than N days are cleaned up automatically, which avoids job errors caused by insufficient space. N may be 7, 9, or the like; the embodiment of the present application does not specifically limit N, which may be set according to actual needs.
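The retention rule described above could be implemented by a small script that a crontab entry runs once a day. The sketch below is an assumption-laden illustration rather than the application's script: it assumes data directories are named `YYYYMMDD`, and the function name and parameters are hypothetical.

```python
import os
import shutil
import tempfile
from datetime import date, datetime, timedelta


def clean_expired_directories(root, keep_days, today):
    """Delete date-named subdirectories of root older than keep_days days."""
    removed = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if not os.path.isdir(path):
            continue
        try:
            created = datetime.strptime(name, "%Y%m%d").date()
        except ValueError:
            continue  # not a date-named directory; leave it untouched
        if (today - created).days > keep_days:
            shutil.rmtree(path)
            removed.append(name)
    return removed


# Demo: directories aged 0, 7, and 8 days, with a 7-day retention period.
root = tempfile.mkdtemp()
today = date(2022, 3, 18)
for age in (0, 7, 8):
    day = today - timedelta(days=age)
    os.makedirs(os.path.join(root, day.strftime("%Y%m%d")))

removed = clean_expired_directories(root, keep_days=7, today=today)
```

Passing `today` explicitly keeps the function deterministic; a scheduled job would pass `date.today()`.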
Step S104, the classified data is migrated to the target distributed database.
Specifically, migrating the classified data to the target distributed database includes: naming the classified data by date, and then loading the named classified data into the target distributed database.
In the embodiment of the application, the data storage directory is named by date, which ensures the consistency of the data migration. This avoids the problem that, during dual-system parallel account following, the new system cannot follow accounts in near real time and misses data from the old system, and thereby ensures the integrity of the data migration.
Specifically, after migrating the classification data to the target distributed database, the data migration method further includes: acquiring script execution logs in the process of migrating the classified data to the target distributed database; and automatically compressing and storing the script execution log.
The execution body may also compress and save script execution logs. When the execution body detects that a batch of data has been migrated, it can automatically compress and save the script execution log generated during the migration of that batch, which facilitates subsequent problem analysis and tracking.
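Log compression after each batch can be sketched with Python's standard `gzip` module. The log file name and the choice of gzip are assumptions for illustration; the application does not specify a compression format.

```python
import gzip
import os
import shutil
import tempfile


def compress_log(log_path):
    """Gzip log_path to log_path + '.gz' and remove the uncompressed file."""
    archive_path = log_path + ".gz"
    with open(log_path, "rb") as source, gzip.open(archive_path, "wb") as target:
        shutil.copyfileobj(source, target)
    os.remove(log_path)
    return archive_path


# Demo with a hypothetical log produced by one migration round.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "migration_20220318.log")
with open(log_path, "w", encoding="utf-8") as handle:
    handle.write("load job finished\n")

archive = compress_log(log_path)
```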
In this embodiment, a data migration request is received, and the corresponding data to be migrated and a classification identifier are acquired; the data to be migrated is divided based on the classification identifier to obtain items of classified data; each item of classified data is migrated to a corresponding network attached storage file, and, in response to detecting a data migration end identifier, a server cluster is invoked to read the corresponding classified data from the network attached storage file based on the data migration end identifier; and the classified data is migrated to a target distributed database. The data migration flow is set up as a batch process in which jobs on different systems are triggered by waiting for files, which adds dependencies between jobs and automates the data migration. Because each network attached storage (NAS) file is mounted to the server cluster, server cluster operation lets the local disks of multiple servers share resources, and files are read from the NAS directory to each server's local disk for processing. Data transcoding and data processing run on the servers in the server cluster, and the original IBM mainframe is responsible only for data generation and data transmission, so that data migration efficiency is maximized and the timeliness and accuracy of data migration are improved.
Fig. 2 is a schematic flow chart of a data migration method according to a second embodiment of the present application, and as shown in fig. 2, the data migration method includes:
step S201, receiving a data migration request, and obtaining corresponding data to be migrated and a classification identifier.
Step S202, dividing the data to be migrated based on the classification identification to obtain each classification data.
In step S203, each classified data is migrated to the corresponding network-attached storage file, and in response to detecting the data migration end identifier, the server cluster is invoked to read the corresponding classified data from the network-attached storage file based on the data migration end identifier.
Step S204, calling the server cluster to perform data transcoding and data processing on the classified data.
The server cluster in the embodiment of the application may be a cluster of multiple X86 servers, that is, an X86 server cluster. In the X86 server cluster, scheduling through job management and control makes full use of the resources of each server. The X86 server cluster supports load balancing: each server sets a maximum number of concurrently processed tasks, which avoids uneven use of server resources during peak periods. High concurrency is controllable: the number of concurrent tasks in the cluster can be adjusted at any time according to the load on the X86 servers and the database. The cluster tolerates single-point failures and is highly available: even if one X86 server fails, the data migration flow is not interrupted, the other servers continue processing, and the stability of data migration is ensured. The cluster supports horizontal scaling: servers are simply added to the cluster according to the actual data volume of the production business, which is low in cost and easy to operate, requires no code changes, and does not affect the existing data migration flow. Through server cluster operation, the local disks of multiple servers can share resources, and files are read from the NAS directory to each server's local disk for processing, which maximizes migration efficiency.
In the embodiment of the application, the execution body can distribute the items of classified data to the servers in the server cluster based on a consistent hashing algorithm, so that each server performs data transcoding and data processing on the classified data assigned to it. The data processing includes decompressing and expanding the classified data, transcoding it, and storing the transcoded data in readable form in the distributed database. Data transcoding includes converting the classified data into ASCII character encoding.
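One common way to realize consistent hashing is a hash ring with virtual nodes. The sketch below shows that general technique, not the application's own implementation; the server names, the number of virtual nodes, and the use of SHA-256 are assumptions.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Hash ring with virtual nodes mapping identifiers to servers."""

    def __init__(self, servers, replicas=100):
        # Each server contributes `replicas` points on the ring.
        self._ring = []
        for server in servers:
            for index in range(replicas):
                point = self._hash(f"{server}#{index}")
                self._ring.append((point, server))
        self._ring.sort()
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
        return int(digest, 16)

    def server_for(self, class_id):
        """Return the server owning the first ring point after the key's hash."""
        position = bisect.bisect(self._points, self._hash(class_id))
        return self._ring[position % len(self._ring)][1]


# Hypothetical server names and classification identifiers.
servers = ["x86-01", "x86-02", "x86-03"]
ring = ConsistentHashRing(servers)
class_ids = ("north", "south", "east", "west")
assignment = {cid: ring.server_for(cid) for cid in class_ids}

# Removing one server only reassigns the identifiers that server owned.
smaller_ring = ConsistentHashRing(["x86-01", "x86-02"])
```

The property illustrated at the end is what makes the scheme suitable for a cluster that grows or loses a node: identifiers owned by the surviving servers keep their assignment.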
Step S205, generating a loading task based on the classification data and the target distributed database.
After obtaining the items of classified data, the execution body can determine, for each item, the target distributed database to which it is to be migrated. A distributed database system uses a number of smaller computer systems, each of which may be located in a different place; each computer may hold a full or partial copy of the DBMS and has its own local database. The computers in different places are interconnected through a network and together form a complete, global database that is logically centralized and physically distributed. The distributed database system can be homogeneous or heterogeneous.
The execution body may generate one loading task for each item of classified data and its corresponding target distributed database, so that the number of loading tasks equals the number of items of classified data, and set an execution time for each loading task; each loading task is executed once its execution time is reached. In particular, a loading task may be an asynchronous task that is placed in an asynchronous task pool to wait for execution.
In step S206, when the current time reaches the execution time corresponding to the loading task, the loading task reaching the execution time is executed to load the corresponding classification data into the target distributed database.
When a loading task in the asynchronous task pool reaches its execution time, the execution body may execute that loading task to load the corresponding classified data into the corresponding target distributed database.
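Steps S205 and S206 can be sketched as a pool of tasks ordered by execution time. The sketch is a simplified, single-threaded illustration: the class name, the integer timestamps, and the `load` callback are assumptions, and a production task pool would run the loads asynchronously.

```python
import heapq
import itertools


class LoadTaskPool:
    """Holds loading tasks and releases them once their execution time is due."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal times

    def submit(self, execute_at, class_id, target_db):
        entry = (execute_at, next(self._counter), class_id, target_db)
        heapq.heappush(self._heap, entry)

    def run_due(self, now, load):
        """Run every task whose execution time is at or before `now`."""
        executed = []
        while self._heap and self._heap[0][0] <= now:
            _, _, class_id, target_db = heapq.heappop(self._heap)
            load(class_id, target_db)
            executed.append(class_id)
        return executed


# Demo with hypothetical identifiers, databases, and integer timestamps.
loaded = []
pool = LoadTaskPool()
pool.submit(execute_at=100, class_id="north", target_db="db_north")
pool.submit(execute_at=200, class_id="south", target_db="db_south")

first = pool.run_due(now=150, load=lambda cid, db: loaded.append((cid, db)))
second = pool.run_due(now=250, load=lambda cid, db: loaded.append((cid, db)))
```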
According to the embodiment of the application, through server cluster operation, the local disks of multiple servers can share resources, and files are read from the directory of the network attached storage (NAS) file to each server's local disk for processing, which maximizes migration efficiency.
Fig. 3 is an application scenario diagram of a data migration method according to a third embodiment of the present application. The data migration method of the embodiment of the application can be applied to a scenario of migrating VSAM/QSAM files from a conventional IBM mainframe to a distributed database. For example, the two systems (i.e., the conventional IBM mainframe and the distributed database) may undergo a parallel account-following test: taking the data of a certain day as the starting point, the new system (e.g., the system corresponding to the distributed database) processes the transactions that begin on the next day, and the processing results of the new and old systems are then compared; the comparison may involve online transactions, batch transactions, general ledger accounts, database tables, messages, and the like. Data migration requires loading data from the old system (e.g., the IBM mainframe) into the distributed database of the new system every day, comparing the data, and verifying the accuracy of the new system's transaction processing. Setting the data migration flow up as a batch process adds dependency waits between jobs, supports full-flow account-following tests across platforms, reduces the daily manual operations and human errors of dual-system parallel account following, saves workload, automates the data migration, and ensures the timeliness and accuracy of data migration. In the embodiment of the present application, the X86 architecture is a computer instruction set executed by microprocessors; the name is the standard abbreviation for Intel's series of general-purpose computers and also denotes a general-purpose set of computer instructions.
As shown in fig. 3, after receiving the data migration request, the execution body may acquire the corresponding data to be migrated from the IBM mainframe and migrate it to the NAS network file. Each X86 server in the X86 server cluster can read the file corresponding to the data to be migrated from the NAS network file and perform data transcoding and data processing on its local disk, so as to maximize data migration efficiency. After transcoding and processing the data to be migrated read from the NAS network file, and after the data passes verification, each X86 server in the X86 server cluster migrates the data into the target distributed database, completing the data migration from the conventional IBM mainframe VSAM/QSAM files to the distributed database.
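The transcoding performed on each X86 server can be illustrated for plain character data. The application only states that data is converted to ASCII; the sketch below assumes the mainframe text is encoded in EBCDIC code page 037, which Python exposes as the `cp037` codec. Binary and packed-decimal fields, which real VSAM/QSAM records often contain, are not handled here.

```python
def transcode_to_ascii(raw, source_encoding="cp037"):
    """Decode mainframe text (assumed EBCDIC code page 037) and re-encode as ASCII."""
    text = raw.decode(source_encoding)
    return text.encode("ascii")


# A sample EBCDIC payload produced locally for the demo.
ebcdic_sample = "ACCOUNT 001".encode("cp037")
ascii_bytes = transcode_to_ascii(ebcdic_sample)
```

The actual source code page depends on the mainframe configuration, so `source_encoding` is left as a parameter.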
Specifically, as shown in fig. 3, the data migration method according to the embodiment of the present application may implement data migration automation: by setting the data migration flow to a batch processing mode, a file waiting mode is used among the cross systems to trigger, the dependency relationship among the jobs is increased, and the automation of data migration is realized.
The following functions that can be implemented on the IBM mainframe, the X86 server, the NAS network file, and the X86 server cluster by the data migration method according to the embodiment of the present application are specifically described as follows:
On the IBM mainframe: the mainframe data transfer jobs are scheduled through Control-M. After the full data set has been generated, an end file is sent to the X86 server to indicate that the data has been completely generated and transferred to the NAS storage directory. Control-M is an enterprise-level centralized job scheduling management solution provided by BMC Software, which centrally manages cross-platform, cross-application production control and scheduling processes. Its core functions are: automatically scheduling and submitting related jobs according to business logic; monitoring and analyzing the running state and results of jobs in real time; and automatically carrying out subsequent processing of a job based on its result.
On the X86 server: all shell scripts for the data transcoding, data processing, and data loading jobs are scheduled through job management and control. Once the monitoring process finds that the end file from the mainframe has been received, it starts the execution of the shell scripts. The data storage directories are named by date, which ensures the consistency of the migrated data and avoids the problem that, if the new system (i.e., the system corresponding to the distributed database) misses a day of data from the old system (i.e., the IBM mainframe) during the dual-system parallel run, it can no longer follow the accounts in quasi-real time. An automatic cleaning mechanism for data storage can be implemented: a Crontab timed task on the Linux X86 server retains the data generated by the mainframe for only seven days and automatically cleans data directories older than seven days, so as to avoid insufficient space and job errors. Script execution logs can be compressed and stored: after each round of data migration is completed, the script execution log of that round is automatically compressed and stored, which facilitates subsequent problem analysis and tracking.
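The end-file trigger described above can be sketched as a simple polling loop. This is a hedged illustration rather than the monitoring process of the present application: the end file name `END`, the `YYYYMMDD` directory format, the polling interval, and the function name `wait_for_end_file` are all assumptions.

```python
import time
from datetime import date
from pathlib import Path
from typing import Callable, Optional


def wait_for_end_file(nas_root: Path, business_date: date,
                      end_file_name: str = "END",
                      poll_seconds: float = 5.0,
                      timeout_seconds: float = 3600.0,
                      sleep: Callable[[float], None] = time.sleep) -> Optional[Path]:
    """Poll the date-named directory until the end file appears.

    Returns the data directory once the end file exists, or None on timeout.
    """
    data_dir = nas_root / business_date.strftime("%Y%m%d")
    end_file = data_dir / end_file_name
    waited = 0.0
    while waited <= timeout_seconds:
        if end_file.exists():
            return data_dir  # Data for this date is complete.
        sleep(poll_seconds)
        waited += poll_seconds
    return None
```

A caller would start the transcoding, processing, and loading scripts only when a directory is returned, which gives the file-based dependency between the mainframe job and the X86 jobs.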
NAS network storage: used for storing data. Because the amount of data in the core system of a large bank is huge, data storage is usually divided by branch or region. A soft link, also called a symbolic link, is a file that contains the pathname of another file; the target may be any file or directory, and a soft link can link to files on a different file system. Using multiple NAS storages avoids the throughput bottleneck caused by the read/write speed limit of a single NAS. By creating soft links, the data of different branches or regions is stored in different NAS directories, so that multiple branches can operate concurrently without affecting one another. Capacity expansion is simple and flexible: NAS capacity can be expanded according to the actual data volume of the production service, at low cost and with easy operation. Data sharing can be implemented: each NAS is mounted to the X86 server cluster, so that every server can transcode, process, and load the data of the same branch or region into the database.
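The soft-link layout described above can be sketched as follows. The directory names, the branch identifiers, and the function name `link_branch_directories` are hypothetical; the sketch assumes a POSIX system on which the process is allowed to create symbolic links.

```python
import os
from pathlib import Path
from typing import Dict


def link_branch_directories(link_root: Path,
                            branch_to_nas: Dict[str, Path]) -> Dict[str, Path]:
    """Create one symbolic link per branch pointing at its NAS directory."""
    link_root.mkdir(parents=True, exist_ok=True)
    links = {}
    for branch, nas_dir in branch_to_nas.items():
        link = link_root / branch
        if link.is_symlink():
            link.unlink()  # Re-point an existing link, e.g. after NAS expansion.
        os.symlink(nas_dir, link, target_is_directory=True)
        links[branch] = link
    return links
```

Jobs can then write to a uniform path such as `link_root/branch`, while the data physically lands on whichever NAS the link points to, so moving a branch to another NAS only requires re-pointing its link.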
X86 server cluster: a plurality of X86 servers form a cluster, and the resources of each server are fully used through job management and control scheduling. Load balancing is supported: each server sets a maximum number of concurrently processed tasks, which avoids uneven use of server resources during peak periods. High concurrency is controllable: the number of concurrent tasks in the cluster can be adjusted at any time according to the load on the X86 servers and the database. Single-point faults are tolerated and availability is high: even if one X86 server fails, the data migration flow is not interrupted, because the other servers continue processing, which ensures the stability of data migration. Horizontal scaling is supported: servers only need to be added to the cluster according to the actual data volume of the production service, which is low in cost and easy to operate, requires no code changes, and does not affect the existing data migration flow. Through the operation of the server cluster, the local disks of the servers are used as shared resources: files are read from the NAS directory to the local disk of each server for processing, which maximizes migration efficiency.
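The per-server cap on concurrently processed tasks can be sketched with a bounded thread pool. This is an illustrative stand-in for the job management and control scheduling, which the present application does not detail; the function name `run_with_limit` and the use of threads are assumptions, and the sketch covers a single server only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def run_with_limit(tasks: Iterable[Callable[[], T]], max_concurrent: int) -> List[T]:
    """Run tasks on one server with at most max_concurrent in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        futures = [pool.submit(task) for task in tasks]
        # Collect results in submission order; task exceptions propagate here.
        return [future.result() for future in futures]
```

Raising or lowering `max_concurrent` corresponds to adjusting the task concurrency according to the load on the server and the database.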
Because the maintenance cost of the mainframe is high, the NAS network storage and the X86 servers are brought under cluster management on an open platform and take over the data transcoding and data processing tasks. This shares the workload of data migration, reduces X86 server idle time and single-point faults, makes full use of the resources of each server, and ensures the stability and efficiency of data migration. In addition, NAS network storage and X86 servers are low in cost, simple to deploy and install, and suitable for many usage scenarios, so they can be reused for other requirements after the data migration is completed, which maximizes resource utilization.
According to the embodiment of the present application, the data migration flow is set to a batch processing mode, which reduces the manual operations and human errors of the dual-system parallel account-following test, saves the workload of maintenance personnel, automates data migration, and ensures the timeliness and accuracy of data migration. The NAS network storage and the X86 servers are managed as a cluster, which supports load balancing, is scalable (servers can be added according to actual conditions), keeps high concurrency controllable, reduces server idle time and single-point faults, makes full use of the resources of each server, and ensures the stability and efficiency of data migration.
The embodiment of the present application can automate data migration, reduce human errors, save the workload of maintenance personnel, and ensure the timeliness and accuracy of data migration. The server cluster is scalable and highly available, which ensures the stability and efficiency of data migration and prevents a single-point fault from interrupting the data migration flow.
Fig. 4 is a schematic diagram of the main units of a data migration apparatus according to an embodiment of the present application. As shown in fig. 4, the data migration apparatus 400 includes a receiving unit 401, a dividing unit 402, a data migrating unit 403, and a data migration unit 404.
The receiving unit 401 is configured to receive a data migration request, and obtain corresponding data to be migrated and a classification identifier.
The dividing unit 402 is configured to divide the data to be migrated based on the classification identification to obtain each classification data.
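The division performed by the dividing unit can be sketched as a grouping operation. The record shape, the `branch` field used as the classification identifier, and the function name `divide_by_classification` are hypothetical; the present application does not specify what the classification identifier looks like.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, TypeVar

Record = TypeVar("Record")


def divide_by_classification(
        records: Iterable[Record],
        classification_of: Callable[[Record], str]) -> Dict[str, List[Record]]:
    """Group records by their classification identifier, preserving order."""
    groups: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        groups[classification_of(record)].append(record)
    return dict(groups)


# Hypothetical records classified by branch.
to_migrate = [
    {"branch": "north", "account": "A1"},
    {"branch": "south", "account": "B1"},
    {"branch": "north", "account": "A2"},
]
classified = divide_by_classification(to_migrate, lambda r: r["branch"])
```

Each resulting group can then be migrated to its own network attached storage directory.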
The data migrating unit 403 is configured to migrate each classified data to the corresponding network-attached storage file, and in response to detecting the data migration end identifier, invoke the server cluster to read the corresponding classified data from the network-attached storage file based on the data migration end identifier.
The data migration unit 404 is configured to migrate the classified data to the target distributed database.
In some embodiments, the apparatus further comprises a data processing unit, not shown in fig. 4, configured to: the server cluster is invoked to transcode and process the classified data.
In some embodiments, data migration unit 404 is further configured to: generating a loading task based on the classification data and the target distributed database; and when the current time reaches the execution time corresponding to the loading task, executing the loading task reaching the execution time to load the corresponding classified data into the target distributed database.
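The time-triggered loading tasks can be sketched with a priority queue ordered by execution time. The class `LoadTask`, its fields, and the function `pop_due_tasks` are hypothetical names introduced for illustration; how the loading task is actually represented is not stated in the present application.

```python
import heapq
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass(order=True)
class LoadTask:
    execute_at: datetime                      # Only field used for ordering.
    classification: str = field(compare=False)
    target_table: str = field(compare=False)


def pop_due_tasks(queue: List[LoadTask], now: datetime) -> List[LoadTask]:
    """Remove and return every task whose execution time has been reached."""
    due = []
    while queue and queue[0].execute_at <= now:
        due.append(heapq.heappop(queue))
    return due


# Hypothetical tasks for two classifications.
task_queue: List[LoadTask] = []
heapq.heappush(task_queue, LoadTask(datetime(2022, 3, 18, 2, 0), "south", "t_south"))
heapq.heappush(task_queue, LoadTask(datetime(2022, 3, 18, 1, 0), "north", "t_north"))
due_now = pop_due_tasks(task_queue, now=datetime(2022, 3, 18, 1, 30))
```

A scheduler loop would call `pop_due_tasks` periodically and load the classified data of each returned task into the target distributed database.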
In some embodiments, data migration unit 404 is further configured to: naming the classified data based on a date mode, and then loading the named classified data into a target distributed database.
In some embodiments, the apparatus further comprises a timing cleaning unit, not shown in fig. 4, configured to: determining the stay time of each classified data in the network attached storage file; and triggering and executing the timing cleaning task in response to the stay time exceeding the preset threshold so as to automatically clean the classified data corresponding to the stay time exceeding the preset threshold.
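The timed cleaning task can be sketched as follows, assuming (as in the scenario of fig. 3) that data directories are named by date and retained for seven days. The `YYYYMMDD` naming format and the function name `clean_expired_directories` are assumptions; in practice such a function would be invoked from a scheduled job.

```python
import shutil
from datetime import date, datetime, timedelta
from pathlib import Path
from typing import List


def clean_expired_directories(data_root: Path, today: date,
                              retention_days: int = 7) -> List[str]:
    """Delete YYYYMMDD-named directories older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    removed = []
    for entry in sorted(data_root.iterdir()):
        if not entry.is_dir() or len(entry.name) != 8:
            continue
        try:
            dir_date = datetime.strptime(entry.name, "%Y%m%d").date()
        except ValueError:
            continue  # Skip directories that are not date-named.
        if dir_date < cutoff:
            shutil.rmtree(entry)
            removed.append(entry.name)
    return removed
```

Directories whose names do not parse as dates are left untouched, so unrelated content under the same root is not deleted by mistake.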
In some embodiments, the data migration apparatus further comprises a compression holding unit, not shown in fig. 4, configured to: acquiring script execution logs in the process of migrating the classified data to the target distributed database; and automatically compressing and storing the script execution log.
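Compressing and storing a script execution log can be sketched with gzip. The choice of gzip, the `.gz` suffix, and the function name `compress_log` are assumptions; the present application does not name a compression format.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(log_path: Path) -> Path:
    """Gzip a script execution log and remove the uncompressed original."""
    compressed = log_path.with_name(log_path.name + ".gz")
    with log_path.open("rb") as src, gzip.open(compressed, "wb") as dst:
        shutil.copyfileobj(src, dst)
    log_path.unlink()
    return compressed
```

Calling this once at the end of each migration round keeps one compressed log per round available for later problem analysis.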
In some embodiments, the data migrating unit 403 is further configured to: creating soft links to a plurality of network attached storage files; determining a target network attached storage file corresponding to each classified data from the plurality of network attached storage files for which soft links have been created; and migrating each classified data to the corresponding target network attached storage file.
In some embodiments, the network attached storage file is mounted to a server cluster.
It should be noted that, in the data migration method and the data migration apparatus of the present application, there is a corresponding relationship in the implementation content, so the description of the repeated content is not repeated.
Fig. 5 illustrates an exemplary system architecture 500 in which a data migration method or data migration apparatus of embodiments of the present application may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 is used as a medium to provide communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the server 505 via the network 504 using the terminal devices 501, 502, 503 to receive or send messages or the like. Various communication client applications may be installed on the terminal devices 501, 502, 503, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 501, 502, 503 may be a variety of electronic devices having a data migration processing screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 505 may be a server providing various services, such as a background management server (by way of example only) providing support for data migration requests submitted by users using the terminal devices 501, 502, 503. The background management server can receive the data migration request and acquire the corresponding data to be migrated and classification identifier; divide the data to be migrated based on the classification identifier to obtain each classified data; migrate each classified data to the corresponding network attached storage file, and, in response to detecting a data migration end identifier, invoke a server cluster to read the corresponding classified data from the network attached storage file based on the data migration end identifier; and migrate the classified data to a target distributed database. The data migration flow is set to a batch processing mode, jobs on different systems are triggered by waiting for a file, and dependency relationships are added between jobs, so that data migration is automated. Each network attached storage (NAS) is mounted to the server cluster; through the operation of the server cluster, the local disks of the servers are used as shared resources, and files are read from the NAS directory to the local disk of each server for processing. Data transcoding and data processing run on the servers in the server cluster, and the original IBM mainframe is responsible only for data generation and data transmission, which maximizes data migration efficiency and improves the timeliness and accuracy of data migration.
It should be noted that, the data migration method provided in the embodiment of the present application is generally executed by the server 505, and accordingly, the data migration device is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 6, there is illustrated a schematic diagram of a computer system 600 suitable for use in implementing an embodiment of the present application. The terminal device shown in fig. 6 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM603, various programs and data required for the operation of the computer system 600 are also stored. The CPU601, ROM602, and RAM603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. The above-described functions defined in the system of the present application are performed when the computer program is executed by a Central Processing Unit (CPU) 601.
The computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including a receiving unit, a dividing unit, a data migrating unit, and a data migration unit. The names of the units do not, in some cases, constitute a limitation of the units themselves.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to: receive a data migration request and acquire the corresponding data to be migrated and classification identifier; divide the data to be migrated based on the classification identifier to obtain each classified data; migrate each classified data to the corresponding network attached storage file, and, in response to detecting a data migration end identifier, invoke a server cluster to read the corresponding classified data from the network attached storage file based on the data migration end identifier; and migrate the classified data to a target distributed database.
The computer program product of the present application comprises a computer program which, when executed by a processor, implements the data migration method of the embodiments of the present application.
According to the technical solution of the embodiment of the present application, the data migration flow is set to a batch processing mode, jobs on different systems are triggered by waiting for a file, and dependency relationships are added between jobs, so that data migration is automated. Each network attached storage (NAS) is mounted to the server cluster; through the operation of the server cluster, the local disks of the servers are used as shared resources, and files are read from the NAS directory to the local disk of each server for processing. Data transcoding and data processing run on the servers in the server cluster, and the original IBM mainframe is responsible only for data generation and data transmission, which maximizes data migration efficiency and improves the timeliness and accuracy of data migration.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (12)

1. A method of data migration, comprising:
receiving a data migration request, and acquiring corresponding data to be migrated and a classification identifier;
dividing the data to be migrated based on the classification identifier to obtain various classification data;
respectively migrating each classified data to a corresponding network attached storage file, and in response to detecting a data migration end identifier, invoking a server cluster to read the corresponding classified data from the network attached storage file based on the data migration end identifier, wherein the network attached storage file is mounted to the server cluster, and the respectively migrating each classified data to the corresponding network attached storage file comprises: creating soft links to a plurality of network attached storage files; determining a target network attached storage file corresponding to each classified data from the plurality of network attached storage files for which soft links have been created; and migrating each classified data to the corresponding target network attached storage file;
invoking the server cluster to perform data transcoding and data processing on the classified data;
and migrating the classified data to a target distributed database.
2. The method of claim 1, wherein the migrating the classification data to a target distributed database comprises:
generating a loading task based on the classification data and a target distributed database;
and when the current time reaches the execution time corresponding to the loading task, executing the loading task reaching the execution time so as to load the corresponding classified data into the target distributed database.
3. The method of claim 1, wherein the migrating the classification data to a target distributed database comprises:
naming the classified data based on a date mode, and loading the named classified data to the target distributed database.
4. The method of claim 1, wherein after said migrating each of said classified data to a corresponding network attached storage file, respectively, the method further comprises:
determining the stay time of each classified data in the network attached storage file;
and triggering and executing a timing cleaning task in response to the stay time exceeding a preset threshold so as to automatically clean the classified data corresponding to the stay time exceeding the preset threshold.
5. The method of claim 1, wherein after said migrating the classification data to a target distributed database, the method further comprises:
acquiring script execution logs in the process of migrating the classified data to a target distributed database;
and automatically compressing and storing the script execution log.
6. A data migration apparatus, comprising:
the receiving unit is configured to receive the data migration request and acquire corresponding data to be migrated and classification identifiers;
the dividing unit is configured to divide the data to be migrated based on the classification identifier so as to obtain various classification data;
a data migrating unit configured to migrate each of the classified data to a corresponding network attached storage file, and in response to detecting a data migration end identifier, invoke a server cluster to read the corresponding classified data from the network attached storage file based on the data migration end identifier, wherein the network attached storage file is mounted to the server cluster, and the migrating each of the classified data to the corresponding network attached storage file includes: creating soft links to a plurality of network attached storage files; determining a target network attached storage file corresponding to each classified data from the plurality of network attached storage files for which soft links have been created; and migrating each classified data to the corresponding target network attached storage file;
the data processing unit is configured to invoke the server cluster to perform data transcoding and data processing on the classified data;
and the data migration unit is configured to migrate the classified data to a target distributed database.
7. The apparatus of claim 6, wherein the data migration unit is further configured to:
generating a loading task based on the classification data and a target distributed database;
and when the current time reaches the execution time corresponding to the loading task, executing the loading task reaching the execution time so as to load the corresponding classified data into the target distributed database.
8. The apparatus of claim 6, wherein the data migration unit is further configured to:
naming the classified data based on a date mode, and loading the named classified data to the target distributed database.
9. The apparatus of claim 6, further comprising a timing cleaning unit configured to:
determining the stay time of each classified data in the network attached storage file;
and triggering and executing a timing cleaning task in response to the stay time exceeding a preset threshold so as to automatically clean the classified data corresponding to the stay time exceeding the preset threshold.
10. A data migration electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-5.
11. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-5.
12. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-5.
CN202210271343.6A 2022-03-18 2022-03-18 Data migration method, device, electronic equipment and computer readable medium Active CN115277840B (en)


Publications (2)

Publication Number Publication Date
CN115277840A CN115277840A (en) 2022-11-01
CN115277840B true CN115277840B (en) 2024-04-23

Family

ID=83758469


Country Status (1)

Country Link
CN (1) CN115277840B (en)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant