CN116339822A - Method for simplifying migration of data docking task - Google Patents

Publication number
CN116339822A
CN116339822A CN202310609299.XA CN202310609299A CN116339822A CN 116339822 A CN116339822 A CN 116339822A CN 202310609299 A CN202310609299 A CN 202310609299A CN 116339822 A CN116339822 A CN 116339822A
Authority
CN
China
Prior art keywords
data
task
migration
visualization
docking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310609299.XA
Other languages
Chinese (zh)
Inventor
王聪明
王三明
胡小敏
李成坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qiye Cloud Big Data Nanjing Co ltd
Anyuan Technology Co ltd
Original Assignee
Qiye Cloud Big Data Nanjing Co ltd
Anyuan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2023-05-29
Filing date
2023-05-29
Publication date
2023-06-27
Application filed by Qiye Cloud Big Data Nanjing Co ltd, Anyuan Technology Co ltd
Priority to CN202310609299.XA
Publication of CN116339822A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/76 Adapting program code to run in a different environment; Porting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/34 Graphical or visual programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/71 Version control; Configuration management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/18 Multiprotocol handlers, e.g. single devices capable of handling multiple protocols
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a method for simplifying data docking task migration and relates to the field of data migration. The method comprises a task configuration description module, a task migration correction module, and a task visualization and debugging module. The task visualization and debugging module is used to deploy the docking system and to configure and debug the related component modules; after debugging in this module, the migration data is described and corrected, so that task migration is fast. With this method, a standard-version migration task can automatically adapt each component to reach a runnable base version; through plug-ins, visualization and similar means, non-specialist staff can readily check whether a migration is correct; and when the overall scheme changes, all migration tasks can be updated in batches.

Description

Method for simplifying migration of data docking task
Technical Field
The invention relates to the technical field of data migration, and in particular to a method for simplifying data docking task migration.
Background
Data synchronization is needed in many scenarios: within a company, between companies, and between companies and government bodies. These scenarios involve adapting multiple transport protocols across different data sources, and efficiency becomes a problem when many similar tasks have to be set up in batches.
Typically, the synchronization tasks are described in Excel or a similar product, the main development is then done with scripts, and at migration time the scripts are adjusted to the special cases of the current scenario.
However, this migration approach has the following disadvantages:
1. the debugging process depends on script logs;
2. special-case adjustments during migration require the participation of research and development staff, so the degree of portability is low;
3. adaptation to multiple sources and multiple transport protocols is heavily customized;
4. the overall migration cost is high.
We therefore propose a method to simplify data docking task migration.
Disclosure of Invention
(I) Technical problems to be solved
In view of the shortcomings of the prior art, the invention provides a method for simplifying data docking task migration. It addresses the problems that the debugging process depends on script logs, that special-case adjustments during migration require research and development participation so that portability is low, and that adaptation to multiple sources and multiple transport protocols is heavily customized.
(II) Technical solution
To achieve the above purpose, the invention is realized by the following technical solution: a method for simplifying data docking task migration, comprising a task configuration description module, a task migration correction module, and a task visualization and debugging module;
the task visualization and debugging module is used to deploy the docking system and to configure and debug the related component modules; after debugging in this module, the migration data is described and corrected, so that task migration is fast.
Preferably, the task configuration description module comprises a data source description, an ETL task description, a gateway protocol description and a task scheduling description, wherein:
the data source description is a structured description of the connections to the various types of data source;
the ETL task description is a structured description of the data processing built from operators such as grouping and filtering applied to the tables of each data source;
the gateway protocol description is a structured description of the docking interface;
the task scheduling description is a structured description of the scheduling time of each task and its dependencies.
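The four descriptions above could be held in one exportable file. The following is a minimal sketch of such a file, assuming JSON as the format and using hypothetical class and field names (`DataSource`, `EtlTask`, `Gateway`, `Schedule`, `cron`, `depends_on`); the patent does not specify the file format or its fields.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DataSource:
    name: str
    kind: str   # hypothetical: e.g. "mysql" or "kafka"
    url: str    # hypothetical: connection string


@dataclass
class EtlTask:
    name: str
    source: str          # name of a DataSource
    input_tables: list   # table names read by the task
    operators: list      # e.g. [{"op": "filter", "expr": "amount > 0"}]


@dataclass
class Gateway:
    name: str
    endpoint: str
    headers: dict = field(default_factory=dict)


@dataclass
class Schedule:
    task: str
    cron: str                                       # scheduling time
    depends_on: list = field(default_factory=list)  # upstream task names


@dataclass
class DockingDescription:
    data_sources: list
    etl_tasks: list
    gateways: list
    schedules: list


def export_description(desc: DockingDescription) -> str:
    """Serialize the whole task plan to a JSON string."""
    return json.dumps(asdict(desc), indent=2, sort_keys=True)


def import_description(text: str) -> DockingDescription:
    """Rebuild the task plan from a previously exported JSON string."""
    raw = json.loads(text)
    return DockingDescription(
        data_sources=[DataSource(**d) for d in raw["data_sources"]],
        etl_tasks=[EtlTask(**t) for t in raw["etl_tasks"]],
        gateways=[Gateway(**g) for g in raw["gateways"]],
        schedules=[Schedule(**s) for s in raw["schedules"]],
    )


example = DockingDescription(
    data_sources=[DataSource("orders_db", "mysql", "mysql://db.example.invalid:3306/orders")],
    etl_tasks=[EtlTask("daily_orders", "orders_db", ["orders"],
                       [{"op": "group_by", "keys": ["region"]}])],
    gateways=[Gateway("report_api", "http://report.example.invalid/upload",
                      {"Accept": "application/json"})],
    schedules=[Schedule("daily_orders", "0 2 * * *")],
)
```

Exporting at one site and importing at another is then a round trip through `export_description` and `import_description`.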
Preferably, the data sources include MySQL, Oracle, Hive, PostgreSQL, MongoDB, API sources and Kafka sources;
the docking interface description covers headers, authorization, proxy and parameter changes.
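To make the four parts of the docking interface concrete, the sketch below turns a gateway description into an HTTP request object without sending it. The dictionary keys (`url`, `headers`, `authorization`, `proxy`, `param_map`) and the bearer/basic authorization types are assumptions made for illustration; the patent only names the four categories.

```python
import base64
import json
import urllib.request


def build_request(gateway: dict, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request from a gateway description."""
    # Parameter changes: rename payload keys according to the mapping.
    param_map = gateway.get("param_map", {})
    body = {param_map.get(key, key): value for key, value in payload.items()}

    # Headers: start from the static headers in the description.
    headers = dict(gateway.get("headers", {}))
    headers.setdefault("Content-Type", "application/json")

    # Authorization: two common schemes, chosen here as assumptions.
    auth = gateway.get("authorization", {})
    if auth.get("type") == "bearer":
        headers["Authorization"] = "Bearer " + auth["token"]
    elif auth.get("type") == "basic":
        credentials = f'{auth["user"]}:{auth["password"]}'.encode()
        headers["Authorization"] = "Basic " + base64.b64encode(credentials).decode()

    request = urllib.request.Request(
        gateway["url"],
        data=json.dumps(body).encode(),
        headers=headers,
        method="POST",
    )

    # Proxy: route the request through the configured HTTP proxy, if any.
    if gateway.get("proxy"):
        request.set_proxy(gateway["proxy"], "http")
    return request


gateway = {
    "url": "http://report.example.invalid/api/upload",
    "headers": {"X-Tenant": "company-a"},
    "authorization": {"type": "bearer", "token": "demo-token"},
    "proxy": "proxy.example.invalid:8080",
    "param_map": {"company_id": "orgCode"},
}
request = build_request(gateway, {"company_id": "C001", "amount": 12})
```

Because the site-specific parts live in the description rather than in code, a migrated task only needs its gateway entry edited when the receiving side uses a different authorization method or parameter naming.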
Preferably, the task migration correction includes variable replacement, metadata check and missing-item completion.
Preferably, the variable replacement: in the description file, some variables are replaced with the new values that apply after migration;
the metadata check: after migration, the tables required by the ETL tasks are compared against the corresponding table metadata in the new data source;
the missing-item completion: following the idea of productized migration, some migrated items need new values, and these are filled in by rules.
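Variable replacement can be sketched as a recursive walk over the description that substitutes placeholders with site-specific values. The `${NAME}` placeholder syntax and the variable names below are assumptions; the patent does not say how variables are marked in the description file.

```python
from string import Template


def replace_variables(node, variables: dict):
    """Return a copy of the description with ${NAME} placeholders filled in.

    Placeholders without a value are left untouched so that they can be
    spotted and completed later.
    """
    if isinstance(node, str):
        return Template(node).safe_substitute(variables)
    if isinstance(node, dict):
        return {key: replace_variables(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [replace_variables(item, variables) for item in node]
    return node  # numbers, booleans, None


description = {
    "data_sources": [
        {"name": "orders_db", "url": "mysql://${DB_HOST}:3306/${DB_NAME}"},
    ],
    "gateways": [{"endpoint": "http://${REPORT_HOST}/upload", "retries": 3}],
}
site_values = {"DB_HOST": "10.0.0.5", "DB_NAME": "sales"}
migrated = replace_variables(description, site_values)
```

Here `REPORT_HOST` has no value at the new site, so its placeholder survives in `migrated` and remains visible for a later correction step.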
Preferably, the task visualization and debugging module comprises data source visualization, ETL visualization, gateway visualization, task scheduling visualization and task debugging.
Preferably, the data source visualization: a data source management interface is provided for viewing the data sources of the current system and their detailed configuration;
the ETL visualization: an ETL task management interface is provided for viewing the ETL tasks of the current system and their detailed configuration;
the gateway visualization: a gateway management interface is provided for viewing the module plug-ins and protocol details of the current system;
the task scheduling visualization: a task scheduling management interface is provided for viewing the tasks of the current system and the details of their scheduling history;
the task debugging: data and configuration are adjusted within the data docking task.
ETL comprises extraction, transformation and loading steps: the ETL tool processes data sources, text files and other files to generate temporary data, then collates the data and transfers it to the target database.
ETL is responsible for extracting data from distributed, heterogeneous data sources, such as relational data and flat data files, into a temporary intermediate layer, then cleaning, transforming and integrating it, and finally loading it into a data warehouse or data mart, where it forms the basis for online analytical processing and data mining.
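The extract, transform and load steps described above can be illustrated with a very small pipeline. This is only a sketch under stated assumptions: the flat file is an in-memory CSV string, the target is an in-memory SQLite database, and the table and column names are invented for the example.

```python
import csv
import io
import sqlite3

# A flat data file standing in for one heterogeneous source.
RAW = """company,amount
A,10
B,
A,5
"""


def run_etl(raw_text: str, conn: sqlite3.Connection) -> dict:
    # Extract: read the flat file into a temporary staging list.
    staging = list(csv.DictReader(io.StringIO(raw_text)))

    # Transform: drop rows with no amount (cleaning), convert the type,
    # and group by company (integration).
    totals = {}
    for row in staging:
        if not row["amount"]:
            continue
        totals[row["company"]] = totals.get(row["company"], 0) + int(row["amount"])

    # Load: write the result into the target table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS company_totals "
        "(company TEXT PRIMARY KEY, total INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO company_totals VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()
    return totals


target = sqlite3.connect(":memory:")
run_etl(RAW, target)
loaded = target.execute(
    "SELECT company, total FROM company_totals ORDER BY company"
).fetchall()
```

The row for company B is discarded during cleaning because its amount is empty, so only company A reaches the target table.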
The invention discloses a method for simplifying migration of data docking tasks, which has the following beneficial effects:
1. In this method for simplifying data docking task migration, a structured description file is designed for the data docking task, and the adaptation rules of the current scenario are taken into account during migration. The migration description file is modified automatically and an executable, viewable migration task is created, which reduces the dependence of migration on research and development, enables plug-in functionality, and improves migration efficiency. A standard-version migration task can automatically adapt each component to reach a runnable base version; through plug-ins, visualization and similar means, non-specialist staff can readily check whether a migration is correct; and when the overall scheme changes, all migration tasks can be updated in batches.
2. In this method for simplifying data docking task migration, the migrated data docking task needs to be debugged. By previewing the current ETL data preparation, staff can see whether the data meets expectations and adjust it promptly when it does not. By trial-running the docking task, they can check in the gateway history whether the gateway executed correctly, and adjust the configuration after inspecting the failure details.
Drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below are obviously only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic diagram of the task visualization and debugging module of the present invention;
FIG. 3 is a schematic diagram of the task configuration description module of the present invention;
FIG. 4 is a schematic diagram of the task migration correction module of the present invention;
FIG. 5 is a schematic diagram of the ETL tool of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below. The described embodiments are obviously some, but not all, of the embodiments of the invention. All other embodiments obtained by a person skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
The method for simplifying data docking task migration solves the problems that the debugging process depends on script logs, that special-case adjustments during migration require research and development participation so that portability is low, and that adaptation to multiple sources and multiple transport protocols is heavily customized. A standard-version migration task can automatically adapt each component to reach a runnable base version; through plug-ins, visualization and similar means, non-specialist staff can readily check whether a migration is correct; and when the overall scheme changes, all migration tasks can be updated in batches.
In order to better understand the above technical solutions, the following detailed description will refer to the accompanying drawings and specific embodiments.
The embodiment of the invention discloses a method for simplifying migration of data docking tasks.
As shown in FIGS. 1-5, the method comprises a task configuration description module, a task migration correction module, and a task visualization and debugging module;
the task visualization and debugging module is used to deploy the docking system and to configure and debug the related component modules; after debugging in this module, the migration data is described and corrected, so that task migration is fast.
Furthermore, by designing a structured description file for the data docking task and taking the adaptation rules of the current scenario into account during migration, the migration description file is modified automatically and an executable, viewable migration task is created. This reduces the dependence of migration on research and development, enables plug-in functionality, and improves migration efficiency.
For example, a standard-version migration task can automatically adapt each component to reach a runnable base version; through plug-ins, visualization and similar means, non-specialist staff can readily check whether a migration is correct; and when the overall scheme changes, all migration tasks can be updated in batches.
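Because every site's task is held as a structured description, a change to the overall scheme can be expressed once and applied to all of them. The sketch below assumes descriptions are plain dictionaries keyed by company and that the scheme change is a move from a hypothetical `/v1/` reporting endpoint to `/v2/`; neither detail comes from the patent.

```python
import copy


def batch_update(descriptions: dict, change) -> dict:
    """Apply the same change to a copy of every company's description."""
    updated = {}
    for company, description in descriptions.items():
        new_description = copy.deepcopy(description)  # keep the originals intact
        change(new_description)
        updated[company] = new_description
    return updated


def switch_report_endpoint(description: dict) -> None:
    """Hypothetical scheme change: the reporting interface moves to /v2/."""
    for gateway in description["gateways"]:
        gateway["endpoint"] = gateway["endpoint"].replace("/v1/", "/v2/")


descriptions = {
    "company_a": {"gateways": [{"endpoint": "http://report.example.invalid/v1/upload"}]},
    "company_b": {"gateways": [{"endpoint": "http://report.example.invalid/v1/upload"}]},
}
updated = batch_update(descriptions, switch_report_endpoint)
```

Working on copies means the previous descriptions remain available if the batch change has to be rolled back.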
A typical case is a batch of data docking tasks of the same type, for example a task that has to be completed at N companies in a province. The docking system is first deployed at one company, and the related component modules are configured and successfully debugged in the task visualization and debugging module. That company's task plan can then be exported to a file.
The file therefore contains a data source description, an ETL task description, a gateway protocol description and a task scheduling description. When the file is migrated to another company, nothing has to be configured from scratch: the file is simply imported into the system, which performs the task migration correction. When the data docking tasks are highly similar, the file can be used with essentially no modification. In practice, however, each company's business is customized: some business field descriptions are not fully consistent, the reporting protocols use different authorization methods, and the reporting interface protocols differ as well. Business staff therefore debug the migrated task in the task visualization and debugging module and make targeted, customized adjustments.
Such customization work is necessary and is acceptable to business staff. In this way the data docking tasks for all N companies can be developed quickly, saving a great deal of work.
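Inconsistent field descriptions between companies are the kind of mismatch a metadata check can surface before manual debugging begins. The following is a minimal sketch, assuming the new data source is an SQLite database (a stand-in chosen only to keep the example self-contained) and that the ETL task's requirements are given as a hypothetical mapping from table name to column names.

```python
import sqlite3


def check_metadata(conn: sqlite3.Connection, required: dict) -> list:
    """Compare the tables and columns an ETL task needs with what exists.

    `required` maps table name -> list of column names.
    Returns a list of human-readable problems (empty when all is well).
    """
    problems = []
    for table, columns in required.items():
        # PRAGMA table_info yields one row per column; no rows means no table.
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        if not info:
            problems.append(f"missing table: {table}")
            continue
        present = {row[1] for row in info}  # row[1] is the column name
        for column in columns:
            if column not in present:
                problems.append(f"missing column: {table}.{column}")
    return problems


new_site = sqlite3.connect(":memory:")
new_site.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

problems = check_metadata(
    new_site,
    {"orders": ["id", "amount", "region"], "customers": ["id"]},
)
```

Each reported problem points business staff at a specific table or field to adjust, rather than leaving them to discover it from a failed run.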
Preferably, the task configuration description module comprises a data source description, an ETL task description, a gateway protocol description and a task scheduling description, wherein:
the data source description is a structured description of the connections to the various types of data source;
the ETL task description is a structured description of the data processing built from operators such as grouping and filtering applied to the tables of each data source;
the gateway protocol description is a structured description of the docking interface;
the task scheduling description is a structured description of the scheduling time of each task and its dependencies.
Preferably, the data sources include MySQL, Oracle, Hive, PostgreSQL, MongoDB, API sources and Kafka sources;
the docking interface description covers headers, authorization, proxy and parameter changes.
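A task scheduling description that records dependencies implies an execution order. As a sketch, assuming each schedule entry carries hypothetical `task` and `depends_on` fields, Python's standard `graphlib` module can derive an order in which every task runs after the tasks it depends on.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(schedules: list) -> list:
    """Return task names ordered so that dependencies come first.

    Raises graphlib.CycleError if the dependencies form a loop.
    """
    graph = {entry["task"]: set(entry.get("depends_on", [])) for entry in schedules}
    return list(TopologicalSorter(graph).static_order())


schedules = [
    {"task": "report_upload", "cron": "0 3 * * *", "depends_on": ["aggregate_orders"]},
    {"task": "aggregate_orders", "cron": "0 2 * * *", "depends_on": ["extract_orders"]},
    {"task": "extract_orders", "cron": "0 1 * * *"},
]
order = execution_order(schedules)

# A description with a dependency loop is rejected rather than scheduled.
try:
    execution_order([
        {"task": "a", "depends_on": ["b"]},
        {"task": "b", "depends_on": ["a"]},
    ])
    loop_detected = False
except CycleError:
    loop_detected = True
```

Checking for loops at import time means a malformed description is caught before any task is scheduled at the new site.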
Preferably, the task migration correction includes variable replacement, metadata check and missing-item completion.
Preferably, the variable replacement: in the description file, some variables are replaced with the new values that apply after migration;
the metadata check: after migration, the tables required by the ETL tasks are compared against the corresponding table metadata in the new data source;
the missing-item completion: following the idea of productized migration, some migrated items need new values, and these are filled in by rules.
For example:
Rule 1: the ETL description file records only the name of the data source. After migration, the system checks whether a data source of the same name exists in the current system and, if so, reuses it. If no such name exists and the name is judged to refer to a production data source, the default data source is used in its place.
Rule 2: the ETL description file records only the names of the ETL input node tables. After migration, the system checks whether a table of the same name exists in the data source database of the current system and, if so, uses it directly; the English names are not required to be consistent.
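The two rules can be sketched as small lookup functions. How a name is "judged" to be a production data source is not specified in the text, so the sketch assumes a known set of production names; the function names, arguments and the behaviour when neither rule applies (raising an error) are likewise assumptions.

```python
def resolve_data_source(name: str, existing: set, production_names: set, default: str) -> str:
    """Rule 1: reuse a same-named data source, else fall back to the default."""
    if name in existing:
        return name      # same name exists at the new site: reuse it
    if name in production_names:
        return default   # production source missing here: use the default source
    raise LookupError(f"data source {name!r} needs manual configuration")


def resolve_tables(input_tables: list, existing_tables: set) -> tuple:
    """Rule 2: use same-named tables directly and report the rest."""
    found = [table for table in input_tables if table in existing_tables]
    missing = [table for table in input_tables if table not in existing_tables]
    return found, missing


site_sources = {"orders_db", "default_db"}
reused = resolve_data_source("orders_db", site_sources, {"prod_orders"}, "default_db")
replaced = resolve_data_source("prod_orders", site_sources, {"prod_orders"}, "default_db")
found, missing = resolve_tables(["orders", "customers"], {"orders", "products"})
```

Tables reported as missing are candidates for the manual adjustment that follows in the debugging module.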
Preferably, the task visualization and debugging module comprises data source visualization, ETL visualization, gateway visualization, task scheduling visualization and task debugging.
Preferably, the data source visualization: a data source management interface is provided for viewing the data sources of the current system and their detailed configuration;
the ETL visualization: an ETL task management interface is provided for viewing the ETL tasks of the current system and their detailed configuration;
the gateway visualization: a gateway management interface is provided for viewing the module plug-ins and protocol details of the current system;
the task scheduling visualization: a task scheduling management interface is provided for viewing the tasks of the current system and the details of their scheduling history;
the task debugging: data and configuration are adjusted within the data docking task.
The migrated data docking task needs to be debugged. By previewing the current ETL data preparation, staff can see whether the data meets expectations and adjust it promptly when it does not. By trial-running the docking task, they can check in the gateway history whether the gateway executed correctly, and adjust the configuration after inspecting the failure details.
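A trial run that records each gateway call gives staff a history to inspect. The sketch below is illustrative only: the gateway is a stand-in function that rejects rows without an `amount`, and the history entry fields (`status`, `detail`, and so on) are hypothetical rather than taken from the patent.

```python
from datetime import datetime, timezone


def trial_run(task_name: str, send, rows: list, history: list) -> list:
    """Send each row through the gateway and record the outcome.

    `send` is any callable that raises on failure. Returns the failed
    history entries so their details can be inspected.
    """
    failures = []
    for row in rows:
        entry = {
            "task": task_name,
            "time": datetime.now(timezone.utc).isoformat(),
            "row": row,
        }
        try:
            send(row)
            entry["status"] = "ok"
        except Exception as exc:  # record the failure instead of aborting the run
            entry["status"] = "failed"
            entry["detail"] = f"{type(exc).__name__}: {exc}"
            failures.append(entry)
        history.append(entry)
    return failures


def fake_gateway(row: dict) -> None:
    """Stand-in for the real reporting interface."""
    if "amount" not in row:
        raise ValueError("amount is required")


gateway_history = []
failed = trial_run(
    "daily_orders",
    fake_gateway,
    [{"company": "A", "amount": 15}, {"company": "B"}],
    gateway_history,
)
```

The `detail` field of each failed entry plays the role of the failure details that guide the configuration adjustment.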
ETL comprises extraction, transformation and loading steps: the ETL tool processes data sources, text files and other files to generate temporary data, then collates the data and transfers it to the target database.
ETL is responsible for extracting data from distributed, heterogeneous data sources, such as relational data and flat data files, into a temporary intermediate layer, then cleaning, transforming and integrating it, and finally loading it into a data warehouse or data mart, where it forms the basis for online analytical processing and data mining.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing has shown and described the basic principles and main features of the present invention and the advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined in the appended claims. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (10)

1. A method for simplifying data docking task migration, characterized by comprising a task configuration description module, a task migration correction module, and a task visualization and debugging module;
the task visualization and debugging module is used to deploy the docking system and to configure and debug the related component modules; after debugging in this module, the migration data is described and corrected, so that task migration is fast.
2. The method for simplifying data docking task migration according to claim 1, wherein: the task configuration description module comprises a data source description, an ETL task description, a gateway protocol description and a task scheduling description.
3. The method for simplifying data docking task migration according to claim 2, wherein: the data source description is a structured description of the connections to the various types of data source;
the ETL task description is a structured description of the data processing built from operators such as grouping and filtering applied to the tables of each data source;
the gateway protocol description is a structured description of the docking interface;
the task scheduling description is a structured description of the scheduling time of each task and its dependencies.
4. The method for simplifying data docking task migration according to claim 3, wherein: the data sources include MySQL, Oracle, Hive, PostgreSQL, MongoDB, API sources and Kafka sources;
the docking interface description covers headers, authorization, proxy and parameter changes.
5. The method for simplifying data docking task migration according to claim 1, wherein: the task migration correction includes variable replacement, metadata check and missing-item completion.
6. The method for simplifying data docking task migration according to claim 5, wherein: the variable replacement: in the description file, some variables are replaced with the new values that apply after migration;
the metadata check: after migration, the tables required by the ETL tasks are compared against the corresponding table metadata in the new data source;
the missing-item completion: following the idea of productized migration, some migrated items need new values, and these are filled in by rules.
7. The method for simplifying data docking task migration according to claim 1, wherein: the task visualization and debugging module comprises data source visualization, ETL visualization, gateway visualization, task scheduling visualization and task debugging.
8. The method for simplifying data docking task migration according to claim 7, wherein: the data source visualization: a data source management interface is provided for viewing the data sources of the current system and their detailed configuration;
the ETL visualization: an ETL task management interface is provided for viewing the ETL tasks of the current system and their detailed configuration;
the gateway visualization: a gateway management interface is provided for viewing the module plug-ins and protocol details of the current system;
the task scheduling visualization: a task scheduling management interface is provided for viewing the tasks of the current system and the details of their scheduling history;
the task debugging: data and configuration are adjusted within the data docking task.
9. The method for simplifying data docking task migration according to claim 1, wherein: ETL comprises extraction, transformation and loading steps, and the ETL tool processes data sources, text files and other files to generate temporary data, then collates the data and transfers it to the target database.
10. The method for simplifying data docking task migration according to claim 9, wherein: ETL is responsible for extracting data from distributed, heterogeneous data sources, such as relational data and flat data files, into a temporary intermediate layer, then cleaning, transforming and integrating it, and finally loading it into a data warehouse or data mart, where it forms the basis for online analytical processing and data mining.
CN202310609299.XA 2023-05-29 2023-05-29 Method for simplifying migration of data docking task Pending CN116339822A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310609299.XA CN116339822A (en) 2023-05-29 2023-05-29 Method for simplifying migration of data docking task

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310609299.XA CN116339822A (en) 2023-05-29 2023-05-29 Method for simplifying migration of data docking task

Publications (1)

Publication Number Publication Date
CN116339822A (en) 2023-06-27

Family

ID=86882689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310609299.XA Pending CN116339822A (en) 2023-05-29 2023-05-29 Method for simplifying migration of data docking task

Country Status (1)

Country Link
CN (1) CN116339822A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111125058A (en) * 2019-12-06 2020-05-08 浪潮电子信息产业股份有限公司 Data migration method, device and system
CN111125065A (en) * 2019-12-24 2020-05-08 阳光人寿保险股份有限公司 Visual data synchronization method, system, terminal and computer readable storage medium
CN114048188A (en) * 2021-11-16 2022-02-15 金现代信息产业股份有限公司 Cross-database data migration system and method

Similar Documents

Publication Publication Date Title
CN109684053B (en) Task scheduling method and system for big data
CN107958057B (en) Code generation method and device for data migration in heterogeneous database
Fan et al. Migrating monolithic mobile application to microservice architecture: An experiment report
CN107368503B (en) Data synchronization method and system based on button
CN103914526B (en) A kind of interface method and device for SAP ERP systems and ORACLE ERP systems
US10523502B2 (en) Method and system for configuration of devices of a control system
CN103019874B (en) Method and the device of abnormality processing is realized based on data syn-chronization
EP2482192A1 (en) Testing lifecycle
CN105700888A (en) Visualization rapid developing platform based on jbpm workflow engine
CN101819529A (en) System and method for realizing visual development of workflow task interface
CN110019138B (en) Automatic transfer table space migration method and system based on Zabbix
CN114741375A (en) Rapid and automatic data migration system and method for multi-source heterogeneous database
US8140958B2 (en) Cyclical and synchronized multi-source spreadsheet imports and exports
CN110782225A (en) Workflow dynamic reconstruction method for parameter value influence flow branch
CN114048188A (en) Cross-database data migration system and method
CN110780854A (en) APP automatic integration platform system and method based on IOS system
Braun et al. A methodology for the detection of functional relations of mechatronic components and assemblies in brownfield systems
CN111930862B (en) SQL interactive analysis method and system based on big data platform
CN117762865A (en) Data lake entering method and system of big data platform
CN112667469A (en) Method, system and readable medium for automatically generating diversified big data statistical report
CN116339822A (en) Method for simplifying migration of data docking task
CN117333155A (en) Fault information processing method, system, server and client
CN112232655A (en) Electronic equipment digital production management system based on MBOM baseline
JP2006059108A (en) Support system for development test of information system
CN101458628A (en) Program edition management method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20230627