CN113626514A - Automatic data loading method and device - Google Patents

Automatic data loading method and device

Info

Publication number
CN113626514A
Authority
CN
China
Prior art keywords
data
loaded
information
configuration file
template
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111184802.9A
Other languages
Chinese (zh)
Other versions
CN113626514B (en)
Inventor
陈晓希
史晨阳
王磊
刘淼
彭强
薛淇升
吴晓刚
张翠
何如意
侯强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Everbright Bank Co Ltd
Original Assignee
China Everbright Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Everbright Bank Co Ltd
Priority to CN202111184802.9A
Publication of CN113626514A
Application granted
Publication of CN113626514B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/235 Update request formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The invention discloses an automatic data loading method and device, comprising the following steps: calling a configuration file at least containing attribute information of a data table to be loaded, and importing first information in the configuration file into the SQL template based on a corresponding relation between the pre-generated SQL template and the configuration file; importing second information in the configuration file into the HIVE template based on a corresponding relation between the HIVE template and the configuration file which is generated in advance; the first information and the second information at least comprise attribute information of a data table to be loaded; importing data contained in a data table to be loaded into a big data platform through an HIVE template; recording information in the SQL template into the big data platform; the information in the SQL template at least comprises attribute information of a data table to be loaded. Therefore, the method for automatically loading the data to the big data platform is realized, the accuracy of data loading is improved, and the error rate is reduced.

Description

Automatic data loading method and device
Technical Field
The present invention relates to the field of data processing, and in particular, to an automated data loading method and apparatus.
Background
With the continuous expansion of banking business, the data volume of the banking system will increase continuously, and in order to ensure that the data of the banking system can be reliably and stably stored, the data of the banking system is usually stored in a big data platform.
Currently, a manual loading mode is usually adopted to load data in a bank system into a big data platform, but the manual loading mode easily causes errors, and for some important data, even small errors may cause immeasurable loss, so that a method capable of reducing error rate is urgently needed to achieve the purpose of loading business data of the bank system into the big data platform.
Disclosure of Invention
In view of this, the embodiment of the present invention discloses an automated data loading method and apparatus, which automatically implement a process of loading from a source system to a big data platform, reduce an error rate, and improve accuracy and loading efficiency of data loading.
The embodiment of the invention discloses an automatic data loading method, which comprises the following steps:
calling a configuration file in response to a data loading instruction; the configuration file at least comprises attribute information of a data table to be loaded;
importing first information in a configuration file into an SQL template based on a corresponding relation between the SQL template and the configuration file generated in advance; the first information at least comprises attribute information of a data table to be loaded;
importing second information in the configuration file into the HIVE template based on a corresponding relation between the HIVE template and the configuration file which is generated in advance; the second information at least comprises attribute information of a data table to be loaded;
importing the data contained in the data table to be loaded into a big data platform through the HIVE template;
recording the information in the SQL template into the big data platform; the information in the SQL template at least comprises attribute information of a data table to be loaded.
Optionally, the importing the first information in the configuration file into the SQL template based on a correspondence between the pre-generated SQL template and the configuration file includes:
traversing the configuration file, and when first target information in the configuration file is detected, determining a target SQL statement corresponding to the first target information based on the corresponding relation between the SQL template and the configuration file; and replacing the target SQL statement in the SQL template by the first target information.
Optionally, the importing, based on a correspondence between a pre-generated HIVE template and a configuration file, second information in the configuration file into the HIVE template includes:
traversing the configuration file, and when second target information in the configuration file is detected, determining a target HIVE statement corresponding to the second target information based on the corresponding relation between the HIVE template and the configuration file;
and replacing the target HIVE statement in the HIVE template with the second target information.
Optionally, the importing, by the HIVE template, data included in the data table to be loaded into a big data platform includes:
creating an external table in a database of a big data platform, and importing data in a data table to be loaded into the external table through attribute information of the data table to be loaded in the HIVE template;
creating an internal table in the database;
and importing the data of the data table in the external table into the internal table.
Optionally, the configuration file further includes:
identification information of the big data platform corresponding to the data table to be loaded.
Optionally, the method further includes:
the first information imported into the SQL template also comprises identification information of a big data platform corresponding to the data table to be loaded.
Optionally, before importing the data included in the data table to be loaded to the big data platform, the method further includes:
and determining a target big data platform into which the data to be loaded needs to be imported based on the identification information of the big data platform in the configuration file.
Optionally, the configuration file and the first information further include:
an authorization field characterizing usage rights of the data table to be loaded.
Optionally, the configuration file further includes a scene type of the data table to be loaded, where the scene type of the data table to be loaded at least includes:
a full table, an incremental table, a full up-to-date table, and a zipper table.
The first information imported into the SQL template comprises a scene type of a data table to be loaded;
the second information imported into the HIVE template includes the scene type of the data table to be loaded.
Optionally, the method further includes:
determining the scene type of a data table to be loaded through the HIVE template;
if the scene type of the data table to be loaded is a full table, importing the data in the data table to be loaded into a database of a big data platform, and replacing historical data of the data table stored in the big data platform;
if the scene type of the data table to be loaded is an incremental table, loading the data in the data table into a database of a big data platform;
if the scene type of the data table to be loaded is a full up-to-date table, matching the data in the data table to be loaded with the historical data stored in the big data platform, and loading into a database of the big data platform the data in the data table to be loaded that cannot be matched with the historical data stored in the big data platform;
and if the scene type of the data table to be loaded is a zipper table, determining the beginning and end time of the data table, and replacing the historical data corresponding to the beginning and end time stored in the big data platform according to the beginning and end time of the data table to be loaded.
The embodiment of the invention discloses an automatic data loading device, which comprises:
the calling unit is used for calling the configuration file in response to the data loading instruction; the configuration file at least comprises attribute information of a data table to be loaded;
the first import unit is used for importing first information in the configuration file into the SQL template based on the corresponding relation between the SQL template and the configuration file generated in advance; the first information at least comprises attribute information of a data table to be loaded;
the second import unit is used for importing second information in the configuration file into the HIVE template based on the corresponding relation between the HIVE template and the configuration file which is generated in advance; the second information at least comprises attribute information of a data table to be loaded;
the data loading unit is used for importing the data contained in the data table to be loaded into a big data platform through the HIVE template;
the data recording unit is used for recording the information in the SQL template into the big data platform; the information in the SQL template at least comprises attribute information of a data table to be loaded.
The embodiment of the invention discloses an automatic data loading method and a device, wherein the method comprises the following steps: calling a configuration file at least containing attribute information of a data table to be loaded, and importing first information in the configuration file into an SQL template based on a corresponding relation between the SQL template and the configuration file generated in advance; the first information at least comprises attribute information of a data table to be loaded; importing second information in the configuration file into the HIVE template based on a corresponding relation between the HIVE template and the configuration file which is generated in advance; the second information at least comprises attribute information of a data table to be loaded; importing the data contained in the data table to be loaded into a big data platform through the HIVE template; recording the information in the SQL template into the big data platform; the information in the SQL template at least comprises attribute information of a data table to be loaded. Therefore, the SQL sentences and the HIVE sentences are automatically generated by the method, and based on the method, the method for automatically loading data to the big data platform is realized, the accuracy of data loading is improved, and the error rate is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a flow chart illustrating an automated data loading method according to an embodiment of the present invention;
FIG. 2 shows a flow diagram of a method of automatically generating an SQL statement;
FIG. 3 illustrates a flow diagram of a method of automatically generating a HIVE statement;
FIG. 4 is a schematic flow chart of an automated data loading method provided in the present invention;
FIG. 5 is a schematic structural diagram illustrating an automated data loading apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a schematic flowchart of an automated data loading method provided in an embodiment of the present invention is shown, where in the embodiment, the method includes:
S101: calling a configuration file in response to a data loading instruction; the configuration file at least comprises attribute information of a data table to be loaded;
In this embodiment, the attribute information of the data table mainly comprises basic information, for example: the Chinese name and English name of the data table to be loaded, the source system to which the data table belongs, the Chinese names of the data table fields, the English names of the data table fields, the types of the data table fields, and the like.
Besides, the configuration file also comprises: an authorization field and/or identification information of the big data platform.
The authorization field represents the use permission of the data table to be loaded; the identification information of the big data platform is used for indicating a target big data platform to which the data table to be loaded needs to be stored.
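As an illustration only, the contents of such a configuration file could be held in memory as shown in the following Python sketch. The class names, field names and sample values (LoadConfig, FieldInfo, "platform_a" and so on) are assumptions made for this example and are not specified by the embodiment.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FieldInfo:
    """One field of the data table to be loaded (hypothetical layout)."""
    name_cn: str    # Chinese name of the data table field
    name_en: str    # English name of the data table field
    data_type: str  # type of the data table field


@dataclass
class LoadConfig:
    """Hypothetical in-memory form of the configuration file."""
    table_name_cn: str
    table_name_en: str
    source_system: str
    fields: List[FieldInfo] = field(default_factory=list)
    authorization: Optional[str] = None  # usage permission of the table
    platform_id: Optional[str] = None    # identifies the target big data platform


# Sample values are invented for the illustration.
config = LoadConfig(
    table_name_cn="客户信息表",
    table_name_en="customer_info",
    source_system="core_banking",
    fields=[
        FieldInfo("客户号", "cust_id", "STRING"),
        FieldInfo("开户日期", "open_date", "STRING"),
    ],
    authorization="risk_dept_read",
    platform_id="platform_a",
)
```

The authorization field and the platform identification are optional here because the embodiment lists them as "and/or" additions to the attribute information.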
An alarm time may be determined according to the data loading time of the data table to be loaded; for example, if the preset loading time is 4 o'clock and loading runs past 4 o'clock, an alarm reminder is issued.
S102: importing first information in a configuration file into an SQL template based on a corresponding relation between the SQL template and the configuration file generated in advance; the first information at least comprises attribute information of a data table to be loaded;
In this embodiment, data of the source system currently cannot be loaded directly into the database of the big data platform; the data can be loaded only by means of a tool or after corresponding conversion.
In this embodiment, loading from the source system to the big data platform is realized by converting the relevant information of the data table to be loaded into SQL statements and HIVE statements. In the prior art, the SQL statements and the HIVE statements are usually written manually, which is not only time-consuming and laborious but also error-prone, thereby affecting the loading of data.
In order to solve the above problem, in this embodiment, a method for automatically generating an SQL statement is provided:
presetting an SQL template, wherein the SQL template comprises written SQL sentences and missing parts; moreover, the corresponding relationship between the SQL template and the configuration file is preset, and further, the corresponding relationship between the SQL statement in the SQL template and the related information of the configuration file may be preset. By means of the preset corresponding relationship between the SQL template and the configuration file, the first information in the configuration file can be imported into the SQL template, or imported into the missing part of the SQL template, so that the SQL statement containing the related information of the configuration file is formed.
Specifically, referring to fig. 2, a flowchart of a method for automatically generating an SQL statement is shown, and the implementation method of S102 includes:
S201: traversing the configuration file, and when first target information in the configuration file is detected, determining a target SQL statement corresponding to the first target information based on the corresponding relation between the SQL template and the configuration file;
S202: and replacing the target SQL statement in the SQL template with the first target information.
In this embodiment, as can be seen from the above description, the SQL template includes written SQL statements and missing parts, where the missing parts are expressed by SQL statements, and the SQL statements of the missing parts have a corresponding relationship with related information in the configuration file. When first target information in the configuration file is detected, the missing-part SQL statement (target SQL statement) corresponding to the first target information is determined, and the missing-part SQL statement in the SQL template is replaced with the first target information. This constitutes an SQL statement containing the relevant information of the configuration file.
In this embodiment, the first information in the configuration file is imported into the SQL template through the pre-generated corresponding relationship between the SQL template and the configuration file, so that the purpose of automatically generating the SQL statement is achieved.
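The traversal and replacement of S201 and S202 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the ${...} marker syntax, the table_registry table and the SQL_CORRESPONDENCE mapping are all hypothetical, and the embodiment does not prescribe how the missing parts of the template are written.

```python
# Hypothetical SQL template; the ${...} markers stand for the "missing parts".
SQL_TEMPLATE = (
    "INSERT INTO table_registry (table_name_en, table_name_cn, source_system) "
    "VALUES ('${TABLE_EN}', '${TABLE_CN}', '${SOURCE}');"
)

# Hypothetical correspondence between configuration keys and template markers.
SQL_CORRESPONDENCE = {
    "table_name_en": "${TABLE_EN}",
    "table_name_cn": "${TABLE_CN}",
    "source_system": "${SOURCE}",
}


def fill_template(template: str, config: dict, correspondence: dict) -> str:
    """Traverse the configuration and substitute every mapped marker."""
    result = template
    for key, value in config.items():
        marker = correspondence.get(key)
        if marker is None:
            continue  # this configuration entry has no counterpart in the template
        result = result.replace(marker, str(value))
    return result


config = {
    "table_name_en": "customer_info",
    "table_name_cn": "Customer information",
    "source_system": "core_banking",
    "comment": "not mapped, so it is skipped",
}
sql = fill_template(SQL_TEMPLATE, config, SQL_CORRESPONDENCE)
```

Plain string substitution is used here only to mirror the replacement step; a production system would need to validate or escape the configuration values before building SQL from them.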
It should be noted that the first target information imported from the configuration file into the SQL template includes the attribute information of the data table to be loaded and, in addition to the attribute information, may include an authorization field and identification information of the big data platform.
The first target information imported from the configuration file into the SQL template may include other information besides the above-described information, which is not limited in this embodiment.
S103: importing second information in the configuration file into the HIVE template based on a corresponding relation between the HIVE template and the configuration file which is generated in advance; the second information at least comprises attribute information of a data table to be loaded;
In order to solve the above problem, the present embodiment further provides a method for automatically generating a HIVE statement, where a HIVE template is preset, and the HIVE template includes written HIVE statements and missing parts; moreover, the corresponding relationship between the HIVE template and the configuration file is preset, and further, the corresponding relationship between the HIVE statements in the HIVE template and the configuration file may be preset, and the second information in the configuration file may be imported into the HIVE template based on the corresponding relationship, so that a HIVE statement including the related information of the configuration file is generated.
Specifically, referring to FIG. 3, a flowchart of a method for automatically generating a HIVE statement is shown, which includes:
S301: traversing the configuration file, and when second target information in the configuration file is detected, determining a target HIVE statement corresponding to the second target information based on the corresponding relation between the HIVE template and the configuration file;
S302: and replacing the target HIVE statement in the HIVE template with the second target information.
In this embodiment, as can be known from the above description, the HIVE template includes written HIVE statements and missing parts, where the missing parts are represented by the HIVE statements, and the HIVE statements of the missing parts have a corresponding relationship with related information in the configuration file. When second target information in the configuration file is detected, determining a missing part HIVE statement (target HIVE statement) corresponding to the second target information, and replacing the missing part HIVE statement in the HIVE template with the second target information. This constitutes a HIVE statement containing the relevant information of the configuration file.
In this embodiment, the second information in the configuration file is imported into the HIVE template through the pre-generated corresponding relationship between the HIVE template and the configuration file, so that the purpose of automatically generating the HIVE statement is achieved.
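For the HIVE template, the same replacement idea can be sketched with a field list that is expanded into column definitions. The template text, the ${...} markers, the "_ext" suffix and the sample path below are assumptions for illustration; only the general HiveQL form of CREATE EXTERNAL TABLE is taken as given.

```python
# Hypothetical HIVE template with three missing parts.
HIVE_TEMPLATE = (
    "CREATE EXTERNAL TABLE IF NOT EXISTS ${TABLE_EN}_ext (\n"
    "${COLUMNS}\n"
    ")\n"
    "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
    "LOCATION '${LOCATION}'"
)


def render_columns(fields):
    """Turn (English name, type, Chinese name) triples into column definitions."""
    return ",\n".join(
        f"  {name_en} {col_type} COMMENT '{name_cn}'"
        for name_en, col_type, name_cn in fields
    )


def fill_hive_template(template, table_en, fields, location):
    """Replace each missing part of the template with configuration values."""
    replacements = {
        "${TABLE_EN}": table_en,
        "${COLUMNS}": render_columns(fields),
        "${LOCATION}": location,
    }
    for marker, value in replacements.items():
        template = template.replace(marker, value)
    return template


hql = fill_hive_template(
    HIVE_TEMPLATE,
    "customer_info",
    [("cust_id", "STRING", "客户号"), ("open_date", "STRING", "开户日期")],
    "/data/landing/customer_info",  # invented landing directory
)
```

Here the field attributes from the configuration file (English name, type, Chinese name) become the column list of the generated statement, which is one plausible reading of "importing the second information into the HIVE template".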
It should be noted that the second target information imported from the configuration file into the HIVE template includes the attribute information of the data table to be loaded and, in addition to the attribute information, may include an authorization field and identification information of the big data platform.
The second target information imported from the configuration file into the HIVE template may include other information besides the above-described information, which is not limited in this embodiment.
S104: importing the data contained in the data table to be loaded into a big data platform through the HIVE template;
S105: recording the information in the SQL template into the big data platform; the information in the SQL template at least comprises attribute information of a data table to be loaded.
In this embodiment, the process of recording data by the big data platform includes two parts: storing the original data contained in the data table on the big data platform, and recording the relevant information of the data table on the big data platform. The purpose of loading the original data contained in the data table is achieved through the generated HIVE template, and the related information of the data table is recorded in the big data platform through the generated SQL template.
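The second part, recording the relevant information of the data table, might look like the following sketch. It uses Python's built-in sqlite3 module purely as a stand-in for the metadata store of the big data platform; the table_registry table and its columns are hypothetical, and the embodiment does not say which database receives the SQL.

```python
import sqlite3

# In-memory database standing in for the platform's metadata store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_registry ("
    "table_name_en TEXT, table_name_cn TEXT, source_system TEXT, authorization_tag TEXT)"
)


def record_table_info(connection, attrs):
    """Record the attribute information of a loaded data table."""
    connection.execute(
        "INSERT INTO table_registry "
        "(table_name_en, table_name_cn, source_system, authorization_tag) "
        "VALUES (?, ?, ?, ?)",
        (
            attrs["table_name_en"],
            attrs["table_name_cn"],
            attrs["source_system"],
            attrs.get("authorization"),  # optional authorization field
        ),
    )
    connection.commit()


record_table_info(
    conn,
    {
        "table_name_en": "customer_info",
        "table_name_cn": "客户信息表",
        "source_system": "core_banking",
        "authorization": "risk_dept_read",
    },
)
rows = conn.execute("SELECT table_name_en, authorization_tag FROM table_registry").fetchall()
```

Parameter binding is used in this sketch instead of textual template replacement; that is a design choice of the example, not something the embodiment describes.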
In this embodiment, importing the data contained in the data table into a big data platform through the HIVE template may be implemented in a variety of ways, which is not limited in this embodiment, and may be, for example, the following method:
creating an external table in a database of a big data platform, and importing data in a data table to be loaded into the external table through attribute information of the data table to be loaded in the HIVE template;
creating an internal table in the database;
and importing the data of the data table in the external table into the internal table.
After the internal table is created, the internal table may be partitioned to enable ordered data storage, and the partitioning method is not limited in this embodiment.
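The three steps above (external table, internal table, copy from external to internal) can be sketched as a Python helper that only builds the HiveQL text. The "_ext" suffix, the load_date partition column, the ORC storage format and the comma delimiter are assumptions for this example; the embodiment leaves the partitioning method open.

```python
def build_load_statements(db, table, columns, location, partition_value):
    """Return HiveQL for: external table, internal table, external-to-internal copy."""
    col_defs = ", ".join(f"{name} {col_type}" for name, col_type in columns)
    col_names = ", ".join(name for name, _ in columns)
    return [
        # Step 1: external table over the files delivered by the source system.
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {db}.{table}_ext ({col_defs}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '{location}'",
        # Step 2: internal (managed) table, partitioned by an assumed load date.
        f"CREATE TABLE IF NOT EXISTS {db}.{table} ({col_defs}) "
        f"PARTITIONED BY (load_date STRING) STORED AS ORC",
        # Step 3: copy the data of the external table into the internal table.
        f"INSERT OVERWRITE TABLE {db}.{table} "
        f"PARTITION (load_date='{partition_value}') "
        f"SELECT {col_names} FROM {db}.{table}_ext",
    ]


statements = build_load_statements(
    "ods",
    "customer_info",
    [("cust_id", "STRING"), ("open_date", "STRING")],
    "/data/landing/customer_info",
    "2021-10-12",
)
```

The helper returns strings rather than executing them, so the generated statements can be reviewed or passed to whichever Hive client the platform uses.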
In the embodiment, a configuration file at least containing attribute information of a data table to be loaded is called, and first information in the configuration file is imported into an SQL template based on a corresponding relation between the SQL template and the configuration file generated in advance; the first information at least comprises attribute information of a data table to be loaded; importing second information in the configuration file into the HIVE template based on a corresponding relation between the HIVE template and the configuration file which is generated in advance; the second information at least comprises attribute information of a data table to be loaded; importing the data contained in the data table to be loaded into a big data platform through the HIVE template; recording the information in the SQL template into the big data platform; the information in the SQL template at least comprises attribute information of a data table to be loaded. Therefore, the SQL sentences and the HIVE sentences are automatically generated by the method, and based on the method, the method for automatically loading data to the big data platform is realized, the accuracy of data loading is improved, and the error rate is reduced.
In this embodiment, the big data platform includes multiple different types, the function implemented by each type of big data platform is different, the big data platforms required by different data may also be different, and in order to implement data loading to different big data platforms automatically, in this embodiment, the configuration file further includes identification information of the big data platform corresponding to the data table to be loaded.
Before loading data to a big data platform, identification information of the big data platform corresponding to a data table to be loaded in a configuration file needs to be identified, and a target big data platform corresponding to the data table is determined based on the identification information of the big data platform in the identified configuration file.
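A minimal sketch of this lookup, assuming the identification information is a short string key: the PLATFORMS registry, its host names and its database names are invented for the illustration.

```python
# Hypothetical registry of big data platforms, keyed by identification information.
PLATFORMS = {
    "platform_a": {"host": "hive-a.example.internal", "port": 10000, "database": "ods"},
    "platform_b": {"host": "hive-b.example.internal", "port": 10000, "database": "dw"},
}


def resolve_platform(config: dict) -> dict:
    """Determine the target big data platform from the configuration file."""
    platform_id = config.get("platform_id")
    if platform_id is None:
        raise ValueError("configuration has no big data platform identification")
    try:
        return PLATFORMS[platform_id]
    except KeyError:
        raise ValueError(f"unknown big data platform: {platform_id!r}") from None


target = resolve_platform({"table_name_en": "customer_info", "platform_id": "platform_b"})
```

Failing loudly on a missing or unknown identifier keeps a misconfigured table from being loaded into the wrong platform.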
In this embodiment, when the information in the configuration file is imported into the SQL template, in addition to importing the information of the data table to be loaded, the identification information of the big data platform corresponding to the data table to be loaded may also be imported into the SQL template.
According to the above description, the corresponding relation between the SQL statements in the SQL template and the related information in the configuration file is preset; besides corresponding to the attribute information of the data table to be loaded, the SQL statements also have a corresponding relation with the identification information of the big data platform.
Furthermore, the SQL statements in the SQL template have a preset corresponding relation with the authorization field.
In this embodiment, the data table includes different scene types, and the different scene types correspond to different loading manners, and in order to adapt to loading of the data table of each scene type, in this embodiment, the scene types of the different data tables are set in the configuration file. In this way, in the process of executing data loading, the data loading process may be executed based on the scene type information of the data table contained in the configuration file:
wherein the scene type of the data table at least comprises: a full table, an incremental table, a full up-to-date table, and a zipper table.
Before executing data loading, the method further comprises the following steps:
importing the scene type of the data table to be loaded into the SQL template;
and importing the scene type of the data table to be loaded into the HIVE template.
That is, the first information imported into the SQL template includes the scene type of the data table to be loaded; the second information imported into the HIVE template includes the scene type of the data table to be loaded.
For the scene type of the full table, all data in the data table is loaded into the big data platform and replaces the original data of the data table in the big data platform;
for the scene type of the incremental table, the data in the data table is added to the big data platform;
for the scene type of the full up-to-date table, the newly added data in the data table is loaded into the big data platform;
and for the scene type of the zipper table, part of the data in the data table is extracted and loaded into the big data platform.
In order to implement loading of data tables of different scene types, in this embodiment, the method further includes: and importing the types of the data tables to be loaded in the configuration file into the SQL template and the HIVE template.
With reference to fig. 4, a further flowchart of an automated data loading method provided in the present invention is shown for the above scenario types of the data table, including:
S401: determining the scene type of a data table to be loaded through the HIVE template;
S402: if the scene type of the data table to be loaded is a full table, importing the data in the data table to be loaded into a database of a big data platform, and replacing historical data of the data table stored in the big data platform;
S403: if the scene type of the data table to be loaded is an incremental table, loading the data in the data table into a database of a big data platform;
S404: if the scene type of the data table to be loaded is a full up-to-date table, matching the data in the data table to be loaded with the historical data stored in the big data platform, and loading into a database of the big data platform the data in the data table to be loaded that cannot be matched with the historical data stored in the big data platform;
S405: and if the scene type of the data table to be loaded is a zipper table, determining the beginning and end time of the data table, and replacing the historical data corresponding to the beginning and end time stored in the big data platform according to the beginning and end time of the data table to be loaded.
In this embodiment, different data loading modes are implemented by the pre-labeled scene type of the data table to be loaded.
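The branching of S401 to S405 can be illustrated with a small in-memory Python sketch in which lists of rows stand in for the platform's stored data. Several points are assumptions of the example: the scene type labels, matching rows by an "id" key for the full up-to-date table, and treating the zipper table as replacing stored rows whose "date" falls within the beginning and end time.

```python
def load_by_scene(scene, stored, incoming, key="id", period=None):
    """Return the rows the platform would hold after loading `incoming`."""
    if scene == "full":
        # Replace all historical data of the table.
        return list(incoming)
    if scene == "incremental":
        # Add the incoming data to what is already stored.
        return stored + list(incoming)
    if scene == "full_latest":
        # Load only rows that do not match stored history (matched on `key`).
        known = {row[key] for row in stored}
        return stored + [row for row in incoming if row[key] not in known]
    if scene == "zipper":
        # Replace stored rows inside the table's beginning and end time.
        start, end = period
        kept = [row for row in stored if not (start <= row["date"] <= end)]
        return kept + list(incoming)
    raise ValueError(f"unknown scene type: {scene!r}")


stored = [{"id": 1, "date": "2021-10-01"}, {"id": 2, "date": "2021-10-05"}]
incoming = [{"id": 2, "date": "2021-10-05"}, {"id": 3, "date": "2021-10-06"}]

after_full = load_by_scene("full", stored, incoming)
after_incremental = load_by_scene("incremental", stored, incoming)
after_latest = load_by_scene("full_latest", stored, incoming)
after_zipper = load_by_scene("zipper", stored, incoming, period=("2021-10-05", "2021-10-06"))
```

Dates are ISO-formatted strings so that ordinary string comparison orders them correctly; a real loader would express each branch as generated HIVE statements rather than list operations.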
Referring to fig. 5, a schematic structural diagram of an automated data loading apparatus according to an embodiment of the present invention is shown, in this embodiment, the apparatus includes:
a calling unit 501, configured to call a configuration file in response to a data loading instruction; the configuration file at least comprises attribute information of a data table to be loaded;
a first import unit 502, configured to import first information in the configuration file into an SQL template based on a pre-generated correspondence between the SQL template and the configuration file; the first information at least comprises attribute information of a data table to be loaded;
a second import unit 503, configured to import second information in the configuration file into a HIVE template based on a pre-generated correspondence between the HIVE template and the configuration file; the second information at least comprises attribute information of a data table to be loaded;
a data loading unit 504, configured to import, through the HIVE template, data included in the data table to be loaded into a big data platform;
a data recording unit 505, configured to record information in the SQL template into the big data platform; the information in the SQL template at least comprises attribute information of a data table to be loaded.
Optionally, the first import unit includes:
the first traversal unit is used for traversing the configuration file, and when first target information in the configuration file is detected, determining a target SQL statement corresponding to the first target information based on the corresponding relation between the SQL template and the configuration file;
and the first replacing unit is used for replacing the target SQL statement in the SQL template with the first target information.
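The traversal and replacement performed by these two subunits can be illustrated with a short Python sketch. The `${...}` placeholder syntax, the `CORRESPONDENCE` mapping, the `load_registry` table, and the function name `fill_template` are hypothetical choices made for illustration; the embodiment does not specify how the correspondence is stored.

```python
from typing import Dict

# Hypothetical correspondence: configuration field -> placeholder in the template.
CORRESPONDENCE: Dict[str, str] = {
    "table_name": "${TABLE_NAME}",
    "columns": "${COLUMNS}",
}

# Hypothetical SQL template that records attribute information of the table.
SQL_TEMPLATE = (
    "INSERT INTO load_registry (table_name, columns) "
    "VALUES ('${TABLE_NAME}', '${COLUMNS}');"
)


def fill_template(template: str, config: Dict[str, object],
                  correspondence: Dict[str, str]) -> str:
    """Traverse the configuration and substitute each mapped field."""
    filled = template
    for field, value in config.items():
        placeholder = correspondence.get(field)
        if placeholder is None:
            # Field has no counterpart in this template; skip it.
            continue
        filled = filled.replace(placeholder, str(value))
    return filled
```

For example, `fill_template(SQL_TEMPLATE, {"table_name": "t_orders", "columns": "id,amount"}, CORRESPONDENCE)` yields a complete INSERT statement. The same function would serve the HIVE template with a different correspondence. Plain string substitution performs no escaping, so a production version would need to validate or quote the configuration values.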
Optionally, the second import unit includes:
the second traversal unit is used for traversing the configuration file, and when second target information in the configuration file is detected, determining a target HIVE statement corresponding to the second target information based on the corresponding relation between the HIVE template and the configuration file;
and the second replacing unit is used for replacing the target HIVE statement in the HIVE template with the second target information.
Optionally, the data loading unit includes:
the data loading subunit is used for creating an external table in a database of a big data platform, and importing the data in the data table to be loaded into the external table through the attribute information of the data table to be loaded in the HIVE template;
an internal table creating subunit configured to create an internal table in the database;
and the data importing subunit is used for importing the data of the data table in the external table into the internal table.
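The external-table-then-internal-table sequence can be sketched as a small Python helper that emits HiveQL statements. The function name, the `_ext` suffix, the delimited-text source format, the ORC storage format, and the example path are assumptions for illustration; the embodiment only states that an external table is created and filled first, and that its data is then imported into an internal table.

```python
from typing import Dict, List


def build_hive_statements(table: str, columns: Dict[str, str],
                          location: str, delimiter: str = ",") -> List[str]:
    """Build the three HiveQL statements for a two-stage load."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns.items())
    external = f"{table}_ext"  # hypothetical naming convention
    return [
        # 1. External table over the files of the data table to be loaded.
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {external} ({cols}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        f"LOCATION '{location}'",
        # 2. Internal (managed) table with the same columns.
        f"CREATE TABLE IF NOT EXISTS {table} ({cols}) STORED AS ORC",
        # 3. Copy the data from the external table into the internal table.
        f"INSERT INTO TABLE {table} SELECT * FROM {external}",
    ]
```

Calling `build_hive_statements("orders", {"id": "BIGINT", "amount": "DOUBLE"}, "/data/orders")` returns the statements in execution order. The column names and types are the attribute information that the HIVE template receives from the configuration file.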
Optionally, the configuration file further includes:
identification information of the big data platform corresponding to the data table to be loaded.
Optionally, the first information imported into the SQL template further includes identification information of a big data platform corresponding to the data table to be loaded.
Optionally, the configuration file further includes a scene type of the data table to be loaded, and the scene type of the data table to be loaded at least includes:
a full table, an increment table, a full latest table, and a zipper table;
the first information imported into the SQL template includes the scene type of the data table to be loaded;
the second information imported into the HIVE template includes the scene type of the data table to be loaded.
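As an illustration of what such a configuration file might carry, the following Python sketch models one entry as a dataclass. The field names and the scene-type labels are assumptions; the embodiment only requires attribute information of the data table, optionally the identification information of the big data platform, and the scene type.

```python
from dataclasses import dataclass
from typing import Dict

# Labels assumed here for the four scene types named in the description.
SCENE_TYPES = {"full", "increment", "full_latest", "zipper"}


@dataclass(frozen=True)
class LoadConfig:
    table_name: str          # attribute information: table name
    columns: Dict[str, str]  # attribute information: column name -> type
    platform_id: str         # identification of the target big data platform
    scene_type: str          # one of SCENE_TYPES

    def __post_init__(self) -> None:
        # Reject scene types that the loader would not know how to handle.
        if self.scene_type not in SCENE_TYPES:
            raise ValueError(f"unknown scene type: {self.scene_type!r}")
```

Validating the scene type when the configuration is read means that an unsupported value is reported before any template is filled in or any data is moved.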
Optionally, the apparatus further includes:
the scene type determining unit is used for determining the scene type of the data table to be loaded through the HIVE template;
the full table loading unit is used for importing the data in the data table to be loaded into a database of the big data platform and replacing the historical data of that data table stored in the big data platform if the scene type of the data table to be loaded is a full table;
the increment table loading unit is used for loading the data in the data table to be loaded into a database of the big data platform if the scene type of the data table to be loaded is an increment table;
the full latest table loading unit is used for matching the data in the data table to be loaded with the historical data stored in the big data platform if the scene type of the data table to be loaded is a full latest table, and loading into a database of the big data platform the data in the data table to be loaded that cannot be matched with the stored historical data;
and the zipper table loading unit is used for determining the start and end times of the data table to be loaded if the scene type of the data table to be loaded is a zipper table, and replacing the historical data stored in the big data platform that corresponds to those start and end times.
The embodiment of the invention discloses an automated data loading apparatus that calls a configuration file containing at least attribute information of a data table to be loaded; imports first information in the configuration file into an SQL template based on a pre-generated correspondence between the SQL template and the configuration file, the first information at least comprising attribute information of the data table to be loaded; imports second information in the configuration file into a HIVE template based on a pre-generated correspondence between the HIVE template and the configuration file, the second information at least comprising attribute information of the data table to be loaded; imports the data contained in the data table to be loaded into a big data platform through the HIVE template; and records the information in the SQL template into the big data platform, the information in the SQL template at least comprising attribute information of the data table to be loaded. The SQL statements and HIVE statements are thus generated automatically, so that data is loaded into the big data platform automatically, which improves the accuracy of data loading and reduces the error rate.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

1. An automated data loading method, comprising:
calling a configuration file in response to a data loading instruction; the configuration file at least comprises attribute information of a data table to be loaded;
importing first information in a configuration file into an SQL template based on a corresponding relation between the SQL template and the configuration file generated in advance; the first information at least comprises attribute information of a data table to be loaded;
importing second information in the configuration file into the HIVE template based on a corresponding relation between the HIVE template and the configuration file which is generated in advance; the second information at least comprises attribute information of a data table to be loaded;
importing the data contained in the data table to be loaded into a big data platform through the HIVE template;
recording the information in the SQL template into the big data platform; the information in the SQL template at least comprises attribute information of a data table to be loaded.
2. The method according to claim 1, wherein the importing the first information in the configuration file into the SQL template based on the correspondence between the pre-generated SQL template and the configuration file comprises:
traversing the configuration file, and when first target information in the configuration file is detected, determining a target SQL statement corresponding to the first target information based on the corresponding relation between the SQL template and the configuration file;
and replacing the target SQL statement in the SQL template by the first target information.
3. The method according to claim 1, wherein the importing the second information in the configuration file into the HIVE template based on a correspondence between the pre-generated HIVE template and the configuration file comprises:
traversing the configuration file, and when second target information in the configuration file is detected, determining a target HIVE statement corresponding to the second target information based on the corresponding relation between the HIVE template and the configuration file;
and replacing the target HIVE statement in the HIVE template with the second target information.
4. The method according to claim 1, wherein the importing, through the HIVE template, the data contained in the data table to be loaded into a big data platform includes:
creating an external table in a database of a big data platform, and importing data in a data table to be loaded into the external table through attribute information of the data table to be loaded in the HIVE template;
creating an internal table in the database;
and importing the data of the data table in the external table into the internal table.
5. The method of claim 1, wherein the configuration file further comprises:
identification information of the big data platform corresponding to the data table to be loaded.
6. The method according to claim 1 or 3,
the first information imported into the SQL template also comprises identification information of a big data platform corresponding to the data table to be loaded.
7. The method according to claim 6, wherein before importing the data contained in the data table to be loaded into the big data platform, the method further comprises:
and determining a target big data platform into which the data to be loaded needs to be imported based on the identification information of the big data platform in the configuration file.
8. The method of claim 1, wherein the configuration file and the first information further comprise:
an authorization field characterizing usage rights of the data table to be loaded.
9. The method according to claim 1, wherein the configuration file further includes a scene type of the data table to be loaded, and the scene type of the data table to be loaded at least includes:
a full table, an increment table, a full latest table and a zipper table;
the first information imported into the SQL template comprises a scene type of a data table to be loaded;
the second information imported into the HIVE template includes the scene type of the data table to be loaded.
10. The method of claim 9, further comprising:
determining the scene type of a data table to be loaded through the HIVE template;
if the scene type of the data table to be loaded is a full table, importing the data in the data table to be loaded into a database of a big data platform, and replacing historical data of the data table stored in the big data platform;
if the scene type of the data table to be loaded is an increment table, loading the data in the data table into a database of a big data platform;
if the scene type of the data table to be loaded is a full latest table, matching the data in the data table to be loaded with the historical data stored in the big data platform, and loading into a database of the big data platform the data in the data table to be loaded that cannot be matched with the stored historical data;
and if the scene type of the data table to be loaded is a zipper table, determining the beginning and end time of the data table, and replacing the historical data corresponding to the beginning and end time stored in the big data platform according to the beginning and end time of the data table to be loaded.
11. An automated data loading apparatus, comprising:
the calling unit is used for calling the configuration file in response to the data loading instruction; the configuration file at least comprises attribute information of a data table to be loaded;
the first import unit is used for importing first information in the configuration file into the SQL template based on the corresponding relation between the SQL template and the configuration file generated in advance; the first information at least comprises attribute information of a data table to be loaded;
the second import unit is used for importing second information in the configuration file into the HIVE template based on the corresponding relation between the HIVE template and the configuration file which is generated in advance; the second information at least comprises attribute information of a data table to be loaded;
the data loading unit is used for importing the data contained in the data table to be loaded into a big data platform through the HIVE template;
the data recording unit is used for recording the information in the SQL template into the big data platform; the information in the SQL template at least comprises attribute information of a data table to be loaded.
CN202111184802.9A 2021-10-12 2021-10-12 Automatic data loading method and device Active CN113626514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111184802.9A CN113626514B (en) 2021-10-12 2021-10-12 Automatic data loading method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111184802.9A CN113626514B (en) 2021-10-12 2021-10-12 Automatic data loading method and device

Publications (2)

Publication Number Publication Date
CN113626514A true CN113626514A (en) 2021-11-09
CN113626514B CN113626514B (en) 2022-02-25

Family

ID=78391041

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111184802.9A Active CN113626514B (en) 2021-10-12 2021-10-12 Automatic data loading method and device

Country Status (1)

Country Link
CN (1) CN113626514B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015192651A1 (en) * 2014-06-17 2015-12-23 中兴通讯股份有限公司 Method for implementing rapid configuration and rapid configuration server
CN105243167A (en) * 2015-11-10 2016-01-13 中国建设银行股份有限公司 Data processing method and device
CN105760168A (en) * 2016-02-23 2016-07-13 深圳竹信科技有限公司 Automatic code file generation method and system
CN105808778A (en) * 2016-03-30 2016-07-27 中国银行股份有限公司 Method and device for extracting, transforming and loading mass data
CN107958028A (en) * 2017-11-16 2018-04-24 平安科技(深圳)有限公司 Method, apparatus, storage medium and the terminal of data acquisition

Also Published As

Publication number Publication date
CN113626514B (en) 2022-02-25

Similar Documents

Publication Publication Date Title
CN109656934B (en) Source Oracle database DDL synchronization method and device based on log analysis
CN106844307B (en) System and method for converting Excel into Word based on mark
KR20140009297A (en) Formatting data by example
US20140156603A1 (en) Method and an apparatus for splitting and recovering data in a power system
CN109471851B (en) Data processing method, device, server and storage medium
CN103390005A (en) Method and system for merging documents
CN101859303A (en) Metadata management method and management system
CN112698868B (en) Unified error code method applied to multiple systems and storage device
CN104572781A (en) Method and device for generating transaction log
CN112231407A (en) DDL synchronization method, device, equipment and medium of PostgreSQL database
CN113626514B (en) Automatic data loading method and device
CN103488549B (en) Roll-back processing system and roll-back processing method for multi-mirror-image data
CN107506339B (en) Character offset-based SCD node verification error positioning method and device
CN113918377B (en) Method, device and equipment for positioning C + + program crash and storage medium
CN115269548A (en) Method and system for generating data warehouse development model and related equipment
CN109614442A (en) Data synchronous data listing maintenance, device, storage medium and electronic equipment
CN110807037B (en) Data modification method and device, electronic equipment and storage medium
CN111241191A (en) Database synchronization method and device
CN110597828A (en) Database changing method, device, equipment and storage medium
CN110941586A (en) Engineering design data management method and system
CN112612805B (en) Method for indexing hbase data to query engine and related device
KR101148552B1 (en) System and method using information of modified document
CN111045917A (en) Method and device for converting format of test case
CN113297217B (en) Data transmission method, device and system
CN110704302B (en) Mapping relation establishing method and device, and system breakdown shunting method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant