CN113312357A - Data loading method, device, equipment and storage medium - Google Patents

Info

Publication number
CN113312357A
Authority
CN
China
Prior art keywords
data
operation plan
target operation
data file
plan
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110699549.4A
Other languages
Chinese (zh)
Inventor
杨晓宇
王波
肖波
宋磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Bank of China
Priority to CN202110699549.4A
Publication of CN113312357A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data loading method, device, equipment and storage medium. The method is applied to a data warehouse system that loads data into a data table according to data files, and comprises the following steps: acquiring a job plan list of the data table, wherein the job plan list comprises a plurality of job plans and each job plan corresponds to a data file; determining a target job plan among the plurality of job plans; if the data file corresponding to the target job plan does not exist, determining whether the waiting time of that data file is less than or equal to a preset time; and if the waiting time is less than or equal to the preset time, skipping the target job plan when the preset loading policy of the data table indicates that the target job plan can be skipped, and determining a new target job plan among the plurality of job plans. The method improves data loading efficiency.

Description

Data loading method, device, equipment and storage medium
Technical Field
The present application relates to computer technologies, and in particular, to a data loading method, apparatus, device, and storage medium.
Background
With the development of computer technology, data has become increasingly important to enterprises. The systems used by an enterprise can be divided into upstream systems and downstream systems according to the direction of data flow, and data is exchanged between the upstream systems and the downstream systems.
The upstream system is a system that generates data, and the downstream system is a system that uses the data. The upstream system exports data into a data file and sends the data file to the downstream system, and the downstream system loads data according to the data file. Generally, the upstream system sends the data file for a given day to the downstream system on the following day. Owing to many factors, the generation time of the data file is unstable, and the data file for a certain day may not be generated at all and therefore cannot be sent to the downstream system. If the downstream system does not receive the data file for that day, it keeps waiting for it.
However, while the downstream system keeps waiting for that data file, data files that arrive on later days cannot be loaded either, so data loading efficiency is low.
Disclosure of Invention
The application provides a data loading method, a data loading device, data loading equipment and a storage medium, which are used for solving the problem of low data loading efficiency.
In a first aspect, the present application provides a data loading method, which is applied to a data warehouse system, where the data warehouse system loads data into a data table according to a data file, and the method includes: acquiring a job plan list of the data table, wherein the job plan list comprises a plurality of job plans, and each job plan in the plurality of job plans corresponds to a data file; determining a target job plan among the plurality of job plans; if the data file corresponding to the target job plan does not exist, determining whether the waiting time of the data file corresponding to the target job plan is less than or equal to a preset time; and if the waiting time of the data file corresponding to the target job plan is less than or equal to the preset time, skipping the target job plan when the preset loading policy of the data table indicates that the target job plan can be skipped, and determining a new target job plan among the plurality of job plans.
In a second aspect, the present application provides a data loading apparatus, which is applied to a data warehouse system, where the data warehouse system loads data into a data table according to a data file, and the apparatus includes: an obtaining module, configured to obtain a job plan list of the data table, wherein the job plan list comprises a plurality of job plans, and each job plan in the plurality of job plans corresponds to a data file; and a determining module, configured to determine a target job plan among the plurality of job plans; the determining module is further configured to determine, if the data file corresponding to the target job plan does not exist, whether the waiting time of the data file corresponding to the target job plan is less than or equal to a preset time; and the determining module is further configured to, if the waiting time of the data file corresponding to the target job plan is less than or equal to the preset time, skip the target job plan when the preset loading policy of the data table indicates that the target job plan can be skipped, and determine a new target job plan among the plurality of job plans.
In a third aspect, the present application provides an electronic device, comprising: a processor and a memory; the memory is used for storing instructions executable by the processor; wherein the processor is configured to implement the method of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon computer-executable instructions for implementing the method of the first aspect when executed by a processor.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the method of the first aspect.
According to the data loading method, device, equipment and storage medium provided by the present application, when the data file corresponding to the target job plan does not exist, it is determined whether the waiting time of that data file is less than or equal to the preset time; when the waiting time is less than or equal to the preset time and the preset loading policy of the data table indicates that the target job plan can be skipped, the target job plan is skipped and a new target job plan is determined among the plurality of job plans. The tolerance is judged according to the preset time, and whether the target job plan can be skipped is determined according to the preset loading policy. A target job plan that can be skipped is skipped while it is within the tolerance range and other job plans are executed, so that subsequent data loading is not affected and data loading efficiency is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is an application scenario diagram provided in an embodiment of the present application;
Fig. 2 is a first flowchart of a data loading method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a skip loading policy provided in an embodiment of the present application;
Fig. 4 is a schematic diagram of a latest loading policy provided in an embodiment of the present application;
Fig. 5 is a schematic diagram of a sequential loading policy provided in an embodiment of the present application;
Fig. 6 is a second flowchart of a data loading method according to an embodiment of the present application;
Fig. 7 is a third flowchart of a data loading method according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a data loading apparatus according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
Before describing the embodiments, the terms used in the present application are explained:
Batch application: a computer application for processing a certain type of batch business; a batch application may process one or more batch job plans.
Batch job plan: a job plan for performing a specific task in a batch application; a batch job plan includes one or more batch nodes.
Batch node: a node in a batch job plan for executing a specific step, specifically for executing a branch or a transaction; the batch node is the unit with the smallest granularity in the batch application.
Fig. 1 is an application scenario diagram provided in the embodiment of the present application. As shown in Fig. 1, the application scenario includes a data source, data loading, a data warehouse engine and front-end tools.
The data source may be a business database, such as an Oracle database or a MySQL database; the data source may also be document data, such as STL files or TXT files; the data source may also be other data.
Data loading involves format conversion of data obtained from a data source and storage into a data warehouse.
The data warehouse engine comprises different servers, and the different servers acquire data from the data warehouse and provide different services for the front-end user, wherein the different services comprise data query service, data report service, data analysis service and various application programs.
The front-end tools include data query tools, data mining tools, report generation tools, and data warehouse-based application development tools.
Here the data source may be understood as an upstream system and the data warehouse may be understood as a downstream system.
The upstream system generates the data required by the downstream system as a data file and sends the data file to the downstream system, and the downstream system loads the data according to the data file once it is received. The upstream system is affected by various factors, and there may be cases where a data file cannot be sent to the downstream system at a certain time. For example, suppose the upstream system and the downstream system agree that the upstream system generates the data produced each day as a data file and sends it to the downstream system. On day 1 the upstream system sends the data file to the downstream system, but on day 2, owing to some factor, it does not. On day 3, even if the downstream system receives the data file of day 3, it does not load data from that file; instead it keeps waiting for the data file of day 2, and loads the data files of day 2 and day 3 only after the data file of day 2 is received. If the data file of day 2 never arrives, the loading of the data files of day 3 and later is blocked, which reduces data loading efficiency.
In view of the above technical problem, the inventors of the present application propose the following technical idea: for some data tables, failing to load the data file of one period does not affect the loading of the next data file. Data tables can therefore be divided into different types according to their characteristics, and a data loading policy can be set for each type. The data loading policy indicates whether, when a certain data file does not exist, that data file can be skipped and data can be loaded from the next data file. In this way, subsequent data files can still be loaded when a certain data file is missing, and data loading efficiency is improved.
Furthermore, a tolerance can be set for each data table, that is, the maximum delay by which a data file may arrive after its data file time. If the data file arrives late but within the tolerance range, it can still be loaded.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 2 is a first flowchart of a data loading method according to an embodiment of the present application. As shown in fig. 2, the data loading method includes the following steps:
S201, acquiring a job plan list of a data table, wherein the job plan list comprises a plurality of job plans, and each job plan in the plurality of job plans corresponds to a data file.
The method of this embodiment is executed by a data warehouse system. The data file includes data to be saved in the data warehouse, and may be the data source in the embodiment shown in Fig. 1.
For example, the upstream system generates the data of each day as a data file and sends the data file to the downstream system, and the date on which the upstream system generates the data is the date of the data file. The data warehouse system expects the data file at the receiving time, and loads data according to the data file once it is received.
Because the upstream system generally sends the data file to the data warehouse system after the end of the day, the data warehouse system receives the data file later than the data file time; that is, the data file received on day T + n is the data file of day T.
The data file time corresponding to each of the plurality of operation plans does not exceed the current time, and may be earlier than the current time or equal to the current time.
The following introduces the job plan list by a table:
Table 1 Job plan list
Job plan      Date of data file
Job plan 1    June 14, 2021
Job plan 2    June 15, 2021
Job plan 3    June 16, 2021
In this embodiment, the plurality of job plans may be understood as batch job plans, and the plurality of job plans may be batch-processed using a batch application.
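As an illustration only, the job plan list of Table 1 could be held in memory as follows. The class and field names (JobPlan, PlanState, data_file_date) are hypothetical and do not come from the application; the three states mirror the unfinished, finished and timed-out states described later in this description.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class PlanState(Enum):
    """Hypothetical state labels for a job plan."""
    UNFINISHED = "unfinished"
    FINISHED = "finished"
    TIMED_OUT = "timed_out"


@dataclass
class JobPlan:
    """One row of the job plan list: a plan and the date of its data file."""
    name: str
    data_file_date: date
    state: PlanState = PlanState.UNFINISHED


# The three rows of Table 1.
job_plan_list = [
    JobPlan("Job plan 1", date(2021, 6, 14)),
    JobPlan("Job plan 2", date(2021, 6, 15)),
    JobPlan("Job plan 3", date(2021, 6, 16)),
]
```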
S202, determining a target job plan among the plurality of job plans.
In this embodiment, the data table is configured with a data loading order. The data loading order is either a first type loading order or a second type loading order. The first type loading order indicates that data is loaded in order of data file time from earliest to latest. The second type loading order indicates that data is loaded according to the most recently received data file.
In this step, the target job plan is determined among the plurality of job plans according to the data loading order configured for the data table. Specifically, if the data loading order of the data table is the first type loading order, the job plan corresponding to the earliest receiving time among the plurality of job plans is taken as the target job plan. If the data loading order of the data table is the second type loading order, the job plan corresponding to the latest receiving time among the plurality of job plans is taken as the target job plan.
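The selection in S202 can be sketched as follows. This is a minimal sketch, not the applicant's implementation: the names LoadOrder and pick_target are hypothetical, and each job plan is represented only by its data file date, on the assumption that ordering by data file date and ordering by receiving time coincide (the receiving time lags the data file time by a fixed offset).

```python
from datetime import date
from enum import Enum


class LoadOrder(Enum):
    EARLIEST_FIRST = 1  # first type loading order
    LATEST_FIRST = 2    # second type loading order


def pick_target(data_file_dates, order):
    """Return the data file date of the target job plan, or None if there is none."""
    if not data_file_dates:
        return None
    if order is LoadOrder.EARLIEST_FIRST:
        return min(data_file_dates)
    return max(data_file_dates)


pending = [date(2021, 6, 14), date(2021, 6, 15), date(2021, 6, 16)]
first_type_target = pick_target(pending, LoadOrder.EARLIEST_FIRST)
second_type_target = pick_target(pending, LoadOrder.LATEST_FIRST)
```

With the dates of Table 1, the first type order selects the June 14 plan and the second type order selects the June 16 plan.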
S203, determining whether the data file corresponding to the target operation plan exists.
Specifically, it is determined whether the downstream system has received the data file corresponding to the target job plan from the upstream system. For example, for the job plan whose data file date is June 14, 2021, this step determines whether the data file dated June 14, 2021 has been received, that is, whether the upstream system has sent the data file of June 14, 2021 to the downstream system.
S204, if the data file corresponding to the target operation plan does not exist, determining whether the waiting time of the data file corresponding to the target operation plan is less than or equal to the preset time.
The waiting time of the data file corresponding to the target job plan is the difference between the current time and the receiving time of that data file. The receiving time is the time, preset by the downstream system, at which the data file is first expected to be received. The receiving time is associated with the data file time; for example, if the upstream system generates the data of the current day as a data file and sends it to the data warehouse system on the next day, the next day is the receiving time.
In this embodiment, the preset time is the maximum delay allowed between the data file time and the arrival of the data file.
Assume that the tolerance set by the user is R days, where R is an integer greater than or equal to 0, that n is the time difference in days between the receiving time of the data file at the downstream system and the data file time, that the current time is CDATE, and that the data file time of the target job plan is RDATE. Whether the data file corresponding to the target job plan has exceeded the tolerance set by the user can then be judged from the data file time, the current time and the preset time, specifically: if CDATE - RDATE > R + n, the tolerance is exceeded; if CDATE - RDATE <= R + n, the data file is within the tolerance range.
The preset time is determined from the tolerance set by the user and the time difference between the receiving time of the data file at the downstream system and the data file time. Specifically, preset time = tolerance set by the user + time difference between the receiving time of the data file at the downstream system and the data file time.
When the receiving time of a data file arrives, there are two possible results at the downstream system: the data file has been received, or the data file has not been received.
In some optional embodiments, the tolerance may be determined according to a data service corresponding to the data table, for example, if the data table provides a weekly report service, the tolerance of the data table may be set to 7 days.
Similarly, if the data table provides a monthly report service, the tolerance of the data table may be set to 30 days.
Here n is the time difference between the receiving time of the data file at the downstream system and the data file time. Because the downstream system receives upstream data later than the data file time, that is, the data file received on day T + n contains the data of day T, n is a positive integer greater than or equal to 1. For example, if the upstream system processes the data at the end of each day and sends the data file to the downstream system on the next day, n takes the value 1.
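The tolerance check described above (comparing CDATE - RDATE with the tolerance plus n) can be written as a small function. This is a sketch under stated assumptions: the function name within_tolerance and its parameter names are hypothetical, and all times are reduced to whole calendar days.

```python
from datetime import date


def within_tolerance(current: date, data_file_date: date,
                     tolerance_days: int, n_days: int = 1) -> bool:
    """True if CDATE - RDATE <= tolerance + n, i.e. the data file is not yet overdue.

    current        -- CDATE, the current date
    data_file_date -- RDATE, the data file date of the target job plan
    tolerance_days -- the user-set tolerance in days (an integer >= 0)
    n_days         -- n, the lag in days between data file date and receiving date
    """
    elapsed_days = (current - data_file_date).days
    return elapsed_days <= tolerance_days + n_days


# A weekly-report table with a 7-day tolerance and next-day delivery (n = 1):
still_waiting = within_tolerance(date(2021, 6, 22), date(2021, 6, 14), 7, 1)
overdue = not within_tolerance(date(2021, 6, 23), date(2021, 6, 14), 7, 1)
```

In this usage the file dated June 14 is still within tolerance on June 22 (8 days elapsed, limit 7 + 1) and overdue on June 23.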
S205, if the waiting time of the data file corresponding to the target job plan is less than or equal to the preset time, skipping the target job plan when the preset loading policy of the data table indicates that the target job plan can be skipped, and determining a new target job plan among the plurality of job plans.
In this embodiment, the data table is further configured with a preset loading policy, which indicates whether, when the data file corresponding to a job plan does not exist, that job plan can be skipped and another job plan executed.
Still referring to Table 1, suppose the current time is June 16, 2021 and the preset time is 3 days, and the job plan corresponding to June 14, 2021 is determined as the target job plan. If the data file of June 14, 2021 does not exist, it is determined whether the waiting time for that data file is less than or equal to 3 days. If so, and the preset loading policy of the data table indicates that the target job plan can be skipped, the job plan corresponding to June 14, 2021 is skipped, the job plan corresponding to June 15, 2021 is taken as the new target job plan, and the process returns to S202.
S206, if the waiting time of the data file corresponding to the target job plan is greater than the preset time, recording the state of the target job plan as timed out, and sending alarm information.
S207, if the data file corresponding to the target job plan exists, loading data according to the data file corresponding to the target job plan, and recording the state of the target job plan as finished.
Specifically, the data in the data file corresponding to the target job plan is imported into the data table.
In this embodiment, the data table is provided with data loading types, and the data loading types include full loading, incremental loading, and the like.
For the full load, the data table is refreshed, i.e. rewritten, according to the data in the data file corresponding to the target operation plan.
For incremental loading, the changed data is added to the data table according to the data in the data file corresponding to the target operation plan.
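The difference between the two loading types can be illustrated with a toy in-memory table. This is only a sketch: the function name load_into_table is hypothetical, the data table is modelled as a dict keyed by primary key, and incremental loading is assumed to behave as an upsert, which the description does not specify.

```python
def load_into_table(table: dict, file_rows: dict, load_type: str) -> dict:
    """Apply the rows of one data file to an in-memory table.

    "full"        -- the table is refreshed: old rows are discarded first.
    "incremental" -- changed or new rows are merged into the existing rows.
    """
    if load_type == "full":
        table.clear()
    elif load_type != "incremental":
        raise ValueError(f"unknown load type: {load_type}")
    table.update(file_rows)
    return table


accounts = {"a": 100, "b": 200}
load_into_table(accounts, {"b": 250, "c": 300}, "incremental")
after_incremental = dict(accounts)
load_into_table(accounts, {"x": 1}, "full")
after_full = dict(accounts)
```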
In this embodiment, when the data file corresponding to the target job plan does not exist, it is determined whether the waiting time of that data file is less than or equal to the preset time; when the waiting time is less than or equal to the preset time and the preset loading policy of the data table indicates that the target job plan can be skipped, the target job plan is skipped and a new target job plan is determined among the plurality of job plans. The tolerance is judged according to the preset time, and whether the target job plan can be skipped is determined according to the preset loading policy. A target job plan that can be skipped is skipped while it is within the tolerance range and other job plans are executed, so that subsequent data loading is not affected and data loading efficiency is improved.
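The branching of S203 to S207 for a single target job plan can be summarised in a short sketch. The function name decide_action and the returned labels are hypothetical. The elapsed time is measured in whole days from the data file date and compared with the preset time (tolerance plus n), which is an assumption made to keep the example small. The "wait" outcome corresponds to a plan that is within tolerance but may not be skipped.

```python
from datetime import date


def decide_action(target_date, received_dates, today, preset_days, can_skip):
    """Return the action for one target job plan, following S203 to S207."""
    if target_date in received_dates:
        return "load"           # S207: file exists, load it and mark the plan finished
    elapsed_days = (today - target_date).days
    if elapsed_days > preset_days:
        return "timeout_alarm"  # S206: record the plan as timed out and raise an alarm
    if can_skip:
        return "skip"           # S205: skip and pick a new target job plan
    return "wait"               # within tolerance but not skippable: leave unfinished


# The worked example above: today is June 16, the preset time is 3 days,
# and the June 14 file is missing.
action = decide_action(date(2021, 6, 14), {date(2021, 6, 15)},
                       date(2021, 6, 16), 3, can_skip=True)
```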
On the basis of the above-described embodiment, the preset loading policy may be divided into a skip loading policy, a latest loading policy, and a sequential loading policy. Both the skip loading policy and the latest loading policy indicate that, when a job plan is not completed, that job plan can be skipped and other job plans can be executed; the sequential loading policy indicates that the job plan cannot be skipped.
The skip loading policy indicates that, when the current job plan is not completed, the current job plan can be skipped and other job plans can be executed.
Illustratively, two data files that are adjacent in data file time are referred to as a first data file and a second data file, respectively, wherein the data file time of the first data file is earlier than the data file time of the second data file.
The skip load policy is applied to a case where the result of loading the first data file does not have an influence on the second data file, and in this case, even if the data in the first data file is not loaded, the data in the second data file can be loaded.
Fig. 3 is a schematic diagram of a skip loading strategy according to an embodiment of the present application.
As shown in Fig. 3, and referring to Table 1, the job plan whose data file time is June 14, 2021 is taken as the target job plan, and it is determined whether the data file of June 14, 2021 exists. If it does not exist, the job plan of June 15, 2021 is taken as the new target job plan; if the data file of June 15, 2021 exists, data can be loaded from it.
In an embodiment of the skip loading policy, determining a new target job plan among the plurality of job plans comprises: determining the job plan corresponding to the next-earliest receiving time among the plurality of job plans as the new target job plan.
Under the latest loading policy, if the data file with the latest data file time does not exist and the waiting time of the data file corresponding to the target job plan is less than or equal to the preset time, the job plan corresponding to the next most recent data file time is taken as the new target job plan.
Fig. 4 is a schematic diagram of a latest loading policy provided in an embodiment of the present application.
Referring to Table 1, as shown in Fig. 4, the latest loading policy means loading data from the received data file with the most recent data file time. Specifically, the job plans are executed in order from the bottom of Table 1 to the top, that is: the job plan whose data file time is June 16, 2021 is taken as the target job plan, and it is determined whether the data file of June 16, 2021 exists. If it does not exist, the job plan of June 15, 2021 is taken as the new target job plan; if the data file of June 15, 2021 exists, data can be loaded from it.
In an embodiment of the latest loading policy, determining a new target job plan among the plurality of job plans comprises: determining the job plan corresponding to the next most recent receiving time among the plurality of job plans as the new target job plan.
The sequential loading policy indicates the first type of loading order. If the data file corresponding to the target job plan does not exist and the waiting time of that data file is less than or equal to the preset time, the execution state of the job plan is recorded as unexecuted and the process ends.
Two data files adjacent in data file time are respectively called a first data file and a second data file, wherein the data file time of the first data file is earlier than that of the second data file.
The sequential loading is suitable for a situation that a loading result of the first data file has an influence on the second data file, and in this situation, the data in the second data file can be loaded only after the data in the first data file is loaded.
Fig. 5 is a schematic diagram of a sequential loading strategy provided in an embodiment of the present application.
As shown in Fig. 5, and referring to Table 1, sequential loading means that the data file of June 15, 2021 is loaded only after the data file of June 14, 2021 has been loaded. If the data file of June 14, 2021 does not exist, the data file of June 15, 2021 cannot be loaded even if it has already been received.
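The three policies can be compared side by side on the job plans of Table 1. The sketch below is illustrative: the function name choose_file_to_load and the policy labels "skip", "latest" and "sequential" are hypothetical, and every missing file is assumed to be still within tolerance, so that timeout handling can be left out.

```python
from datetime import date


def choose_file_to_load(plan_dates, received, policy):
    """Return the data file date that would be loaded next, or None if blocked."""
    # Skip and sequential policies walk from earliest to latest;
    # the latest policy walks from latest to earliest.
    candidates = sorted(plan_dates, reverse=(policy == "latest"))
    for file_date in candidates:
        if file_date in received:
            return file_date
        if policy == "sequential":
            return None  # a missing earlier file blocks all later files
    return None


plans = [date(2021, 6, 14), date(2021, 6, 15), date(2021, 6, 16)]
arrived = {date(2021, 6, 15), date(2021, 6, 16)}  # the June 14 file is missing

skip_choice = choose_file_to_load(plans, arrived, "skip")
latest_choice = choose_file_to_load(plans, arrived, "latest")
sequential_choice = choose_file_to_load(plans, arrived, "sequential")
```

With the June 14 file missing, the skip policy moves on to June 15, the latest policy goes straight to June 16, and the sequential policy loads nothing.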
On the basis of the above embodiment, the preset loading policy of the data table may also be determined according to the type of the data table. Fig. 6 is a second flowchart of a data loading method according to an embodiment of the present application. As shown in fig. 6, the method includes:
S601, determining the type of the data table, wherein the type of the data table is either a master file type or a detail type.
In some examples, data tables of the master file type include accounting tables, and data tables of the detail type include transaction journals.
S602, if the type of the data table is the master file type, determining that the preset loading policy of the data table is that the target job plan cannot be skipped.
S603, if the type of the data table is the detail type, determining that the preset loading policy of the data table is that the target job plan can be skipped.
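Steps S601 to S603 amount to a lookup from table type to skip permission. A minimal sketch, with hypothetical names (TableType, can_skip_target):

```python
from enum import Enum


class TableType(Enum):
    MASTER_FILE = "master_file"  # e.g. accounting tables
    DETAIL = "detail"            # e.g. transaction journals


# S602 / S603: whether a missing data file may be skipped, by table type.
SKIP_ALLOWED = {
    TableType.MASTER_FILE: False,
    TableType.DETAIL: True,
}


def can_skip_target(table_type: TableType) -> bool:
    """Return True if the preset loading policy lets the target job plan be skipped."""
    return SKIP_ALLOWED[table_type]
```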
On the basis of the above embodiment, the job plan list includes the job plan at the current time and the job plans that were not completed before the current time. The job plan list needs to be generated before the above steps are performed. Fig. 7 is a flowchart of a data loading method provided in an embodiment of the present application. As shown in Fig. 7, generating the job plan list includes:
S701, determining whether the current time has reached the receiving time of the data file of the data table.
In this embodiment, whether a job plan needs to be generated is determined according to the receiving time of the data file of the data table.
S702, if the current time has reached the receiving time of the data file of the data table, generating the job plan of the current time and adding the job plan to a pre-established job plan list.
S703, if the current time has not reached the receiving time of the data file of the data table, not generating the job plan of the current time.
After the job plan at the current time is generated, it is processed according to the steps of the embodiment shown in Fig. 2, and the state of the job plan is recorded according to the processing result. Specifically, if the job plan at the current time is not completed, its state is recorded as incomplete; if the job plan at the current time is completed, it is deleted from the job plan list, or its state in the job plan list is recorded as completed; and if the job plan at the current time has timed out, its state in the job plan list is recorded as timed out and alarm information is sent.
It should be understood that the embodiment shown in Fig. 2 processes the job plans in the job plan list whose state is incomplete.
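The state recording described above can be sketched as follows. This is an illustrative assumption rather than the embodiment itself: the names `PlanState`, `record_result`, and `pending_plans` are hypothetical, alarms are collected in a plain list instead of being sent, and completed plans are marked as completed rather than deleted (the text allows either).

```python
from enum import Enum


class PlanState(Enum):
    INCOMPLETE = "incomplete"
    COMPLETED = "completed"
    TIMED_OUT = "timed_out"


def record_result(states: dict[str, PlanState], plan_id: str,
                  result: PlanState, alarms: list[str]) -> None:
    """Record the processing result of one job plan in the job plan list."""
    states[plan_id] = result
    if result is PlanState.TIMED_OUT:
        # Stand-in for sending alarm information.
        alarms.append(f"job plan {plan_id} timed out")


def pending_plans(states: dict[str, PlanState]) -> list[str]:
    """Return the plans that still need processing (state is incomplete)."""
    return [pid for pid, state in states.items() if state is PlanState.INCOMPLETE]
```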
For example, if the downstream system needs to receive the data file of data table A every day, a job plan needs to be generated every day. If the downstream system receives the data file of data table A every 2 days, a job plan is generated every 2 days, and no job plan is generated on the days in between.
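The generation rule of S701 to S703, together with the every-2-days example, can be sketched as below. The function `maybe_generate_plan`, the `first_receive` date, and the `interval_days` parameter are assumptions introduced for illustration; the source does not specify how the receiving schedule is stored.

```python
from datetime import date


def maybe_generate_plan(plan_list: list[date], today: date,
                        first_receive: date, interval_days: int) -> bool:
    """Add a job plan for `today` only if a data file is due on that day.

    Assumption: files are due on `first_receive` and every `interval_days`
    days after it. Returns True if `today` is a receiving day.
    """
    elapsed = (today - first_receive).days
    if elapsed < 0 or elapsed % interval_days != 0:
        return False  # S703: receiving time not reached, no plan generated
    if today not in plan_list:
        plan_list.append(today)  # S702: add the plan to the existing list
    return True
```

With `interval_days=2` and a first receiving date of June 14, plans are generated for June 14 and June 16 but not for June 15.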
Based on the above method embodiments, Fig. 8 is a schematic structural diagram of a data loading device according to an embodiment of the present application. As shown in Fig. 8, the data loading device includes an acquisition module 801 and a determining module 802. The acquisition module 801 is configured to acquire a job plan list of a data table, where the job plan list includes a plurality of job plans, and each job plan of the plurality of job plans corresponds to a data file. The determining module 802 is configured to determine a target job plan among the plurality of job plans. The determining module 802 is further configured to determine, if the data file corresponding to the target job plan does not exist, whether the waiting time of the data file corresponding to the target job plan is less than or equal to a preset time. The determining module 802 is further configured to, if the waiting time of the data file corresponding to the target job plan is less than or equal to the preset time and the preset loading policy of the data table indicates that the target job plan can be skipped, skip the target job plan and determine a new target job plan among the plurality of job plans.
In some possible designs, the apparatus further comprises a recording module 803, configured to, if the waiting time of the data file of the target job plan is less than or equal to the preset time and the preset loading policy of the data table indicates that the target job plan cannot be skipped, record the state of the target job plan as incomplete and end the processing.
In some possible designs, the determining module 802 is further configured to: determine the type of the data table, where the type of the data table includes a master file type and a detail type; if the type of the data table is the master file type, determine that the preset loading policy of the data table is that the target job plan cannot be skipped; and if the type of the data table is the detail type, determine that the preset loading policy of the data table is that the target job plan can be skipped.
In some possible designs, the data file of the target job plan corresponds to a receiving time, where the receiving time is the time at which the data warehouse system receives the data file for the first time, and the determining module 802 is specifically configured to: if the preset loading policy of the data table indicates that data is loaded in order of data file time from earliest to latest, determine the job plan corresponding to the earliest receiving time among the plurality of job plans as the target job plan; and determine the job plan corresponding to the next earliest receiving time among the plurality of job plans as the new target job plan.
In some possible designs, the data file of the target job plan corresponds to a receiving time, where the receiving time is the time at which the data warehouse system receives the data file for the first time, and the determining module 802 is specifically configured to: if the preset loading policy of the data table indicates that data is loaded according to the most recently received data file, determine the job plan corresponding to the latest receiving time among the plurality of job plans as the target job plan; and determine the job plan corresponding to the next latest receiving time among the plurality of job plans as the new target job plan.
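The two selection orders described above (earliest first and latest first) can be illustrated with a short sketch. The function name `order_candidates` and the dictionary of plan identifiers to receiving times are hypothetical; the first element of the result plays the role of the target job plan, and the second is the new target job plan if the first is skipped.

```python
from datetime import datetime


def order_candidates(receive_times: dict[str, datetime], latest_first: bool) -> list[str]:
    """Order job plan identifiers by the receiving time of their data files.

    latest_first=False: load from earliest to latest data file time.
    latest_first=True:  load according to the most recently received file.
    """
    return sorted(receive_times,
                  key=lambda plan_id: receive_times[plan_id],
                  reverse=latest_first)
```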
In some possible designs, the apparatus further includes a sending module 804; the recording module 803 is further configured to record the state of the target job plan as timed out if the waiting time of the data file corresponding to the target job plan is greater than the preset time; and the sending module 804 is configured to send alarm information if the waiting time of the data file corresponding to the target job plan is greater than the preset time.
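Taken together, the behaviour of the determining, recording, and sending modules can be sketched as one decision function. This is a simplified illustration under stated assumptions: the names `JobPlan`, `Outcome`, `decide`, and `run` are hypothetical, waiting time is given in seconds, and the driver stops at the first outcome other than a skip, which the text above does not specify.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    LOAD = "load"              # data file exists and can be loaded
    SKIP = "skip"              # file missing, still waiting, policy allows skipping
    INCOMPLETE = "incomplete"  # file missing, still waiting, policy forbids skipping
    TIMED_OUT = "timed_out"    # file missing and waiting time exceeded


@dataclass
class JobPlan:
    plan_id: str
    file_exists: bool
    waited_seconds: float


def decide(plan: JobPlan, max_wait_seconds: float, can_skip: bool) -> Outcome:
    """Decide what to do with one target job plan."""
    if plan.file_exists:
        return Outcome.LOAD
    if plan.waited_seconds > max_wait_seconds:
        return Outcome.TIMED_OUT  # state recorded as timed out, alarm sent
    return Outcome.SKIP if can_skip else Outcome.INCOMPLETE


def run(plans: list[JobPlan], max_wait_seconds: float,
        can_skip: bool) -> list[tuple[str, Outcome]]:
    """Walk plans in policy order, moving to a new target only after a skip."""
    trace = []
    for plan in plans:
        outcome = decide(plan, max_wait_seconds, can_skip)
        trace.append((plan.plan_id, outcome))
        if outcome is not Outcome.SKIP:
            break  # assumption: processing ends at the first non-skip outcome
    return trace
```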
The data loading device provided in the embodiment of the present application can be used for implementing the technical scheme of the data loading method in the above embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
It should be noted that the division of the above apparatus into modules is only a logical division; in an actual implementation, the modules may be wholly or partially integrated into one physical entity, or may be physically separate. The modules may all be implemented as software invoked by a processing element, may all be implemented in hardware, or some may be implemented as software invoked by a processing element while others are implemented in hardware. For example, the determining module 802 may be a separately arranged processing element, may be integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code that a processing element of the apparatus calls to perform the function of the determining module 802. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element here may be an integrated circuit with signal processing capability. In implementation, each step of the above method, or each of the above modules, may be completed by an integrated logic circuit of hardware in the processor element or by instructions in the form of software.
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device may be the data warehouse system in the above embodiment, and as shown in fig. 9, the electronic device may include: a processor 901, a memory 902, and a transceiver 903.
The processor 901 executes computer-executable instructions stored in the memory 902, so that the processor 901 performs the schemes in the above embodiments. The processor 901 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
A memory 902 is coupled to the processor 901 via the system bus and communicates with the processor, the memory 902 storing computer program instructions.
The system bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus. The transceiver is used to enable communication between the database access device and other computers (e.g., clients, read-write databases, and read-only databases). The memory may include random access memory (RAM) and may also include non-volatile memory.
The electronic device provided in the embodiment of the present application may be used to implement the technical solution of the data loading method in the above embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, where a computer instruction is stored in the computer-readable storage medium, and when the computer instruction runs on a computer, the computer is enabled to execute the technical solution of the data loading method in the foregoing embodiment.
The embodiment of the present application further provides a computer program product, where the computer program product includes a computer program, and the computer program is stored in a computer-readable storage medium, where the computer program can be read by at least one processor from the computer-readable storage medium, and when the computer program is executed by the at least one processor, the technical solution of the data loading method in the foregoing embodiments can be implemented.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (15)

1. A data loading method, applied to a data warehouse system that loads data into a data table according to data files, the method comprising:
acquiring a job plan list of a data table, wherein the job plan list comprises a plurality of job plans, and each job plan of the plurality of job plans corresponds to a data file;
determining a target job plan among the plurality of job plans;
if the data file corresponding to the target job plan does not exist, determining whether the waiting time of the data file corresponding to the target job plan is less than or equal to a preset time;
and if the waiting time of the data file corresponding to the target job plan is less than or equal to the preset time and a preset loading policy of the data table indicates that the target job plan can be skipped, skipping the target job plan and determining a new target job plan among the plurality of job plans.
2. The method of claim 1, further comprising:
if the waiting time of the data file of the target job plan is less than or equal to the preset time and the preset loading policy of the data table indicates that the target job plan cannot be skipped, recording the state of the target job plan as incomplete and ending.
3. The method of claim 2, further comprising:
determining the type of the data table, wherein the type of the data table comprises a master file type and a detail type;
if the type of the data table is the master file type, determining that the preset loading policy of the data table is that the target job plan cannot be skipped;
and if the type of the data table is the detail type, determining that the preset loading policy of the data table is that the target job plan can be skipped.
4. The method according to any one of claims 1-3, wherein the data file of the target job plan corresponds to a receiving time, the receiving time being the time at which the data warehouse system first receives the data file, and the determining a target job plan among the plurality of job plans comprises:
if the preset loading policy of the data table indicates that data is loaded in order of data file time from earliest to latest, determining the job plan corresponding to the earliest receiving time among the plurality of job plans as the target job plan;
correspondingly, the determining a new target job plan among the plurality of job plans comprises:
determining the job plan corresponding to the next earliest receiving time among the plurality of job plans as the new target job plan.
5. The method according to any one of claims 1-3, wherein the data file of the target job plan corresponds to a receiving time, the receiving time being the time at which the data warehouse system first receives the data file, and the determining a target job plan among the plurality of job plans comprises:
if the preset loading policy of the data table indicates that data is loaded according to the most recently received data file, determining the job plan corresponding to the latest receiving time among the plurality of job plans as the target job plan;
correspondingly, the determining a new target job plan among the plurality of job plans comprises:
determining the job plan corresponding to the next latest receiving time among the plurality of job plans as the new target job plan.
6. The method according to any one of claims 1-3, further comprising:
if the waiting time of the data file corresponding to the target job plan is greater than the preset time, recording the state of the target job plan as timed out, and sending alarm information.
7. A data loading apparatus, applied to a data warehouse system that loads data into a data table according to data files, the apparatus comprising:
an acquisition module, configured to acquire a job plan list of a data table, wherein the job plan list comprises a plurality of job plans, and each job plan of the plurality of job plans corresponds to a data file;
a determining module, configured to determine a target job plan among the plurality of job plans;
the determining module is further configured to determine, if the data file corresponding to the target job plan does not exist, whether the waiting time of the data file corresponding to the target job plan is less than or equal to a preset time;
and the determining module is further configured to, if the waiting time of the data file corresponding to the target job plan is less than or equal to the preset time and a preset loading policy of the data table indicates that the target job plan can be skipped, skip the target job plan and determine a new target job plan among the plurality of job plans.
8. The apparatus of claim 7, further comprising:
a recording module, configured to, if the waiting time of the data file of the target job plan is less than or equal to the preset time and the preset loading policy of the data table indicates that the target job plan cannot be skipped, record the state of the target job plan as incomplete and end.
9. The apparatus of claim 8, wherein the determining module is further configured to:
determine the type of the data table, wherein the type of the data table comprises a master file type and a detail type;
if the type of the data table is the master file type, determine that the preset loading policy of the data table is that the target job plan cannot be skipped;
and if the type of the data table is the detail type, determine that the preset loading policy of the data table is that the target job plan can be skipped.
10. The apparatus according to any one of claims 7 to 9, wherein the data file of the target job plan corresponds to a receiving time, the receiving time being the time at which the data warehouse system first receives the data file, and the determining module is specifically configured to:
if the preset loading policy of the data table indicates that data is loaded in order of data file time from earliest to latest, determine the job plan corresponding to the earliest receiving time among the plurality of job plans as the target job plan;
and determine the job plan corresponding to the next earliest receiving time among the plurality of job plans as the new target job plan.
11. The apparatus according to any one of claims 7 to 9, wherein the data file of the target job plan corresponds to a receiving time, the receiving time being the time at which the data warehouse system first receives the data file, and the determining module is specifically configured to:
if the preset loading policy of the data table indicates that data is loaded according to the most recently received data file, determine the job plan corresponding to the latest receiving time among the plurality of job plans as the target job plan;
and determine the job plan corresponding to the next latest receiving time among the plurality of job plans as the new target job plan.
12. The apparatus of claim 8 or 9, further comprising a sending module;
the recording module is further configured to record the state of the target job plan as timed out if the waiting time of the data file corresponding to the target job plan is greater than the preset time;
and the sending module is configured to send alarm information if the waiting time of the data file corresponding to the target job plan is greater than the preset time.
13. An electronic device, comprising: a memory and a processor;
the memory is configured to store instructions executable by the processor;
wherein the processor is configured to implement the method of any one of claims 1-6.
14. A computer-readable storage medium having computer-executable instructions stored therein, which when executed by a processor, are configured to implement the method of any one of claims 1-6.
15. A computer program product, characterized in that it comprises a computer program which, when being executed by a processor, carries out the method of any one of claims 1-6.
CN202110699549.4A 2021-06-23 2021-06-23 Data loading method, device, equipment and storage medium Pending CN113312357A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110699549.4A CN113312357A (en) 2021-06-23 2021-06-23 Data loading method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113312357A true CN113312357A (en) 2021-08-27

Family

ID=77380265

Country Status (1)

Country Link
CN (1) CN113312357A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102411599A (en) * 2011-08-01 2012-04-11 中国民生银行股份有限公司 Method for processing abnormal behaviors in data base and monitoring server
CN102981904A (en) * 2011-09-02 2013-03-20 阿里巴巴集团控股有限公司 Task scheduling method and system
US20130166338A1 (en) * 2011-12-22 2013-06-27 Ralf Philipp Enhanced business planning and operations management system
CN104794124A (en) * 2014-01-20 2015-07-22 中国移动通信集团重庆有限公司 Intelligent implementation method and system for data missing supplementation
CN108694564A (en) * 2018-06-07 2018-10-23 阿里巴巴集团控股有限公司 A kind of task status control method and device
CN109960708A (en) * 2019-03-22 2019-07-02 蔷薇智慧科技有限公司 Data processing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination