CN112214454B - Data processing method, ETL system, server and storage medium - Google Patents

Publication number
CN112214454B
Authority
CN
China
Prior art keywords
data
target data
etl system
operation information
converted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011384503.5A
Other languages
Chinese (zh)
Other versions
CN112214454A (en
Inventor
黄胜
叶康
刘忠
张玉林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha Rootcloud Technology Co ltd
Rootcloud Technology Co Ltd
Original Assignee
Changsha Rootcloud Technology Co ltd
Application filed by Changsha Rootcloud Technology Co ltd
Priority to CN202011384503.5A
Publication of CN112214454A
Application granted
Publication of CN112214454B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/128 Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion

Abstract

The application provides a data processing method, an ETL system, a server and a storage medium, and relates to the technical field of data processing. For a plurality of target data extracted from a source device, the ETL system creates and stores data operation information corresponding to each target data, converts each target data, and loads each converted target data to a destination device. For each converted target data that is successfully loaded to the destination device, the ETL system then deletes the corresponding data operation information. When the ETL system is restarted, it can therefore re-integrate and load only the target data corresponding to the data operation information that remains stored, without re-integrating and loading all data of the batch. No part of the batch is discarded either, so data loss at the destination device is avoided and data processing accuracy is improved.

Description

Data processing method, ETL system, server and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data processing method, an ETL system, a server, and a storage medium.
Background
ETL (Extract-Transform-Load) technology extracts data of a service system from a source device, converts it, and loads it to a destination device, thereby integrating scattered, disordered, and non-uniform data.
When an ETL system runs unstably, it can be restarted and continue processing from where the last run left off, so that data is integrated continuously.
However, when the ETL system is restarted, some data may be processed repeatedly or part of the data may be lost, which lowers data processing accuracy.
Disclosure of Invention
The application aims to provide a data processing method, an ETL system, a server and a storage medium that avoid reloading data that has already been loaded successfully, do not discard data, and improve the accuracy of data processing.
To achieve this purpose, the technical solution adopted by the application is as follows:
In a first aspect, the present application provides a data processing method, which is applied to a server running an ETL system, where the ETL system is used to load data from a source device to a destination device; the method comprises the following steps:
for a plurality of target data extracted from the source device, creating and storing data operation information corresponding to each target data;
converting each target data, and loading each converted target data to the destination device;
and, for the converted target data that is successfully loaded to the destination device, deleting the data operation information corresponding to the target data.
In a second aspect, the present application provides an ETL system for loading data from a source device to a destination device; wherein:
the ETL system is specifically configured to create and store, for a plurality of target data extracted from the source device, data operation information corresponding to each of the target data;
the ETL system is further configured to convert each target data and load each converted target data to the destination device;
the ETL system is further configured to, for the converted target data that is successfully loaded to the destination device, delete the data operation information corresponding to the target data.
In a third aspect, the present application provides a server comprising a memory for storing one or more programs; a processor; the one or more programs, when executed by the processor, implement the data processing method described above.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the data processing method described above.
According to the data processing method, the ETL system, the server and the storage medium provided by the application, for a plurality of target data extracted from a source device, the ETL system creates and stores data operation information corresponding to each target data, converts each target data, and loads each converted target data to the destination device. For each converted target data that is successfully loaded to the destination device, the ETL system then deletes the data operation information corresponding to that target data. When the ETL system is restarted, it can therefore re-integrate and load only the target data corresponding to the data operation information that remains stored, without re-integrating and loading all data of the batch, so that successfully loaded data is not reloaded. No part of the batch is discarded either, so data loss at the destination device is avoided and data processing accuracy is improved.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly explain the technical solutions of the present application, the drawings needed for the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also derive other related drawings from these drawings without inventive effort.
Fig. 1 shows a schematic architecture diagram of an ETL system provided in the present application.
Fig. 2 shows a schematic block diagram of a server provided in the present application.
Fig. 3 shows a schematic flow chart of a data processing method provided by the present application.
Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the technical solutions in the present application will be clearly and completely described below with reference to the accompanying drawings in some embodiments of the present application, and it is obvious that the described embodiments are some, but not all embodiments of the present application. The components of the present application, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on a part of the embodiments in the present application without any creative effort belong to the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
In a scenario where the ETL system integrates and loads data from a source device into a destination device, the ETL system may, for example, process data batch by batch, integrating the multiple data of one batch together.
Because the ETL process may crash, or the server running the ETL system may go down, the ETL system may be restarted and then needs to obtain data from the source device again and integrate and load it to the destination device.
To avoid re-integrating and loading all data from the source device to the destination device, the ETL system may record last run metadata while extracting data from the source device; when the ETL system is restarted, it may re-integrate and load the data of the batch to the destination device according to the recorded last run metadata.
However, the ETL system may be restarted while a batch is still being integrated. If, after the restart, the batch is reloaded based on the recorded last run metadata, part of the data of the batch may be loaded a second time; if loading instead resumes from the next batch, part of the data of the current batch may never be processed, so that the destination device loses part of the data and data processing accuracy is low.
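The two failure modes described above can be illustrated with a small, purely hypothetical example: a batch of five records and a crash after three of them have been loaded. The record names and the list-based "destination" are assumptions made only for this sketch.

```python
# Hypothetical batch of five records; the crash happens after three loads.
batch = ["r0", "r1", "r2", "r3", "r4"]
loaded_before_crash = batch[:3]

# Option 1: after the restart, reload the whole batch named by the last run metadata.
destination_reload = loaded_before_crash + batch
duplicates = sorted({r for r in destination_reload if destination_reload.count(r) > 1})

# Option 2: after the restart, resume from the next batch instead.
destination_skip = list(loaded_before_crash)
missing = [r for r in batch if r not in destination_skip]

print(duplicates)  # records that reached the destination twice
print(missing)     # records that never reached the destination
```

With batch-level bookkeeping only, one of these two outcomes is unavoidable whenever the restart falls in the middle of a batch.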
To this end, in view of at least some of the above drawbacks, some possible embodiments provided by the present application are as follows. For a plurality of target data extracted from the source device, the ETL system creates and stores data operation information corresponding to each target data, converts each target data, and loads each converted target data to the destination device. For each converted target data that is successfully loaded to the destination device, the ETL system then deletes the data operation information corresponding to that target data. When the ETL system is restarted, it can therefore re-integrate and load only the target data corresponding to the data operation information that remains stored, without re-integrating and loading all data of the batch, so that successfully loaded data is not reloaded. No part of the batch is discarded either, so data loss at the destination device is avoided and data processing accuracy is improved.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to fig. 1, fig. 1 shows a schematic architecture diagram of an ETL system provided in the present application, which may include a data extraction module (Extractor Manager), a data transformation module (Transformer Manager), and a data loading module (Loader Manager); the data processing method provided by the application can be realized by executing the corresponding functional modules of the data extraction module, the data conversion module and the data loading module.
Referring to fig. 2, fig. 2 shows a schematic block diagram of a server 100 provided in the present application, where the server 100 may include a memory 101, a processor 102 and a communication interface 103, and the memory 101, the processor 102 and the communication interface 103 are electrically connected to each other directly or indirectly to implement data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines.
The memory 101 may be used for storing software programs and modules, such as program instructions/modules corresponding to the ETL system provided in the present application, and the processor 102 executes the software programs and modules stored in the memory 101 to execute various functional applications and data processing, so as to operate the ETL system provided in the present application, and further execute the steps of the data processing method provided in the present application. The communication interface 103 may be used for communicating signaling or data with other node devices.
The memory 101 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 102 may be an integrated circuit chip having signal processing capabilities. The processor 102 may be a general-purpose processor, such as a Central Processing Unit (CPU) or a Network Processor (NP); it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It will be appreciated that the configuration shown in fig. 2 is merely illustrative and that the server 100 may include more or fewer components than shown in fig. 2 or have a different configuration than shown in fig. 2. The components shown in fig. 2 may be implemented in hardware, software, or a combination thereof.
The data processing method provided by the present application is schematically described below with a server operating the ETL system shown in fig. 1 as an exemplary execution subject.
Referring to fig. 3, fig. 3 shows a schematic flow chart of a data processing method provided in the present application, which may include the following steps in some possible embodiments:
Step 201, for a plurality of target data extracted from a source device, creating and storing data operation information corresponding to each target data.
Step 203, converting each target data, and loading each converted target data to the destination device.
Step 205, deleting the data operation information corresponding to the target data, with respect to the converted target data successfully loaded to the destination device.
In some embodiments, referring to fig. 1, in the process of integrating and loading data from a source device to a destination device, the ETL system may obtain data from the source device in batches; the ETL system may create and store data operation information corresponding to each target data, where the data operation information may be used to indicate which target data the ETL system is currently processing.
For example, in some possible embodiments of the present application, the ETL system may create and store a data snapshot corresponding to each target data as data operation information corresponding to each target data; or, the ETL system may also create and store a data verification code corresponding to each target data as the data operation information corresponding to each target data according to the data batch number of each target data and the serial number in the data batch; the implementation manner of the data operation information is not limited in the present application.
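As one possible reading of the second option, a data verification code could be derived from the batch number and the serial number within the batch. The sketch below is an assumption, not the method of the application: the function name `make_check_code`, the use of SHA-256, and the 16-character truncation are all chosen only for illustration.

```python
import hashlib

def make_check_code(batch_no: int, seq_no: int) -> str:
    """Derive a short, repeatable code from (batch number, serial number).

    Hypothetical helper: the application does not specify how the code is computed.
    """
    key = f"{batch_no}:{seq_no}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()[:16]

# The same (batch, serial) pair always yields the same code, so the code can be
# used as the key under which the data operation information is stored.
code = make_check_code(7, 3)
```

Because the code is deterministic, it can be recomputed after a restart to match stored data operation information against the data of a batch.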
Next, the ETL system may perform conversion processing on each target data, and load each converted target data to the destination device.
It should be noted that, in some possible scenarios, the ETL system may enable multiple threads to process each target data in parallel during the process of performing the integrated processing of the data. Therefore, in the process of executing step 203, the conversion processes of each target data may be independent from each other, that is, the conversion processing processes of any two target data do not have a necessary sequence, and for one target data, after the ETL system creates and stores the data operation information corresponding to the target data, the ETL system may execute the operation of converting the target data, and load the converted target data to the destination device.
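A minimal sketch of this per-item independence, assuming a thread pool and in-memory dictionaries standing in for the stored data operation information and the destination device (`staging`, `destination`, `process` and the upper-casing "conversion" are all hypothetical):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

staging = {}        # hypothetical store of data operation information
destination = {}    # stands in for the destination device
lock = threading.Lock()

def process(item_id, value):
    with lock:
        staging[item_id] = value          # create and store operation info first
    converted = value.upper()             # placeholder conversion
    with lock:
        destination[item_id] = converted  # load the converted data
        del staging[item_id]              # delete the info once the load succeeded

items = {i: f"record-{i}" for i in range(8)}
with ThreadPoolExecutor(max_workers=4) as pool:
    # Items are processed in no particular order relative to each other.
    list(pool.map(lambda kv: process(*kv), items.items()))
```

Each item only depends on its own data operation information, so the order in which threads finish does not matter.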
Then, for each converted target data that is successfully loaded to the destination device, the ETL system may delete the data operation information corresponding to the target data to indicate that the corresponding target data has been successfully converted and loaded to the destination device.
Thus, in some embodiments of the present application, when the ETL system is restarted, the ETL system can reintegrate and load the target data corresponding to the stored data operation information according to the remaining stored data operation information, without reintegrating and loading all the data of the batch, thereby avoiding the data that has been successfully loaded from being repeatedly reloaded; partial data of the batch cannot be discarded, so that data loss of the destination equipment is avoided, and the data processing accuracy is improved.
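The restart behaviour can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the application: `run_batch`, `resume` and `LoadError` are hypothetical names, the crash is simulated with an exception, and the `staging` dictionary stands in for storage that would have to be persistent in practice to survive a real restart.

```python
class LoadError(Exception):
    """Simulated crash while a batch is being processed."""

def resume(staging, destination, fail_on=None):
    # Process every item whose data operation information is still stored.
    for item_id in list(staging):
        if item_id == fail_on:
            raise LoadError(item_id)
        destination.append((item_id, staging[item_id].upper()))  # step 203
        del staging[item_id]                                      # step 205

def run_batch(batch, staging, destination, fail_on=None):
    for item_id, value in batch.items():                          # step 201
        staging[item_id] = value
    resume(staging, destination, fail_on)

staging, destination = {}, []
try:
    run_batch({1: "a", 2: "b", 3: "c", 4: "d"}, staging, destination, fail_on=3)
except LoadError:
    pass                       # items 3 and 4 remain in staging
resume(staging, destination)   # after the "restart", only 3 and 4 are processed
```

After the second call, every item appears in `destination` exactly once: items 1 and 2 are not reloaded, and items 3 and 4 are not lost.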
In some possible embodiments, as shown in fig. 1, the architecture of the ETL system may include a data extraction module, a data conversion module, and a data loading module; in addition, a first blocking Queue (Block Queue) may be created for the data extraction module, and a second blocking Queue may be created for the data conversion module, where the first blocking Queue may be used to store the data extracted by the data extraction module, and the second blocking Queue may be used to store the data converted by the data conversion module.
Based on this, in the course of executing step 201, the ETL system may extract a plurality of target data from the source device by the data extraction module, and add each target data to the first blocking queue after the data extraction module creates and stores the data operation information corresponding to each target data.
Also, in the ETL system executing step 203, the data conversion module may perform an operation of performing conversion processing on each target data, that is: and the data conversion module acquires target data from the first blocking queue, converts each acquired target data, and adds the converted target data to the second blocking queue.
In addition, in the process of executing step 203, the ETL system may further execute, by the data loading module, an operation of loading each converted target data to the destination device, that is: and the data loading module acquires the converted target data from the second blocking queue and loads each converted target data to the destination terminal equipment.
In addition, in the process of executing step 205, the data loading module of the ETL system may delete the data operation information corresponding to the target data, for the converted target data that is successfully loaded to the destination device.
In some embodiments, the ETL system may create a temporary storage area (Staging Area) for the data operation information it maintains, store the created data operation information in the temporary storage area, and delete it from the temporary storage area when executing step 205.
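The module layout described above can be sketched with Python's standard `queue.Queue` as the blocking queue. The function names, the sentinel used to stop the threads, the doubling "conversion" and the in-memory `staging` dictionary are assumptions made for this sketch only.

```python
import queue
import threading

SENTINEL = object()            # hypothetical end-of-batch marker
lock = threading.Lock()

def extractor(source, staging, q1):
    for item_id, value in source:
        with lock:
            staging[item_id] = value        # store operation info, then enqueue
        q1.put((item_id, value))
    q1.put(SENTINEL)

def transformer(q1, q2):
    while (item := q1.get()) is not SENTINEL:
        item_id, value = item
        q2.put((item_id, value * 2))        # placeholder conversion
    q2.put(SENTINEL)

def loader(q2, staging, destination):
    while (item := q2.get()) is not SENTINEL:
        item_id, converted = item
        destination[item_id] = converted    # load to the destination
        with lock:
            del staging[item_id]            # delete info after a successful load

source = [(i, i + 1) for i in range(5)]
staging, destination = {}, {}
q1, q2 = queue.Queue(maxsize=2), queue.Queue(maxsize=2)
threads = [
    threading.Thread(target=extractor, args=(source, staging, q1)),
    threading.Thread(target=transformer, args=(q1, q2)),
    threading.Thread(target=loader, args=(q2, staging, destination)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The bounded queues make a fast producer block until the next module catches up, which is the usual reason for choosing a blocking queue between pipeline stages.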
In addition, as can be seen from the schematic diagram of the ETL system architecture shown in fig. 1, in the process of integrating and loading data from the source device to the destination device, the ETL system may include two internal processing flows, namely extracting data from the source device into the first blocking queue, and obtaining data from the first blocking queue for conversion and adding the converted data to the second blocking queue; it further includes an external processing flow, namely loading the data in the second blocking queue to the destination device.
Therefore, when the ETL system is restarted, it may be necessary to continue to perform the operation of obtaining data from the first blocking queue for conversion, and it may also be possible to continue to perform the operation of obtaining data from the second blocking queue for loading to the destination device.
Based on this, in order to make data processing more fine-grained, in some embodiments of the present application, the data operation information may include a first data snapshot and a second data snapshot, where the first data snapshot may be used to indicate that the corresponding target data has been extracted from the source device to the first blocking queue, and the second data snapshot may be used to indicate that the corresponding target data has been converted and saved to the second blocking queue.
Thus, in the process of executing step 201, the ETL system can create and store a first data snapshot corresponding to each target data to indicate that the corresponding target data has been extracted from the source device to the first blocking queue.
Next, in the process of executing step 203, the ETL system may perform conversion processing on each target data, create and store a second data snapshot corresponding to each target data after the conversion is successful, delete a first data snapshot corresponding to the target data, and load each converted target data to the destination device.
That is, when the ETL system obtains target data from the first blocking queue and performs conversion processing on each target data successfully, the ETL system may create and store a second data snapshot corresponding to the target data that is successfully converted, so as to indicate that the target data that is successfully converted is currently in a state of being subjected to data conversion and being stored in the second blocking queue; and when the ETL system successfully converts the target data and adds the converted target data to the second blocking queue, the ETL system may delete the first data snapshot corresponding to the target data.
In addition, in the process of executing step 205, the ETL system may delete, for the converted target data that is successfully loaded to the destination device, the second data snapshot corresponding to the target data that is successfully converted, so as to indicate that the data conversion loading is completed.
It should be noted that, in the scenario shown in fig. 1, all the first data snapshots and all the second data snapshots that are saved in the above-mentioned manner may be saved in the same operation space of the temporary storage area, or may be saved in different operation spaces.
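The snapshot life cycle described above can be summarised as a small state holder. The class and method names below are hypothetical, and the two dictionaries stand in for the operation space or spaces of the temporary storage area.

```python
class StagingArea:
    """Hypothetical in-memory stand-in for the temporary storage area."""

    def __init__(self):
        self.first = {}    # first data snapshots: extracted, not yet converted
        self.second = {}   # second data snapshots: converted, not yet loaded

    def on_extracted(self, item_id, raw):
        self.first[item_id] = raw

    def on_converted(self, item_id, converted):
        self.second[item_id] = converted   # create the second snapshot first,
        del self.first[item_id]            # then delete the first snapshot

    def on_loaded(self, item_id):
        del self.second[item_id]           # conversion and loading are complete

    def stage_of(self, item_id):
        if item_id in self.second:
            return "converted"
        if item_id in self.first:
            return "extracted"
        return "done-or-unknown"

area = StagingArea()
area.on_extracted(1, "a")
stage_after_extract = area.stage_of(1)
area.on_converted(1, "A")
stage_after_convert = area.stage_of(1)
area.on_loaded(1)
stage_after_load = area.stage_of(1)
```

Creating the second snapshot before deleting the first means that an interruption between the two operations leaves the item recorded at least once rather than not at all.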
And based on the configured first data snapshot and second data snapshot, when the ETL system is restarted, the ETL system may read all the first data snapshots and all the second data snapshots saved in the temporary storage area, and resume processing of the target data.
For the stored first data snapshots, the ETL system may perform conversion processing on target data corresponding to each stored first data snapshot; that is to say, for the saved first data snapshot, when the ETL system is restarted, the ETL system may read the saved first data snapshot from the temporary storage area, where the saved first data snapshot may be used to indicate that the corresponding data is in a state of being extracted from the source device to the first blocking queue, and then the ETL system may obtain, from the first blocking queue, the target data corresponding to each saved first data snapshot, perform conversion processing on the obtained target data, and continue to perform the subsequent steps of creating and saving the corresponding second data snapshot.
In addition, for the saved second data snapshots, the ETL system may load the target data corresponding to each saved second data snapshot to the destination device; that is to say, for the saved second data snapshot, when the ETL system is restarted, the ETL system may read the saved second data snapshot from the temporary storage area, where the saved second data snapshot may be used to indicate that the corresponding target data is in a state of being subjected to data conversion and saved to the second blocking queue, and then the ETL system may obtain the target data corresponding to each saved second data snapshot from the second blocking queue, and load the obtained target data to the destination device, so as to complete the integrated loading of the data.
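A sketch of the restart procedure follows, under the assumption that each snapshot carries enough information to recover the corresponding target data. The function `recover`, its parameters and the dictionary-based snapshots are hypothetical and only illustrate the two recovery paths.

```python
def recover(first_snapshots, second_snapshots, convert, load):
    # Items with a second snapshot were already converted: only load them.
    for item_id in list(second_snapshots):
        load(item_id, second_snapshots[item_id])
        del second_snapshots[item_id]
        # Guard against an interruption that left both snapshots in place.
        first_snapshots.pop(item_id, None)

    # Items with only a first snapshot still need conversion, then loading.
    for item_id in list(first_snapshots):
        converted = convert(first_snapshots[item_id])
        second_snapshots[item_id] = converted
        del first_snapshots[item_id]
        load(item_id, converted)
        del second_snapshots[item_id]

# State left behind by a hypothetical interrupted run: item 1 was fully loaded,
# item 2 was converted but not loaded, items 3 and 4 were only extracted.
destination = {1: "A"}
first, second = {3: "c", 4: "d"}, {2: "B"}
recover(first, second, str.upper, destination.__setitem__)
```

Item 2 is loaded without being converted again, while items 3 and 4 go through conversion and loading; item 1 is left untouched.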
In addition, based on the same inventive concept as the above data processing method provided by the present application, the present application also provides an ETL system as shown in fig. 1, where the ETL system is used to load data from a source device to a destination device; wherein:
the ETL system is specifically configured to create and store, for a plurality of target data extracted from the source device, data operation information corresponding to each of the target data;
the ETL system is further configured to convert each target data and load each converted target data to the destination device;
the ETL system is further configured to, for the converted target data that is successfully loaded to the destination device, delete the data operation information corresponding to the target data.
Optionally, as a possible implementation, the data operation information includes a first data snapshot and a second data snapshot;
when the ETL system creates and stores the data operation information corresponding to each target data, the ETL system is specifically configured to:
creating and storing a first data snapshot corresponding to each target data;
when the ETL system performs conversion processing on each target data and loads each converted target data to the destination device, the ETL system is specifically configured to:
performing conversion processing on each target data, creating and storing a second data snapshot corresponding to each target data after the conversion is successful, and deleting a first data snapshot corresponding to the target data;
loading each converted target data to the destination device;
when the ETL system deletes, for the converted target data that is successfully loaded to the destination device, the data operation information corresponding to the target data, the ETL system is specifically configured to:
and, for the converted target data that is successfully loaded to the destination device, deleting the second data snapshot corresponding to the converted target data.
Optionally, as a possible implementation manner, when the ETL system is restarted, the ETL system is further configured to:
for the stored first data snapshots, converting the target data corresponding to each stored first data snapshot;
and for the stored second data snapshots, loading the target data corresponding to each stored second data snapshot to the destination device.
Optionally, as a possible implementation manner, the ETL system includes a data extraction module, a data conversion module, and a data loading module;
the data extraction module is used for extracting a plurality of target data from the source device, creating and storing data operation information corresponding to each target data, and then adding each target data to a first blocking queue;
the data conversion module is used for acquiring target data from the first blocking queue, converting each acquired target data and adding the converted target data to a second blocking queue;
the data loading module is used for acquiring the converted target data from the second blocking queue and loading each converted target data to the destination device; and,
for the converted target data that is successfully loaded to the destination device, deleting the data operation information corresponding to the target data.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to some embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in some embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
If implemented in the form of software functional modules and sold or used as a stand-alone product, the functions may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the method according to some embodiments of the present application. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disk.
The above description covers only a few examples of the present application and is not intended to limit the present application; those skilled in the art will appreciate that various modifications and variations can be made to it. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (6)

1. A data processing method, characterized in that the method is applied to a server running an ETL system of a data warehouse, wherein the ETL system is used for loading data from a source device to a destination device; the method comprises the following steps:
for a plurality of target data extracted from the source device, creating and storing data operation information corresponding to each target data;
converting each target data, and loading each converted target data to the destination device;
for converted target data that is successfully loaded to the destination device, deleting the data operation information corresponding to that target data;
wherein the data operation information comprises a first data snapshot and a second data snapshot;
the creating and storing of the data operation information corresponding to each target data includes:
creating and storing a first data snapshot corresponding to each target data;
the converting each target data and loading each converted target data to the destination device includes:
performing conversion processing on each target data, creating and storing a second data snapshot corresponding to each target data after the conversion is successful, and deleting the first data snapshot corresponding to the target data;
loading each converted target data to the destination device;
the deleting the data operation information corresponding to the target data for the converted target data successfully loaded to the destination device includes:
for converted target data that is successfully loaded to the destination device, deleting the second data snapshot corresponding to that converted target data.
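The snapshot lifecycle recited in claim 1 can be illustrated for a single record as follows. This is a hedged sketch, not the claimed implementation: `process_record`, the `snapshots` dictionary keyed by `(rec_id, kind)`, and the returned status strings are hypothetical names introduced for the example, and the claim does not specify how a snapshot is stored.

```python
def process_record(rec_id, payload, transform, load, snapshots):
    """Process one record, keeping `snapshots` in step with its stage.

    `snapshots` maps (rec_id, kind) -> data, where kind is "first" or "second".
    Returns "loaded", "convert_failed", or "load_failed".
    """
    snapshots[(rec_id, "first")] = payload        # first data snapshot, created on extraction
    try:
        converted = transform(payload)
    except Exception:
        return "convert_failed"                   # first snapshot is left in place
    snapshots[(rec_id, "second")] = converted     # second snapshot, after successful conversion
    del snapshots[(rec_id, "first")]              # first snapshot is no longer needed
    try:
        load(rec_id, converted)
    except Exception:
        return "load_failed"                      # second snapshot is left in place
    del snapshots[(rec_id, "second")]             # deleted only after a successful load
    return "loaded"
```

In this sketch, at most one snapshot per record exists once a stage has finished, and the kind of snapshot left behind shows which stage the record last completed.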
2. The method of claim 1, wherein the ETL system comprises a data extraction module, a data conversion module, and a data loading module;
the creating and storing data operation information corresponding to each target data for a plurality of target data extracted from the source device includes:
the data extraction module extracts a plurality of target data from the source device, creates and stores data operation information corresponding to each target data, and then adds each target data to a first blocking queue;
the performing conversion processing on each target data includes:
the data conversion module acquires target data from the first blocking queue, converts each acquired target data, and adds the converted target data to a second blocking queue;
the loading each converted target data to the destination device includes:
the data loading module acquires the converted target data from the second blocking queue and loads each converted target data to the destination device;
the deleting the data operation information corresponding to the target data for the converted target data successfully loaded to the destination device includes:
for converted target data that is successfully loaded to the destination device, the data loading module deletes the data operation information corresponding to that target data.
3. An ETL system, wherein the ETL system is used to load data from a source device to a destination device; wherein:
the ETL system is specifically configured to create and store, for a plurality of target data extracted from the source device, data operation information corresponding to each of the target data;
the ETL system is also used for converting each target data and loading each converted target data to the destination device;
the ETL system is also used for deleting, for converted target data that is successfully loaded to the destination device, the data operation information corresponding to that target data;
wherein the data operation information comprises a first data snapshot and a second data snapshot;
when the ETL system creates and stores the data operation information corresponding to each target data, the ETL system is specifically configured to:
creating and storing a first data snapshot corresponding to each target data;
when the ETL system performs conversion processing on each target data and loads each converted target data to the destination device, the ETL system is specifically configured to:
performing conversion processing on each target data, creating and storing a second data snapshot corresponding to each target data after the conversion is successful, and deleting the first data snapshot corresponding to the target data;
loading each converted target data to the destination device;
when the ETL system deletes, for the converted target data that is successfully loaded to the destination device, the data operation information corresponding to the target data, the ETL system is specifically configured to:
for converted target data that is successfully loaded to the destination device, deleting the second data snapshot corresponding to that converted target data.
4. The ETL system of claim 3, wherein said ETL system comprises a data extraction module, a data conversion module, and a data loading module;
the data extraction module is used for extracting a plurality of target data from the source device, creating and storing data operation information corresponding to each target data, and then adding each target data to a first blocking queue;
the data conversion module is used for acquiring target data from the first blocking queue, converting each acquired target data, and adding the converted target data to a second blocking queue;
the data loading module is used for acquiring the converted target data from the second blocking queue and loading each converted target data to the destination device; and,
for converted target data that is successfully loaded to the destination device, deleting the data operation information corresponding to that target data.
5. A server, comprising:
a memory for storing one or more programs;
a processor;
the one or more programs, when executed by the processor, implement the method of claim 1 or 2.
6. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to claim 1 or 2.
CN202011384503.5A 2020-12-02 2020-12-02 Data processing method, ETL system, server and storage medium Active CN112214454B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011384503.5A CN112214454B (en) 2020-12-02 2020-12-02 Data processing method, ETL system, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011384503.5A CN112214454B (en) 2020-12-02 2020-12-02 Data processing method, ETL system, server and storage medium

Publications (2)

Publication Number Publication Date
CN112214454A (en) 2021-01-12
CN112214454B (en) 2021-02-26

Family

ID=74068035

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011384503.5A Active CN112214454B (en) 2020-12-02 2020-12-02 Data processing method, ETL system, server and storage medium

Country Status (1)

Country Link
CN (1) CN112214454B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521397A (en) * 2011-12-23 2012-06-27 山东中创软件工程股份有限公司 Data transmission method
CN109241042A (en) * 2018-07-24 2019-01-18 新华三大数据技术有限公司 Data processing method, device and electronic equipment
CN109656992A (en) * 2018-11-27 2019-04-19 山东中创软件商用中间件股份有限公司 A kind of data transmission account checking method, device and equipment
CN111090699A (en) * 2019-12-13 2020-05-01 北京奇艺世纪科技有限公司 Service data synchronization method and device, storage medium and electronic device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8762333B2 (en) * 2009-07-08 2014-06-24 Pivotal Software, Inc. Apparatus and method for read optimized bulk data storage
US8909882B2 (en) * 2009-11-22 2014-12-09 International Business Machines Corporation Concurrent data processing using snapshot technology

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
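Design and Implementation of a Hospital Financial Analysis System Based on OLAP+DM Technology; Wang Ying et al.; Electronic Design Engineering; 2016-03-23 (No. 05); pp. 51-54 *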

Also Published As

Publication number Publication date
CN112214454A (en) 2021-01-12

Similar Documents

Publication Publication Date Title
CN106897072B (en) Service engineering calling method and device and electronic equipment
CN111444196B (en) Method, device and equipment for generating Hash of global state in block chain type account book
CN111159329B (en) Sensitive word detection method, device, terminal equipment and computer readable storage medium
US9563669B2 (en) Closed itemset mining using difference update
CN111414362B (en) Data reading method, device, equipment and storage medium
CN106406913B (en) Method and system for extracting codes from project
CN108509322B (en) Method for avoiding excessive return visit, electronic device and computer readable storage medium
CN112199935A (en) Data comparison method and device, electronic equipment and computer readable storage medium
CN113468118B (en) File increment storage method, device and storage medium based on blockchain
CN109241042B (en) Data processing method and device and electronic equipment
CN112214454B (en) Data processing method, ETL system, server and storage medium
US10762207B2 (en) Method and device for scanning virus
US20180232353A1 (en) Non-transitory computer-readable storage medium, record data processing method, and record data processing apparatus
US20160364399A1 (en) Aggregating modifications to a database for journal replay
CN111988429A (en) Algorithm scheduling method and system
CN108121514B (en) Meta information updating method and device, computing equipment and computer storage medium
CN110196793B (en) Log analysis method and device for plug-in database
CN113536360A (en) Information security processing method and device based on intelligent manufacturing and electronic equipment
CN109165208B (en) Method and system for loading data into database
CN113505103A (en) File processing method and device, storage medium and electronic device
CN113127479A (en) Method and device for loading Elasticsearch index, computer equipment and storage medium
CN110888865A (en) Data processing method and device based on one-way linked list
CN111736889A (en) Version increment file acquisition method and device
CN117331894A (en) File filtering method and device, computer equipment and storage medium
CN110990475B (en) Batch task inserting method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210430

Address after: No.1, 3rd floor, R & D building, Sany industrial city, No.1, Sany Road, Changsha Economic and Technological Development Zone, Changsha, Hunan 410000

Patentee after: CHANGSHA ROOTCLOUD TECHNOLOGY Co.,Ltd.

Patentee after: Shugen Internet Co.,Ltd.

Address before: No.1, 3rd floor, R & D building, Sany industrial city, No.1, Sany Road, Changsha Economic and Technological Development Zone, Changsha, Hunan 410000

Patentee before: CHANGSHA ROOTCLOUD TECHNOLOGY Co.,Ltd.