CN112328677A - Lost data recovery method, device, equipment and medium based on table association - Google Patents

Info

Publication number: CN112328677A
Authority: CN (China)
Prior art keywords: data, extracted, increment, association, slave
Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN202110005207.8A
Other languages: Chinese (zh)
Other versions: CN112328677B (en)
Inventor: 陈伟
Current Assignee: Ping An Technology Shenzhen Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202110005207.8A (CN112328677B)
Publication of CN112328677A
Priority to PCT/CN2021/083104 (WO2022147908A1)
Application granted
Publication of CN112328677B
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Abstract

The invention relates to the field of big data and provides a lost data recovery method, device, equipment and medium based on table association. By combining an increment table with a recovery table, the invention addresses the loss of associated data that occurs when associated tables fall out of sync for any reason, reduces the cost of manual data problem analysis and of supplementing and correcting data, and enhances the data integrity of a data warehouse. The invention also relates to blockchain technology: the master table, the slave table and the recovery table can be stored in a blockchain.

Description

Lost data recovery method, device, equipment and medium based on table association
Technical Field
The invention relates to the technical field of big data, in particular to a method, a device, equipment and a medium for recovering lost data based on table association.
Background
In the data warehouse field, a common ETL (extract, transform, load) strategy is to extract data incrementally from a source system into a data warehouse system and then transform and load the data on the data warehouse side. To improve efficiency and reduce extraction overhead, incremental synchronization is usually preferred; that is, the source system synchronizes data to the data warehouse side according to an incremental timestamp.
However, this synchronization method has shortcomings. The data warehouse extracts each table separately. Suppose two tables have a master-slave relationship in the source system (such as a client table and an account table). If their extraction times are not exactly the same, or their commit times differ during extraction because of the source system's transaction management strategy, or the incremental timestamp cannot guarantee business-logic consistency for any other reason, the incremental data of the master table and of the slave table will not match. When the tables are then loaded on the data warehouse side, the keys on which the slave table depends cannot be found in the master table, and data is lost in the subsequent transformation processing on the data warehouse side.
Disclosure of Invention
In view of the above, it is necessary to provide a method, an apparatus, a device and a medium for recovering lost data based on table association that can solve the problem of associated data loss caused by associated tables being out of sync for any reason, reduce the cost of manual data problem analysis and of supplementing and correcting data, and enhance the data integrity of a data warehouse.
A method for recovering lost data based on table association comprises the following steps:
responding to a first data extraction instruction, and acquiring a data table to be extracted from a source system according to the first data extraction instruction;
extracting data increment from the data table to be extracted, synchronizing the data increment to an increment table, and constructing a master table in the increment table according to the extracted data;
determining an associated data table associated with the data table to be extracted;
acquiring extracted data of the associated data table from the increment table, and constructing a slave table in the increment table according to the extracted data;
associating the data in the slave table with the data in the master table, and acquiring data with failed association;
writing the data with the association failure into a recovery table;
responding to a second data extraction instruction of the data table to be extracted, extracting data increment from the data table to be extracted according to the second data extraction instruction, synchronizing the data increment to the increment table, and updating the extracted data to the master table;
acquiring a current slave table from the increment table, and calculating a union of the current slave table and the recovery table as an updated slave table;
and associating the updated slave table with the updated master table, and removing the successfully associated data from the recovery table.
According to a preferred embodiment of the present invention, the obtaining the data table to be extracted from the source system according to the first data extraction instruction includes:
analyzing a method body of the first data extraction instruction to obtain information carried by the first data extraction instruction;
acquiring a preset label;
searching the data with the preset label in the information carried by the first data extraction instruction, and determining the searched data as a target table name;
and acquiring the data table with the target table name from the source system as the data table to be extracted.
According to a preferred embodiment of the present invention, the extracting data increment from the data table to be extracted and synchronizing to the increment table includes:
analyzing the first data extraction instruction to obtain a first timestamp range of data extraction;
acquiring data meeting the first timestamp range from the data table to be extracted as alternative data;
detecting changed data in the alternative data;
synchronizing the changed data to the delta table.
According to a preferred embodiment of the present invention, the determining the associated data table associated with the data table to be extracted includes:
detecting a data table that has a join operation with the data table to be extracted;
and determining the detected data table as the associated data table.
According to a preferred embodiment of the present invention, said associating data in said slave table with data in said master table comprises:
acquiring a data identifier of each piece of slave data in the slave table, and acquiring a data identifier of each piece of master data in the master table;
calling a pre-configured mapping table, wherein the mapping table stores the corresponding relation between the data identifier of each piece of slave data and the data identifier of each piece of master data;
when the mapping table is found to contain a corresponding relation between the data identifier of first data in the slave table and the data identifier of second data in the master table, determining that the first data is associated with the second data and that the first data is successfully associated; or
when no master data corresponding to the data identifier of the first data is found in the mapping table, determining that the association of the first data fails.
According to a preferred embodiment of the present invention, before writing the data with failed association into the recovery table, the method further comprises:
identifying a table structure of the increment table;
creating an isomorphic table of the increment table according to the table structure of the increment table;
and determining the created isomorphic table as the recovery table.
According to a preferred embodiment of the invention, the method further comprises:
detecting how long each piece of data has been held in the recovery table;
when the time for which a piece of data has been held is greater than or equal to a preset duration, determining that data as data to be verified;
reconciling the data to be verified against the source system;
acquiring, from the data to be verified, the data that meets the reconciliation standard, and retaining it in the recovery table;
and acquiring, from the data to be verified, the data that does not meet the reconciliation standard, and removing it from the recovery table.
A lost data recovery apparatus based on table associations, the lost data recovery apparatus based on table associations comprising:
the acquisition unit is used for responding to a first data extraction instruction and acquiring a data table to be extracted from a source system according to the first data extraction instruction;
the construction unit is used for extracting data increment from the data table to be extracted and synchronizing the data increment to an increment table, and constructing a master table in the increment table according to the extracted data;
the determining unit is used for determining an associated data table associated with the data table to be extracted;
the construction unit is further configured to obtain extracted data of the associated data table from the increment table, and construct a slave table in the increment table according to the extracted data;
the association unit is used for associating the data in the slave table with the data in the master table and acquiring the data with failed association;
a writing unit, configured to write the data with the association failure into a recovery table;
the updating unit is used for responding to a second data extraction instruction of the data table to be extracted, extracting data increment from the data table to be extracted according to the second data extraction instruction, synchronizing the data increment to the increment table, and updating the extracted data to the master table;
the updating unit is further used for acquiring a current slave table from the increment table and calculating a union of the current slave table and the recovery table as an updated slave table;
the association unit is further configured to associate the updated slave table with the updated master table, and remove the successfully associated data from the recovery table.
An electronic device, the electronic device comprising:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement the table association based lost data recovery method.
A computer-readable storage medium having at least one instruction stored therein, the at least one instruction being executable by a processor in an electronic device to implement the table association based lost data recovery method.
According to the above technical solution, the invention responds to a first data extraction instruction by acquiring the data table to be extracted from the source system, extracting a data increment from that table, synchronizing the increment to the increment table, and constructing a master table in the increment table from the extracted data. It then determines the associated data table associated with the data table to be extracted, acquires the extracted data of the associated data table from the increment table, and constructs a slave table in the increment table from that data. The data in the slave table is associated with the data in the master table, and the data whose association fails is written into a recovery table, which ensures that all data lost through failed association can be recovered. In response to a second data extraction instruction for the data table to be extracted, a data increment is again extracted from the data table to be extracted and synchronized to the increment table, and the extracted data is updated to the master table. The current slave table is acquired from the increment table, the union of the current slave table and the recovery table is calculated as the updated slave table, the updated slave table is associated with the updated master table, and the successfully associated data is removed from the recovery table. This solves the problem of associated data loss caused by associated tables being out of sync for any reason, reduces the cost of manual data problem analysis and of supplementing and correcting data, and enhances the data integrity of the data warehouse.
Drawings
FIG. 1 is a flow chart of a method for recovering lost data based on table association according to a preferred embodiment of the present invention.
FIG. 2 is a functional block diagram of a preferred embodiment of the apparatus for recovering lost data based on table association according to the present invention.
FIG. 3 is a schematic structural diagram of an electronic device implementing a method for recovering lost data based on table association according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
FIG. 1 is a flow chart of a preferred embodiment of the method for recovering lost data based on table association according to the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
The method for recovering lost data based on table association is applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a user, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an interactive Internet Protocol Television (IPTV), an intelligent wearable device, and the like.
The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud based on cloud computing and consisting of a large number of hosts or network servers.
The network where the electronic device is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a Virtual Private Network (VPN), and the like.
S10, responding to the first data extraction instruction, and acquiring the data table to be extracted from the source system according to the first data extraction instruction.
In a data warehouse, ETL describes the process of extracting (Extract), transforming (Transform), and loading (Load) data from a source end to a destination end.
The first data extraction instruction may be configured to be triggered periodically, for example: triggered at a scheduled time every day.
The source system is the system in which the data is originally stored; the data in the source system is extracted to the data warehouse for subsequent use.
Typically, the data warehouse extracts incremental data from the source system every day.
In this embodiment, the acquiring the to-be-extracted data table from the source system according to the first data extraction instruction includes:
analyzing a method body of the first data extraction instruction to obtain information carried by the first data extraction instruction;
acquiring a preset label;
searching the data with the preset label in the information carried by the first data extraction instruction, and determining the searched data as a target table name;
and acquiring the data table with the target table name from the source system as the data table to be extracted.
Specifically, the first data extraction instruction is essentially a piece of code, and by coding convention the content between { and } in the instruction is referred to as the method body.
The information carried by the first data extraction instruction may be a specific address or various specific data to be processed; its content depends mainly on how the code of the first data extraction instruction is composed.
The preset tag can be custom-configured.
The preset tag corresponds one-to-one with a table name; for example, the preset tag may be configured as NAME.
In this embodiment, the table name is read directly from the instruction, which improves processing efficiency, and because each tag is uniquely configured, locating the data by tag improves the accuracy of data acquisition.
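The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not specify the instruction syntax, so the `key=value;` layout inside the braces, the `RANGE` tag, and the function names `parse_method_body` and `target_table_name` are all hypothetical.

```python
import re

def parse_method_body(instruction: str) -> dict:
    """Return the tag=value pairs found between the outermost { and }.

    Assumption: the method body is a ';'-separated list of tag=value pairs.
    """
    match = re.search(r"\{(.*)\}", instruction, flags=re.DOTALL)
    if match is None:
        raise ValueError("instruction has no method body")
    info = {}
    for part in match.group(1).split(";"):
        if "=" in part:
            tag, value = part.split("=", 1)
            info[tag.strip()] = value.strip()
    return info

def target_table_name(instruction: str, preset_tag: str = "NAME") -> str:
    """Look up the value carried under the preset tag and use it as the table name."""
    info = parse_method_body(instruction)
    if preset_tag not in info:
        raise KeyError(f"tag {preset_tag!r} not found in instruction")
    return info[preset_tag]

# Hypothetical instruction text; the real format depends on the ETL tool.
instruction = "extract() { NAME=customer; RANGE=2021-01-01/2021-01-02 }"
table = target_table_name(instruction)
```

Because each tag maps to exactly one value, the lookup is a single dictionary access once the method body has been parsed.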
S11, extracting data increment from the data table to be extracted, synchronizing the data increment to an increment table, and constructing a master table in the increment table according to the extracted data.
Specifically, the step of extracting data increment from the data table to be extracted and synchronizing the data increment to the increment table includes:
analyzing the first data extraction instruction to obtain a first timestamp range of data extraction;
acquiring data meeting the first timestamp range from the data table to be extracted as alternative data;
detecting changed data in the alternative data;
synchronizing the changed data to the delta table.
Wherein, the analyzing the first data extraction instruction to obtain the first timestamp range of data extraction includes:
analyzing a method body of the first data extraction instruction to obtain information carried by the first data extraction instruction;
acquiring a configuration label;
and searching the information carried by the first data extraction instruction for the data with the configuration label, and determining the found data as the first timestamp range.
For example: according to the timestamps of data changes, the data records that changed between the previous synchronization and the current synchronization are synchronized; records outside that time interval are judged not to meet the extraction condition.
In this embodiment, incremental synchronization of the data is performed first, which improves the efficiency of data synchronization and reduces the overhead of data extraction.
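A minimal sketch of timestamp-based incremental extraction, using an in-memory SQLite database, is given below. The table names `source_customer` and `delta_customer`, the `updated_at` column, and the half-open interval are assumptions made for illustration. The sketch also treats the change timestamp itself as the indicator of changed data, which collapses the "candidate data" and "changed data" steps into one query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_customer (id TEXT PRIMARY KEY, name TEXT, updated_at TEXT);
    CREATE TABLE delta_customer  (id TEXT PRIMARY KEY, name TEXT, updated_at TEXT);
    INSERT INTO source_customer VALUES
        ('C001', 'Ann', '2021-01-01 08:00:00'),
        ('C002', 'Bob', '2021-01-02 09:30:00'),
        ('C003', 'Cid', '2021-01-02 23:59:59');
""")

def sync_increment(conn, start, end):
    """Copy rows changed in [start, end) from the source table to the increment table."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM source_customer "
        "WHERE updated_at >= ? AND updated_at < ?",
        (start, end),
    ).fetchall()
    # INSERT OR REPLACE keeps the increment table at one row per key.
    conn.executemany("INSERT OR REPLACE INTO delta_customer VALUES (?, ?, ?)", rows)
    return len(rows)

synced = sync_increment(conn, "2021-01-02 00:00:00", "2021-01-03 00:00:00")
delta_ids = [row[0] for row in conn.execute("SELECT id FROM delta_customer ORDER BY id")]
```

Record C001 falls outside the range and is therefore not extracted in this cycle, which is exactly the situation that can later leave a slave record without its master record.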
S12, determining an associated data table associated with the data table to be extracted.
Specifically, the determining an associated data table associated with the data table to be extracted includes:
detecting a data table that has a join operation with the data table to be extracted;
and determining the detected data table as the associated data table.
It is understood that table association between different data tables can be achieved through join operations.
The associated data table detected in this way has a table association relationship with the data table to be extracted; that is, the two data tables have a master-slave relationship in the source system, such as a client table and an account table.
For tables with a master-slave relationship, the extraction times of the two tables are often not exactly the same, or the commit times differ during extraction because of the source system's transaction management policy, or the incremental timestamp cannot guarantee business-logic consistency for any other reason. As a result, the incremental data of the master table and of the slave table do not match. When the tables are then loaded on the data warehouse side, the keys on which the slave table depends cannot be found in the master table, and data is lost in the subsequent transformation processing on the data warehouse side.
For example: the master and slave tables of the source system are not updated within one transaction, so their update times differ, and the data warehouse extracts data for the master table but not for the slave table.
Therefore, to handle this situation, this embodiment detects the data table that has a table association relationship with the data table to be extracted, so that targeted processing can be performed and data loss avoided.
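One possible way to detect tables joined with the data table to be extracted is to scan the SQL statements of the downstream transformation jobs. The patent does not say where the join operations are looked up, so the statement list, the regular expression, and the function name `associated_tables` below are assumptions; a production system would more likely use a SQL parser or lineage metadata.

```python
import re

# Matches the table name that follows FROM or JOIN (simple identifiers only).
TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)

def associated_tables(table: str, statements: list[str]) -> set[str]:
    """Collect tables that appear in a JOIN statement together with `table`."""
    found = set()
    for sql in statements:
        if not re.search(r"\bjoin\b", sql, re.IGNORECASE):
            continue  # statement has no join, so it associates nothing
        names = {name.lower() for name in TABLE_REF.findall(sql)}
        if table.lower() in names:
            found |= names - {table.lower()}
    return found

# Hypothetical transformation statements.
statements = [
    "select * from account inner join customer on account.customer_id = customer.id",
    "select * from orders",
    "select * from ticket left join job on ticket.id = job.t_id",
]
result = associated_tables("customer", statements)
```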
In this embodiment, performing table association by using the join operation includes:
(1) inner join (inner connection)
Returns a row only when there is at least one match; that is, only the rows whose join fields are equal in both tables.
Such as: select * from ticket
inner join job
on ticket.id=job.t_id
Only data where ticket.id = job.t_id is queried.
(2) left join (left connection)
All rows are returned from the left table even if there is no match in the right table.
Such as: select * from ticket
left join job
on ticket.id=job.t_id
Whether or not ticket.id is equal to job.t_id, all data in ticket is returned; if ticket.id = job.t_id, the corresponding job data is returned; if ticket.id != job.t_id, the corresponding job data is displayed as null.
(3) right join (right connection)
All rows are returned from the right table even if there is no match in the left table.
Such as: select * from ticket
right join job
on ticket.id=job.t_id
Whether or not ticket.id is equal to job.t_id, all data in job is returned; if ticket.id = job.t_id, the corresponding ticket data is returned; if ticket.id != job.t_id, the corresponding ticket data is displayed as null.
(4) full join (full outer connection)
All rows of both tables are returned, whether or not a row has a match in the other table.
Such as: select * from ticket
full join job
on ticket.id=job.t_id
Whether or not ticket.id is equal to job.t_id, all data of ticket and job is returned; if ticket.id = job.t_id, the job data is displayed after the corresponding ticket data; if ticket.id != job.t_id, the ticket data and the job data are displayed on two separate lines, and the missing counterpart of each is displayed as null.
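The join behaviour described above can be tried out with a small in-memory SQLite database. The sample rows and the `title` and `task` columns are invented for illustration. Because RIGHT JOIN and FULL JOIN are only available in newer SQLite releases, the sketch emulates the right join by swapping the operands of a left join, which returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ticket (id INTEGER, title TEXT);
    CREATE TABLE job    (t_id INTEGER, task TEXT);
    INSERT INTO ticket VALUES (1, 'T1'), (2, 'T2');
    INSERT INTO job    VALUES (1, 'J1'), (3, 'J3');
""")

# Inner join: only ticket 1 has a job with a matching t_id.
inner = conn.execute(
    "SELECT ticket.id, job.task FROM ticket "
    "INNER JOIN job ON ticket.id = job.t_id ORDER BY ticket.id"
).fetchall()

# Left join: every ticket is kept; ticket 2 has no job, so task is NULL.
left = conn.execute(
    "SELECT ticket.id, job.task FROM ticket "
    "LEFT JOIN job ON ticket.id = job.t_id ORDER BY ticket.id"
).fetchall()

# Right join emulated as a left join with job on the left:
# every job is kept; job 3 has no ticket, so ticket.id is NULL.
right = conn.execute(
    "SELECT ticket.id, job.task FROM job "
    "LEFT JOIN ticket ON ticket.id = job.t_id ORDER BY job.t_id"
).fetchall()
```

The NULL produced by the left join for an unmatched row is what makes an association failure detectable in SQL.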
S13, acquiring the extracted data of the associated data table from the increment table, and constructing a slave table in the increment table according to the extracted data.
It should be noted that the extraction time of the data in the associated data table is not necessarily the same as the extraction time of the data in the data table to be extracted. Because each table in the data warehouse is extracted separately, the extraction times are often inconsistent, which may cause data loss.
S14, associating the data in the slave table with the data in the master table, and acquiring the data with failed association.
Specifically, the associating the data in the slave table with the data in the master table includes:
acquiring a data identifier of each piece of slave data in the slave table, and acquiring a data identifier of each piece of master data in the master table;
calling a pre-configured mapping table, wherein the mapping table stores the corresponding relation between the data identifier of each piece of slave data and the data identifier of each piece of master data;
when the mapping table is found to contain a corresponding relation between the data identifier of first data in the slave table and the data identifier of second data in the master table, determining that the first data is associated with the second data and that the first data is successfully associated; or
when no master data corresponding to the data identifier of the first data is found in the mapping table, determining that the association of the first data fails.
For example: when a customer ID is to be associated with an account ID, if the mapping table stores the corresponding relation between the customer ID and the account ID, the data corresponding to the customer ID is associated with the data corresponding to the account ID, and the association of the data corresponding to the customer ID succeeds; if no account ID corresponding to the customer ID can be found in the mapping table, no data associated with the data corresponding to the customer ID exists in the master table, and the association of the data corresponding to the customer ID is determined to have failed.
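The association step can be sketched as a lookup through the mapping table. The row layout (dictionaries with an `id` field), the sample identifiers, and the function name `associate` are assumptions. The sketch treats a slave row as failed when the mapping has no entry for it or when the mapped master row has not been extracted yet, which is one reading of the rule above.

```python
def associate(slave_rows, master_rows, mapping):
    """Split slave rows into (associated pairs, rows whose association failed).

    mapping: dict from slave data identifier to master data identifier.
    """
    master_by_id = {row["id"]: row for row in master_rows}
    associated, failed = [], []
    for row in slave_rows:
        master_id = mapping.get(row["id"])      # None if no correspondence is stored
        master = master_by_id.get(master_id)    # None if the master row is missing
        if master is None:
            failed.append(row)
        else:
            associated.append((row, master))
    return associated, failed

# Hypothetical data: customer C004 has not been extracted into the master table yet.
master = [{"id": "C001"}, {"id": "C002"}]
slave = [{"id": "A001"}, {"id": "A002"}, {"id": "A004"}]
mapping = {"A001": "C001", "A002": "C002", "A004": "C004"}

associated, failed = associate(slave, master, mapping)
```

Here `failed` holds the account A004 record, which is the kind of row the next step writes into the recovery table.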
S15, writing the data with the association failure into a recovery table.
It should be noted that the recovery table must be created before the data with failed association is written into it.
Specifically, before writing the data with the association failure into the recovery table, the method further includes:
identifying a table structure of the delta table;
creating an isomorphic table of the increment table according to a table structure of the increment table;
and determining the created isomorphic table as the recovery table.
In this embodiment, an isomorphic table of the increment table is created to serve as the recovery table. Because the two table structures are identical, the data whose association failed can be written into the recovery table in full, which avoids further data loss, gives subsequent data recovery a more complete data basis, and reduces the error rate.
That is, the recovery table is a dynamically updated table that collects all data lost through failed association, so that the data can be repaired by attempting the association again in the next cycle.
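One way to create a table with the same columns as the increment table is `CREATE TABLE ... AS SELECT` with a condition that selects no rows, shown here with SQLite. The table names are hypothetical, and this shortcut copies column names only: constraints, indexes, and exact declared types are not guaranteed to carry over, so a real deployment might instead replay the original DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE delta_account (id TEXT, customer_id TEXT, updated_at TEXT)")

# WHERE 0 selects no rows, so only the column layout is copied.
conn.execute("CREATE TABLE recovery_account AS SELECT * FROM delta_account WHERE 0")

def column_names(conn, table):
    """Return the column names of a table in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

same_columns = column_names(conn, "delta_account") == column_names(conn, "recovery_account")

# A row whose association failed fits the recovery table without any conversion.
conn.execute("INSERT INTO recovery_account VALUES ('A004', 'C004', '2021-01-01 08:00:00')")
recovered = conn.execute("SELECT COUNT(*) FROM recovery_account").fetchone()[0]
```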
S16, responding to a second data extraction instruction of the data table to be extracted, extracting data increment from the data table to be extracted according to the second data extraction instruction, synchronizing the data increment to the increment table, and updating the extracted data to the master table.
Wherein the second data extraction instruction may also be configured to be triggered periodically, for example: the second data extraction instruction may be triggered the next day after the first data extraction instruction is triggered.
In this embodiment, the extracting, according to the second data extraction instruction, a data increment from the data table to be extracted to synchronize to the increment table includes:
analyzing the second data extraction instruction to obtain a second timestamp range of data extraction;
acquiring data meeting the second timestamp range from the data table to be extracted as second alternative data;
detecting changed data in the second alternative data;
synchronizing the changed data to the delta table.
S17, acquiring the current slave table from the increment table, and calculating the union of the current slave table and the recovery table as the updated slave table.
It will be appreciated that the current slave table is also an updated slave table.
Similarly, the current slave table is also incrementally synchronized according to the timestamp range, which is not repeated here.
In the above embodiment, the union of the current slave table and the recycle table is used as the updated slave table, so that the association is performed again in the current cycle, and data loss is effectively avoided.
S18, associating the updated slave table with the updated master table, and removing the successfully associated data from the recovery table.
For example: when data is extracted in the morning, record C004 of the client table does not meet the extraction condition and is not extracted to the data warehouse side. Record A004 of the account table must be associated with client C004 in the master table for the association calculation. Ordinarily, an account record such as A004 that finds no matching master record would be discarded; in this embodiment it is written into the recovery table for later use. When data is extracted again the next morning, C004 of the client table is extracted and enters the data warehouse. The account table then merges the data set extracted on the second day with the previous day's recovery table data to form a new increment table, so the data held in the recovery table is filled back in and can now be associated.
It should be noted that, in this embodiment, data whose association fails is continuously written into the recovery table. In the next increment cycle the recovery table is merged into the increment table and association is attempted again: data that associates successfully flows to the next stage, and data that still fails re-enters the recovery table. This repeats until the association succeeds and the data flows to the next stage.
Meanwhile, the successfully associated data is removed from the recovery table to avoid data redundancy in the recovery table.
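The two-day example with C004 and A004 can be traced with a small sketch of one extraction cycle. The data structures (a set of master IDs and dictionaries from account ID to customer ID) and the function name `run_cycle` are assumptions chosen to keep the illustration short.

```python
def run_cycle(master_ids, slave_increment, recovery):
    """Run one extraction cycle.

    master_ids: set of customer IDs currently in the master table
    slave_increment: dict account_id -> customer_id extracted in this cycle
    recovery: dict account_id -> customer_id carried over from earlier cycles
    Returns (associated rows, new recovery table).
    """
    # Union of the current slave table and the recovery table.
    updated_slave = {**recovery, **slave_increment}
    associated = {a: c for a, c in updated_slave.items() if c in master_ids}
    # Rows that associated are dropped; rows that still fail are carried over.
    new_recovery = {a: c for a, c in updated_slave.items() if c not in master_ids}
    return associated, new_recovery

# Day 1: C004 is not extracted, so A004 cannot be associated.
master_ids = {"C001", "C002", "C003"}
day1_associated, recovery_day1 = run_cycle(
    master_ids, {"A001": "C001", "A004": "C004"}, {}
)

# Day 2: C004 arrives in the master table; A004 is retried from the recovery table.
master_ids = master_ids | {"C004"}
day2_associated, recovery_day2 = run_cycle(
    master_ids, {"A005": "C002"}, recovery_day1
)
```

After day 1 the recovery table holds only A004; after day 2 it is empty, because A004 associated successfully and was removed.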
The method and the device can solve the problem of associated data loss caused by asynchronous associated table data due to various reasons, reduce the cost of manual data problem analysis and data supplement and correction, and enhance the data integrity of the data warehouse.
However, the recovery table may contain erroneous data that can never be associated, so an error detection mechanism also needs to be started periodically to remove such data in time.
Specifically, the method further comprises:
detecting how long each piece of data has been held in the recovery table;
when the time for which a piece of data has been held in the recovery table is detected to be greater than or equal to a preset duration, determining the detected data as data to be verified;
performing reconciliation against the source system according to the data to be verified;
acquiring data that meets the reconciliation standard from the data to be verified, and retaining the data that meets the reconciliation standard in the recovery table;
and acquiring data that does not meet the reconciliation standard from the data to be verified, and removing the data that does not meet the reconciliation standard from the recycle table.
Through the implementation mode, the data in the recovery table can be regularly updated, and the running burden of a system caused by redundant data is avoided.
It should be noted that, in order to further ensure the security of the data, the master table, the slave table, and the recycle table may also be deployed in the blockchain, so as to prevent the data from being maliciously tampered.
According to the above technical solution, in response to a first data extraction instruction, a data table to be extracted is obtained from a source system according to the first data extraction instruction; a data increment is extracted from the data table to be extracted and synchronized to an increment table, and a main table is constructed in the increment table according to the extracted data. An associated data table associated with the data table to be extracted is determined, the extracted data of the associated data table is obtained from the increment table, and a slave table is constructed in the increment table according to that data. The data in the slave table is associated with the data in the main table, and the data that fails to be associated is obtained and written into a recovery table, which ensures that all data lost in association can be recovered. In response to a second data extraction instruction for the data table to be extracted, a data increment is extracted from the data table to be extracted according to the second data extraction instruction and synchronized to the increment table, and the extracted data is updated to the main table. A current slave table is obtained from the increment table, the union of the current slave table and the recovery table is calculated as an updated slave table, the updated slave table is associated with the updated main table, and the successfully associated data is removed from the recovery table. This solves the problem of associated data loss caused by unsynchronized data between associated tables, reduces the cost of manual data problem analysis and of data supplementation and correction, and enhances the data integrity of the data warehouse.
FIG. 2 is a functional block diagram of a preferred embodiment of the lost data recovery apparatus based on table association according to the present invention. The lost data recovery apparatus 11 based on table association includes an obtaining unit 110, a building unit 111, a determining unit 112, an associating unit 113, a writing unit 114, and an updating unit 115. A module/unit referred to in the present invention is a series of computer program segments that are stored in the memory 12, can be executed by the processor 13, and perform a fixed function. The functions of the modules/units are described in detail in the following embodiments.
In response to the first data extraction instruction, the obtaining unit 110 obtains the data table to be extracted from the source system according to the first data extraction instruction.
ETL (Extract-Transform-Load) describes the process of extracting (Extract), transforming (Transform), and loading (Load) data from a source end to a destination end, such as a data warehouse.
Wherein the first data extraction instruction may be configured to be triggered periodically, for example: triggered at a fixed time every day.
The source system is the system in which the data is originally stored, and the data in the source system is extracted to a data warehouse for subsequent use.
Typically, the data warehouse extracts incremental data from the source system every day.
In this embodiment, the acquiring unit 110, acquiring the to-be-extracted data table from the source system according to the first data extraction instruction, includes:
analyzing a method body of the first data extraction instruction to obtain information carried by the first data extraction instruction;
acquiring a preset label;
searching the data with the preset label in the information carried by the first data extraction instruction, and determining the searched data as a target table name;
and acquiring the data table with the target table name from the source system as the data table to be extracted.
Specifically, the first data extraction instruction is substantially a piece of code, and in the first data extraction instruction, contents between { } are referred to as the method body according to the writing principle of the code.
The information carried by the first data extraction instruction may be a specific address, or may be specific various data to be processed, and the content of the information mainly depends on the code composition of the first data extraction instruction.
The preset tag can be configured in a user-defined mode.
The preset tag has a one-to-one correspondence with a table name; for example, the preset tag may be configured as NAME.
Through this embodiment, the table name can be obtained directly from the instruction, which improves processing efficiency; and because the tag is uniquely configured, obtaining the data by tag improves the accuracy of data acquisition.
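The tag lookup described above can be sketched as follows. The source does not specify the concrete syntax of an extraction instruction, so the `key=value;` layout inside the braces, the sample instruction string, and the helper names `parse_method_body` and `find_by_tag` are all assumptions made for illustration.

```python
import re


def parse_method_body(instruction):
    """Return the key/value pairs carried between { and } in the instruction."""
    match = re.search(r"\{(.*)\}", instruction, flags=re.DOTALL)
    if match is None:
        raise ValueError("instruction has no method body")
    info = {}
    # Assumed layout: semicolon-separated "TAG=value" items.
    for item in match.group(1).split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info[key.strip()] = value.strip()
    return info


def find_by_tag(instruction, tag):
    """Look up the value carried under a preset tag, or None if the tag is absent."""
    return parse_method_body(instruction).get(tag)


# Hypothetical instruction: the NAME tag carries the target table name.
instruction = "extract() { NAME=client_table; RANGE=2021-01-01/2021-01-02 }"
target_table = find_by_tag(instruction, "NAME")
```

The same lookup with a different tag (here the hypothetical `RANGE`) would serve for the timestamp range, since both steps search the parsed method body by tag.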
The construction unit 111 extracts data increment from the data table to be extracted and synchronizes to the increment table, and constructs a main table in the increment table according to the extracted data.
Specifically, the step of extracting, by the constructing unit 111, the data increment from the to-be-extracted data table and synchronizing to the increment table includes:
analyzing the first data extraction instruction to obtain a first timestamp range of data extraction;
acquiring data meeting the first timestamp range from the data table to be extracted as alternative data;
detecting changed data in the alternative data;
synchronizing the changed data to the delta table.
Wherein, the constructing unit 111 analyzes the first data extraction instruction, and obtaining the first timestamp range of data extraction includes:
analyzing a method body of the first data extraction instruction to obtain information carried by the first data extraction instruction;
acquiring a configuration label;
and searching the information carried by the first data extraction instruction for the data with the configuration label, and determining the data found as the first timestamp range.
For example: according to the timestamps of data changes, the data records that changed between the last synchronization and the current synchronization are synchronized; a record whose change timestamp does not fall within this time interval is judged not to meet the extraction condition.
Through this embodiment, the data is synchronized incrementally, which improves the efficiency of data synchronization and reduces the overhead of data extraction.
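The timestamp-based increment extraction can be sketched as follows. This is only an illustration under assumptions: each source row carries a hypothetical `updated_at` field, the range is treated as half-open, and "changed" is taken to mean that the row's timestamp differs from the one recorded at the last synchronization. The source does not fix any of these details.

```python
from datetime import datetime


def extract_increment(source_rows, start, end, last_synced):
    """Return the rows inside [start, end) whose timestamp differs from the last sync."""
    # Step 1: candidate data are the rows whose change timestamp is in the range.
    candidates = [r for r in source_rows if start <= r["updated_at"] < end]
    # Step 2: keep only candidates that changed since they were last synchronized.
    return [r for r in candidates if last_synced.get(r["id"]) != r["updated_at"]]


rows = [
    {"id": "C001", "updated_at": datetime(2021, 1, 1, 10, 0)},
    {"id": "C002", "updated_at": datetime(2021, 1, 1, 12, 0)},
    {"id": "C004", "updated_at": datetime(2021, 1, 2, 3, 0)},
]
# C001 was already synchronized with this exact timestamp, so it is unchanged.
already_synced = {"C001": datetime(2021, 1, 1, 10, 0)}

increment = extract_increment(
    rows, datetime(2021, 1, 1), datetime(2021, 1, 2), already_synced
)
```

Here C004 falls outside the range and is left for a later extraction, which is exactly the situation that later causes an association failure for a dependent slave record.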
The determination unit 112 determines an associated data table associated with the data table to be extracted.
Specifically, the determining unit 112 determines that the associated data table associated with the data table to be extracted includes:
detecting a data table that has a join operation with the data table to be extracted;
and determining the detected data table as the associated data table.
It is understood that table association between different data tables can be achieved through join operations.
The associated data table detected by the above method has a table association relationship with the data table to be extracted, that is, the two data tables have a master-slave relationship in the source system, such as a client table and an account table.
For tables with a master-slave relationship, the extraction times of the two tables are often not exactly the same, or the commit times differ during extraction because of the transaction management policy of the source system, or the incremental timestamps cannot guarantee business-logic consistency for some other reason. As a result, the incremental data of the master and slave tables does not match, and when the tables are loaded at the data warehouse end, the key on which the slave table depends cannot be found in the master table, which causes data loss in the subsequent data conversion processing at the data warehouse end.
For example: the master and slave tables of the source system are not updated in a single transaction, so their update times differ, and the data warehouse extracts data for the master table but not for the slave table.
Therefore, for the above situation, the present embodiment detects the data table having the table association relationship with the data table to be extracted, so as to perform the targeted processing, and avoid the occurrence of data loss.
In this embodiment, performing table association by using the join operation includes:
(1) inner join (inner connection)
A row is returned only when there is at least one match, that is, only the rows whose join fields are equal in both tables are returned.
Such as: select * from ticket
inner join job
on ticket.id = job.t_id
Only the data satisfying ticket.id = job.t_id is queried.
(2) left join (left connection)
All rows are returned from the left table even if there is no match in the right table.
Such as: select * from ticket
left join job
on ticket.id = job.t_id
Whether or not ticket.id is equal to job.t_id, all data in ticket is returned; if ticket.id = job.t_id, the corresponding job data is returned; if ticket.id != job.t_id, the corresponding job data is displayed as null.
(3) right join (right connection)
All rows are returned from the right table even if there is no match in the left table.
Such as: select * from ticket
right join job
on ticket.id = job.t_id
Whether or not ticket.id is equal to job.t_id, all data in job is returned; if ticket.id = job.t_id, the corresponding ticket data is returned; if ticket.id != job.t_id, the corresponding ticket data is displayed as null.
(4) full join (full outer connection)
A row is returned as long as it appears in either table, whether or not it has a match in the other (the rows of both tables are returned).
Such as: select * from ticket
full join job
on ticket.id = job.t_id
Whether or not ticket.id is equal to job.t_id, all data of ticket and job is returned; if ticket.id = job.t_id, the job data is displayed after the corresponding ticket data on the same line; if ticket.id != job.t_id, the ticket data and the job data are displayed on two separate lines, with the missing counterpart of each displayed as null.
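To make the join behaviour concrete, the following sketch runs an inner join and a left join on the ticket and job tables using Python's built-in sqlite3 module. The column names other than ticket.id and job.t_id, and the sample rows, are assumptions. Right and full joins are omitted because older SQLite versions do not support them. The last query shows how a left join with a null check finds rows that have no partner, which is the kind of record that fails association.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ticket (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE job (id INTEGER PRIMARY KEY, t_id INTEGER, name TEXT);
    INSERT INTO ticket VALUES (1, 'T1'), (2, 'T2');
    INSERT INTO job VALUES (10, 1, 'J1'), (11, 3, 'J2');
    """
)

# Inner join: only pairs whose join fields are equal.
inner_rows = conn.execute(
    "SELECT ticket.id, job.id FROM ticket "
    "INNER JOIN job ON ticket.id = job.t_id ORDER BY ticket.id"
).fetchall()

# Left join: every ticket row, with NULL where no job matches.
left_rows = conn.execute(
    "SELECT ticket.id, job.id FROM ticket "
    "LEFT JOIN job ON ticket.id = job.t_id ORDER BY ticket.id"
).fetchall()

# Jobs whose t_id has no matching ticket (candidates for a recycle table).
unmatched_jobs = conn.execute(
    "SELECT job.id FROM job "
    "LEFT JOIN ticket ON ticket.id = job.t_id WHERE ticket.id IS NULL"
).fetchall()
```

With this sample data, ticket 2 has no job and job 11 points at a ticket that does not exist, so the inner join returns one pair, the left join returns two rows (one padded with null), and the unmatched query returns job 11.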
The construction unit 111 acquires the extracted data of the associated data table from the increment table, and constructs a slave table in the increment table according to the extracted data.
It should be noted that the extraction time of the data in the associated data table is not necessarily the same as the extraction time of the data in the data table to be extracted, and since each table in the data warehouse is extracted separately, the extraction times are often inconsistent, which may cause data loss.
The association unit 113 associates the data in the slave table with the data in the master table, and acquires data for which association fails.
Specifically, the associating unit 113 associating the data in the slave table with the data in the master table includes:
acquiring a data identifier of each piece of slave data in the slave table, and acquiring a data identifier of each piece of master data in the master table;
calling a pre-configured mapping table, wherein the mapping table stores the corresponding relation between the data identifier of each piece of slave data and the data identifier of each piece of master data;
when finding out that the data identifier of the first data in the slave table and the data identifier of the second data in the master table have a corresponding relation in the mapping table, determining that the first data is associated with the second data, and determining that the first data is successfully associated; or
And when the main data corresponding to the data identifier of the first data is not found in the mapping table, determining that the first data association fails.
For example: when a customer ID is associated with an account ID, if the mapping table stores the correspondence between the customer ID and the account ID, the data corresponding to the customer ID is associated with the data corresponding to the account ID, and the data corresponding to the customer ID is determined to be successfully associated; if no account ID corresponding to the customer ID can be found in the mapping table, this indicates that no data associated with the data corresponding to the customer ID exists in the main table, and the association of the data corresponding to the customer ID is determined to have failed.
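The mapping-table lookup can be sketched as follows. This is a minimal illustration, assuming the mapping table is available as a dictionary from slave data identifiers to master data identifiers; the function name `associate_with_mapping` and the sample identifiers are hypothetical.

```python
def associate_with_mapping(slave_ids, master_ids, mapping):
    """Return (associated, failed): associated maps each slave id to its master id,
    failed lists the slave ids for which no master data could be found."""
    master_ids = set(master_ids)
    associated, failed = {}, []
    for slave_id in slave_ids:
        master_id = mapping.get(slave_id)
        # Success requires a mapping entry AND the mapped record being present
        # in the master table that was actually extracted.
        if master_id is not None and master_id in master_ids:
            associated[slave_id] = master_id
        else:
            failed.append(slave_id)
    return associated, failed


mapping = {"A001": "C001", "A004": "C004"}
associated, failed = associate_with_mapping(
    slave_ids=["A001", "A004", "A007"],
    master_ids=["C001", "C002"],
    mapping=mapping,
)
```

In this sample, A004 fails because its master record C004 has not been extracted yet, and A007 fails because the mapping table has no entry for it; both would be written to the recycle table.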
The write unit 114 writes the data of which association failed to the recycle table.
It should be noted that before writing the data with the association failure into the recycle table, the recycle table needs to be created first.
Specifically, before the data that fails to be associated is written into the recovery table, the following steps are performed:
identifying the table structure of the increment table;
creating an isomorphic table of the increment table according to a table structure of the increment table;
and determining the created isomorphic table as the recycle table.
Through this implementation, an isomorphic table of the increment table is created as the recycle table. Because the structures of the two tables are completely consistent, data that fails to be associated can be written into the recycle table in full, which avoids further data loss, gives subsequent data recovery a more complete data basis, and reduces the error rate.
That is, the recycle table is a dynamically updated data table that ensures all data lost in association is recovered; the data is repaired by attempting the association again in the next period.
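One possible way to create such an isomorphic table is sketched below with Python's built-in sqlite3 module. The table names `increment_account` and `recycle_account` and their columns are assumptions. In SQLite, `CREATE TABLE ... AS SELECT ... WHERE 0` copies the column names without copying any rows; it does not copy constraints or indexes, so a production system would more likely use its own database's structure-cloning facility.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE increment_account "
    "(account_id TEXT, customer_id TEXT, updated_at TEXT)"
)

# Create the recycle table with the same columns as the increment table, no rows.
conn.execute("CREATE TABLE recycle_account AS SELECT * FROM increment_account WHERE 0")


def column_names(table):
    """Read column names via PRAGMA table_info (the name is the second field)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# A record that failed association can be written without dropping any column.
conn.execute(
    "INSERT INTO recycle_account VALUES (?, ?, ?)",
    ("A004", "C004", "2021-01-01T02:00:00"),
)
recycled_count = conn.execute("SELECT COUNT(*) FROM recycle_account").fetchone()[0]
```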
In response to a second data extraction instruction for the data table to be extracted, the updating unit 115 extracts a data increment from the data table to be extracted according to the second data extraction instruction, synchronizes to the increment table, and updates the extracted data to the main table.
Wherein the second data extraction instruction may also be configured to be triggered periodically, for example: the second data extraction instruction may be triggered the next day after the first data extraction instruction is triggered.
In this embodiment, the step of extracting, by the updating unit 115, the data increment from the data table to be extracted according to the second data extraction instruction and synchronizing to the increment table includes:
analyzing the second data extraction instruction to obtain a second timestamp range of data extraction;
acquiring data meeting the second timestamp range from the data table to be extracted as second alternative data;
detecting changed data in the second alternative data;
synchronizing the changed data to the delta table.
The updating unit 115 acquires the current slave table from the increment table, and calculates a union of the current slave table and the recycle table as an updated slave table.
It will be appreciated that the current slave table is also an updated slave table.
Similarly, the current slave table is also incrementally synchronized according to the timestamp range, which is not described herein.
In the above embodiment, the union of the current slave table and the recycle table is used as the updated slave table, so that the association is performed again in the current cycle, and data loss is effectively avoided.
The associating unit 113 associates the updated slave table with the updated master table, and removes the successfully associated data from the recycle table.
For example: during the early-morning extraction, C004 of the client table does not meet the extraction condition and is therefore not extracted to the data warehouse side. A004 of the account table needs to be associated with client C004 of the main table for the association calculation. Ordinarily, an account-table record such as A004 that cannot be associated with any record of the main table would be discarded; in this embodiment, A004 is instead written into the recovery table for subsequent use. In the next early-morning extraction, C004 of the client table is extracted and enters the data warehouse. The account table merges the data set extracted on the next day with the recovery-table data of the previous day to form a new increment table, so that the previously unassociated data is supplied from the recovery table and can now be associated.
It should be noted that, in this embodiment, data that fails to be associated is continuously written into the recycle table; in the next increment period the recycle table is merged into the increment table and association is attempted again. If the association succeeds, the data flows into the next link; if it fails, the data enters the recycle table again. This process repeats until the association succeeds and the data flows into the next link.
Meanwhile, the data successfully associated is removed from the recycle table to avoid data redundancy in the recycle table.
This embodiment can solve the problem of associated data loss caused by unsynchronized data between associated tables, whatever the cause, reduce the cost of manual data problem analysis and of data supplementation and correction, and enhance the data integrity of the data warehouse.
However, there may be error data that cannot be associated in the recycle table, so it is also necessary to periodically activate an error detection mechanism to remove the error data in time.
Specifically, detecting how long each piece of data has been held in the recovery table;
when the time for which a piece of data has been held in the recovery table is detected to be greater than or equal to a preset duration, determining the detected data as data to be verified;
performing reconciliation against the source system according to the data to be verified;
acquiring data that meets the reconciliation standard from the data to be verified, and retaining the data that meets the reconciliation standard in the recovery table;
and acquiring data that does not meet the reconciliation standard from the data to be verified, and removing the data that does not meet the reconciliation standard from the recycle table.
Through the implementation mode, the data in the recovery table can be regularly updated, and the running burden of a system caused by redundant data is avoided.
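The error detection mechanism can be sketched as follows. The source does not define the reconciliation standard, so this illustration assumes a record passes reconciliation if it still exists in the source system; the `recycled_at` field, the `prune_recycle` function, and the seven-day threshold are likewise hypothetical.

```python
from datetime import datetime, timedelta


def prune_recycle(recycle_rows, now, max_age, passes_reconciliation):
    """Keep rows that are still young, or that are old but pass reconciliation."""
    kept = []
    for row in recycle_rows:
        held_for = now - row["recycled_at"]
        # Only rows held at least max_age are verified against the source system.
        if held_for >= max_age and not passes_reconciliation(row):
            continue  # treated as error data and removed from the recycle table
        kept.append(row)
    return kept


# Assumed reconciliation standard: the account still exists in the source system.
source_accounts = {"A004", "A010"}
recycle = [
    {"account_id": "A004", "recycled_at": datetime(2021, 1, 1)},  # old, valid
    {"account_id": "A999", "recycled_at": datetime(2021, 1, 2)},  # old, invalid
    {"account_id": "A010", "recycled_at": datetime(2021, 1, 9)},  # recent
]

kept = prune_recycle(
    recycle,
    now=datetime(2021, 1, 10),
    max_age=timedelta(days=7),
    passes_reconciliation=lambda row: row["account_id"] in source_accounts,
)
```

Only A999 is removed: it has been held longer than the threshold and fails the assumed reconciliation check, while A004 is retained for further association attempts.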
It should be noted that, in order to further ensure the security of the data, the master table, the slave table, and the recycle table may also be deployed in the blockchain, so as to prevent the data from being maliciously tampered.
According to the above technical solution, in response to a first data extraction instruction, a data table to be extracted is obtained from a source system according to the first data extraction instruction; a data increment is extracted from the data table to be extracted and synchronized to an increment table, and a main table is constructed in the increment table according to the extracted data. An associated data table associated with the data table to be extracted is determined, the extracted data of the associated data table is obtained from the increment table, and a slave table is constructed in the increment table according to that data. The data in the slave table is associated with the data in the main table, and the data that fails to be associated is obtained and written into a recovery table, which ensures that all data lost in association can be recovered. In response to a second data extraction instruction for the data table to be extracted, a data increment is extracted from the data table to be extracted according to the second data extraction instruction and synchronized to the increment table, and the extracted data is updated to the main table. A current slave table is obtained from the increment table, the union of the current slave table and the recovery table is calculated as an updated slave table, the updated slave table is associated with the updated main table, and the successfully associated data is removed from the recovery table. This solves the problem of associated data loss caused by unsynchronized data between associated tables, reduces the cost of manual data problem analysis and of data supplementation and correction, and enhances the data integrity of the data warehouse.
Fig. 3 is a schematic structural diagram of an electronic device implementing a method for recovering missing data based on table association according to a preferred embodiment of the present invention.
The electronic device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program, such as a lost data recovery program based on table association, stored in the memory 12 and executable on the processor 13.
It will be understood by those skilled in the art that the schematic diagram is merely an example of the electronic device 1 and does not constitute a limitation on it. The electronic device 1 may have a bus-type structure or a star-type structure, may include more or fewer hardware or software components than shown, or may have a different arrangement of components; for example, the electronic device 1 may further include an input/output device, a network access device, and the like.
It should be noted that the electronic device 1 is only an example; other existing or future electronic products, insofar as they are applicable to the present invention, should also fall within the scope of protection of the present invention and are incorporated herein by reference.
The memory 12 includes at least one type of readable storage medium, which includes flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 12 may in some embodiments be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. The memory 12 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the electronic device 1. Further, the memory 12 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 12 may be used not only to store application software installed in the electronic device 1 and various types of data, such as codes of a missing data recovery program based on table association, but also to temporarily store data that has been output or is to be output.
The processor 13 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 13 is a Control Unit (Control Unit) of the electronic device 1, connects various components of the electronic device 1 by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules stored in the memory 12 (for example, executing a lost data recovery program based on table association, and the like) and calling data stored in the memory 12.
The processor 13 executes an operating system of the electronic device 1 and various installed application programs. The processor 13 executes the application program to implement the steps in each of the above embodiments of the table association based lost data recovery method, such as the steps shown in fig. 1.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program in the electronic device 1. For example, the computer program may be divided into an acquisition unit 110, a construction unit 111, a determination unit 112, an association unit 113, a writing unit 114, an updating unit 115.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a computer device, or a network device) or a processor (processor) to execute parts of the lost data recovery method based on table association according to various embodiments of the present invention.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented.
Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a random-access memory, or the like.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
Blockchain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each data block contains information on a batch of network transactions that is used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one arrow is shown in FIG. 3, but this does not indicate only one bus or one type of bus. The bus is arranged to enable connection communication between the memory 12 and at least one processor 13 or the like.
Although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 13 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface, a Bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
Fig. 3 only shows the electronic device 1 with components 12-13, and it will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
With reference to fig. 1, the memory 12 in the electronic device 1 stores a plurality of instructions to implement a lost data recovery method based on table association, and the processor 13 can execute the plurality of instructions to implement:
responding to a first data extraction instruction, and acquiring a data table to be extracted from a source system according to the first data extraction instruction;
extracting data increment from the data table to be extracted, synchronizing the data increment to an increment table, and constructing a main table in the increment table according to the extracted data;
determining an associated data table associated with the data table to be extracted;
acquiring extracted data of the associated data table from the increment table, and constructing a slave table in the increment table according to the extracted data;
associating the data in the slave table with the data in the master table, and acquiring data with failed association;
writing the data with the association failure into a recovery table;
responding to a second data extraction instruction of the data table to be extracted, extracting data increment from the data table to be extracted according to the second data extraction instruction, synchronizing the data increment to the increment table, and updating the extracted data to the main table;
acquiring a current slave table from the increment table, and calculating a union of the current slave table and the recovery table as an updated slave table;
and associating the updated slave table with the updated master table, and removing the successfully associated data from the recovery table.
Specifically, for the implementation of the above instructions by the processor 13, reference may be made to the description of the relevant steps in the embodiment corresponding to Fig. 1, which is not repeated here.
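For illustration only, the instruction sequence above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: tables are modeled as in-memory lists of dictionaries, the column names `order_id` and `item_id` and the helper names `associate`, `union_tables` and `run_demo` are hypothetical, and association is assumed to be a simple key match between slave rows and master rows.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def associate(slave: List[Row], master: List[Row], key: str) -> Tuple[List[Row], List[Row]]:
    """Split slave rows into (associated, failed) by looking up `key` in the master rows."""
    master_keys = {row[key] for row in master}
    matched = [row for row in slave if row[key] in master_keys]
    failed = [row for row in slave if row[key] not in master_keys]
    return matched, failed


def union_tables(current_slave: List[Row], recovery: List[Row], row_id: str) -> List[Row]:
    """Union of the current slave table and the recovery table, de-duplicated on `row_id`."""
    merged: Dict[object, Row] = {}
    for row in recovery + current_slave:
        merged[row[row_id]] = row
    return list(merged.values())


def run_demo() -> Tuple[List[Row], List[Row]]:
    """Return the recovery table after the first and after the second extraction."""
    # First extraction: the master row for order 2 has not arrived yet.
    master = [{"order_id": 1}]
    slave = [{"item_id": "a", "order_id": 1}, {"item_id": "b", "order_id": 2}]
    _, failed = associate(slave, master, "order_id")
    recovery = list(failed)  # rows whose association failed go to the recovery table
    recovery_after_first = list(recovery)

    # Second extraction: the master table is updated with the late-arriving row.
    master = master + [{"order_id": 2}]
    current_slave = [{"item_id": "c", "order_id": 2}]
    updated_slave = union_tables(current_slave, recovery, "item_id")
    matched, _ = associate(updated_slave, master, "order_id")

    # Successfully associated rows are removed from the recovery table.
    matched_ids = {row["item_id"] for row in matched}
    recovery = [row for row in recovery if row["item_id"] not in matched_ids]
    return recovery_after_first, recovery
```

In this sketch, row `b` fails association in the first pass because its master row has not yet been extracted. It is carried in the recovery table, re-associated after the second extraction, and then removed from the recovery table.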
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the modules is only one logical functional division, and other divisions may be used in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the present invention may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A method for recovering lost data based on table association, characterized in that the method comprises the following steps:
responding to a first data extraction instruction, and acquiring a data table to be extracted from a source system according to the first data extraction instruction;
extracting a data increment from the data table to be extracted, synchronizing the data increment to an increment table, and constructing a master table in the increment table according to the extracted data;
determining an associated data table associated with the data table to be extracted;
acquiring extracted data of the associated data table from the increment table, and constructing a slave table in the increment table according to the extracted data;
associating the data in the slave table with the data in the master table, and acquiring data with failed association;
writing the data with the association failure into a recovery table;
responding to a second data extraction instruction for the data table to be extracted, extracting a data increment from the data table to be extracted according to the second data extraction instruction, synchronizing the data increment to the increment table, and updating the master table with the extracted data;
acquiring a current slave table from the increment table, and calculating a union of the current slave table and the recovery table as an updated slave table;
and associating the updated slave table with the updated master table, and removing the successfully associated data from the recovery table.
2. The method for recovering lost data based on table association as claimed in claim 1, wherein said obtaining the data table to be extracted from the source system according to the first data extraction instruction comprises:
analyzing a method body of the first data extraction instruction to obtain information carried by the first data extraction instruction;
acquiring a preset label;
searching the data with the preset label in the information carried by the first data extraction instruction, and determining the searched data as a target table name;
and acquiring the data table with the target table name from the source system as the data table to be extracted.
3. The method for recovering lost data based on table association as claimed in claim 1, wherein said extracting data increment from said data table to be extracted to synchronize to an increment table comprises:
analyzing the first data extraction instruction to obtain a first timestamp range of data extraction;
acquiring data meeting the first timestamp range from the data table to be extracted as alternative data;
detecting changed data in the alternative data;
synchronizing the changed data to the delta table.
4. The method for recovering lost data based on table association as claimed in claim 1, wherein said determining the associated data table associated with said data table to be extracted comprises:
detecting a data table that has a join operation with the data table to be extracted;
and determining the detected data table as the associated data table.
5. The method of claim 1, wherein the associating data in the slave table with data in the master table comprises:
acquiring a data identifier of each piece of slave data in the slave table, and acquiring a data identifier of each piece of master data in the master table;
calling a pre-configured mapping table, wherein the mapping table stores the corresponding relation between the data identifier of each piece of slave data and the data identifier of each piece of master data;
when it is found that the data identifier of first data in the slave table and the data identifier of second data in the master table have a corresponding relation in the mapping table, determining that the first data is associated with the second data, and determining that the first data is successfully associated; or
when no master data corresponding to the data identifier of the first data is found in the mapping table, determining that the association of the first data fails.
6. The method of claim 1, wherein prior to writing the data with the association failure into the recovery table, the method further comprises:
identifying a table structure of the delta table;
creating an isomorphic table of the increment table according to a table structure of the increment table;
and determining the created isomorphic table as the recovery table.
7. The method for recovering lost data based on table association according to claim 1, wherein said method further comprises:
detecting how long each piece of data has been held in the recovery table;
when the time for which a piece of data has been held is detected to be longer than or equal to a preset time length, determining the detected data as data to be verified;
performing reconciliation against the source system according to the data to be verified;
acquiring data meeting the reconciliation standard from the data to be verified, and retaining the data meeting the reconciliation standard in the recovery table;
and acquiring data which does not meet the reconciliation standard from the data to be verified, and removing the data which does not meet the reconciliation standard from the recovery table.
8. A device for recovering lost data based on table association, comprising:
an acquisition unit, configured to respond to a first data extraction instruction and acquire a data table to be extracted from a source system according to the first data extraction instruction;
a construction unit, configured to extract a data increment from the data table to be extracted, synchronize the data increment to an increment table, and construct a master table in the increment table according to the extracted data;
a determining unit, configured to determine an associated data table associated with the data table to be extracted;
the construction unit is further configured to obtain extracted data of the associated data table from the increment table, and construct a slave table in the increment table according to the extracted data;
an association unit, configured to associate the data in the slave table with the data in the master table and acquire data with failed association;
a writing unit, configured to write the data with the association failure into a recovery table;
an updating unit, configured to respond to a second data extraction instruction for the data table to be extracted, extract a data increment from the data table to be extracted according to the second data extraction instruction, synchronize the data increment to the increment table, and update the master table with the extracted data;
the updating unit is further configured to acquire a current slave table from the increment table and calculate a union of the current slave table and the recovery table as an updated slave table;
the association unit is further configured to associate the updated slave table with the updated master table, and remove the successfully associated data from the recovery table.
9. An electronic device, characterized in that the electronic device comprises:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement the method of lost data recovery based on table associations according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction, which is executed by a processor in an electronic device to implement the method for recovering lost data based on table association according to any one of claims 1 to 7.
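As a non-limiting illustration of the mapping-table association recited in claim 5 and the time-based verification of the recovery table recited in claim 7, a minimal Python sketch is given below. Everything specific in it is an assumption made for illustration: the function names `associate_by_mapping` and `prune_recovery`, the `recovered_at` field, and the use of a caller-supplied predicate `passes_reconciliation` to stand in for the reconciliation standard, which the claims do not define. The sketch also assumes that a slave row whose mapped master identifier is absent from the master table counts as a failed association.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def associate_by_mapping(
    slave_ids: List[str],
    master_ids: List[str],
    mapping: Dict[str, str],
) -> Tuple[Dict[str, str], List[str]]:
    """Associate slave identifiers with master identifiers via a mapping table.

    Returns (associated, failed): `associated` maps slave id -> master id,
    `failed` lists the slave ids for which no master data was found.
    """
    master_set = set(master_ids)
    associated: Dict[str, str] = {}
    failed: List[str] = []
    for slave_id in slave_ids:
        master_id = mapping.get(slave_id)
        if master_id is not None and master_id in master_set:
            associated[slave_id] = master_id
        else:
            failed.append(slave_id)
    return associated, failed


def prune_recovery(
    recovery: List[Row],
    now: datetime,
    max_age: timedelta,
    passes_reconciliation: Callable[[Row], bool],
) -> List[Row]:
    """Keep rows that are not yet due for verification or that pass reconciliation."""
    kept: List[Row] = []
    for row in recovery:
        held_for = now - row["recovered_at"]
        if held_for < max_age:
            kept.append(row)  # held for less than the preset time length
        elif passes_reconciliation(row):
            kept.append(row)  # verified against the source system, retained
        # otherwise the row fails reconciliation and is dropped
    return kept
```

For example, with a preset time length of 24 hours, a row held for 30 hours is retained only if the predicate accepts it, while a row held for 1 hour is retained without being checked.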
CN202110005207.8A 2021-01-05 2021-01-05 Lost data recovery method, device, equipment and medium based on table association Active CN112328677B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110005207.8A CN112328677B (en) 2021-01-05 2021-01-05 Lost data recovery method, device, equipment and medium based on table association
PCT/CN2021/083104 WO2022147908A1 (en) 2021-01-05 2021-03-25 Table association-based lost data recovery method and apparatus, device, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110005207.8A CN112328677B (en) 2021-01-05 2021-01-05 Lost data recovery method, device, equipment and medium based on table association

Publications (2)

Publication Number Publication Date
CN112328677A true CN112328677A (en) 2021-02-05
CN112328677B CN112328677B (en) 2021-04-02

Family

ID=74302154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110005207.8A Active CN112328677B (en) 2021-01-05 2021-01-05 Lost data recovery method, device, equipment and medium based on table association

Country Status (2)

Country Link
CN (1) CN112328677B (en)
WO (1) WO2022147908A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420057A (en) * 2021-06-29 2021-09-21 未鲲(上海)科技服务有限公司 Account checking data processing method and related device
WO2022147908A1 (en) * 2021-01-05 2022-07-14 平安科技(深圳)有限公司 Table association-based lost data recovery method and apparatus, device, and medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120151262A1 (en) * 2010-12-13 2012-06-14 Hitachi, Ltd. Storage apparatus and method of detecting power failure in storage apparatus
US8874505B2 (en) * 2011-01-11 2014-10-28 Hitachi, Ltd. Data replication and failure recovery method for distributed key-value store
JP2017114241A (en) * 2015-12-22 2017-06-29 日立オートモティブシステムズ株式会社 Vehicle failure diagnostic device
CN107169003A (en) * 2017-03-31 2017-09-15 北京奇艺世纪科技有限公司 A kind of data correlation method and device
CN110908995A (en) * 2018-09-17 2020-03-24 阿里巴巴集团控股有限公司 Data processing method, device and equipment
CN112015790A (en) * 2019-05-30 2020-12-01 北京沃东天骏信息技术有限公司 Data processing method and device
CN112035463A (en) * 2020-07-22 2020-12-04 武汉达梦数据库有限公司 Bidirectional synchronization method and synchronization device of heterogeneous database based on log analysis

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102799634B (en) * 2012-06-26 2014-11-12 中国农业银行股份有限公司 Data storage method and device
CN105320680A (en) * 2014-07-15 2016-02-10 中国移动通信集团公司 Data synchronization method and device
CN106933823B (en) * 2015-12-29 2020-11-27 北京国双科技有限公司 Data synchronization method and device
CN106407360B (en) * 2016-09-07 2020-07-24 广州视源电子科技股份有限公司 Data processing method and device
US10901977B2 (en) * 2018-05-14 2021-01-26 Sap Se Database independent detection of data changes
CN109408565B (en) * 2018-10-19 2021-09-28 浪潮软件科技有限公司 Data synchronous interaction method, system and data interaction platform
CN110347672A (en) * 2019-05-27 2019-10-18 深圳壹账通智能科技有限公司 Verification method and device, the electronic equipment and storage medium of tables of data related update
CN112328677B (en) * 2021-01-05 2021-04-02 平安科技(深圳)有限公司 Lost data recovery method, device, equipment and medium based on table association

Also Published As

Publication number Publication date
WO2022147908A1 (en) 2022-07-14
CN112328677B (en) 2021-04-02

Similar Documents

Publication Publication Date Title
CN112328677B (en) Lost data recovery method, device, equipment and medium based on table association
CN115118738B (en) Disaster recovery method, device, equipment and medium based on RDMA
CN111538573A (en) Asynchronous task processing method and device and computer readable storage medium
CN112559535A (en) Multithreading-based asynchronous task processing method, device, equipment and medium
CN112653760A (en) Cross-server file transmission method and device, electronic equipment and storage medium
CN113806434A (en) Big data processing method, device, equipment and medium
CN115543198A (en) Method and device for lake entering of unstructured data, electronic equipment and storage medium
CN111651426A (en) Data migration method and device and computer readable storage medium
CN114816371B (en) Message processing method, device, equipment and medium
WO2022134820A1 (en) Webpage data extraction method and apparatus, electronic device, and storage medium
CN114626103A (en) Data consistency comparison method, device, equipment and medium
CN115048111A (en) Code generation method, device, equipment and medium based on metadata
CN114547011A (en) Data extraction method and device, electronic equipment and storage medium
CN114185776A (en) Big data point burying method, device, equipment and medium for application program
CN113254446A (en) Data fusion method and device, electronic equipment and medium
CN112685384A (en) Data migration method and device, electronic equipment and storage medium
CN114860349B (en) Data loading method, device, equipment and medium
CN115065642B (en) Code table request method, device, equipment and medium under bandwidth limitation
CN113434359B (en) Data traceability system construction method and device, electronic device and readable storage medium
CN114139199A (en) Data desensitization method, apparatus, device and medium
CN113657076B (en) Page operation record table generation method and device, electronic equipment and storage medium
CN115543214B (en) Data storage method, device, equipment and medium in low-delay scene
CN114706870A (en) Database and cache consistency synchronization method, device, equipment and storage medium
CN113885874A (en) Java class file conflict management method and device, electronic equipment and medium
CN116881284A (en) Data retrieval method, device and equipment for structured query statement and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant