CN112433997B - Data restoration method and device - Google Patents

Info

Publication number
CN112433997B
Authority
CN
China
Prior art keywords
data
link
repaired
node
repair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011308360.XA
Other languages
Chinese (zh)
Other versions
CN112433997A (en
Inventor
王艺然
吴非
汝玉峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Bilibili Technology Co Ltd
Original Assignee
Shanghai Bilibili Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Bilibili Technology Co Ltd
Priority to CN202011308360.XA
Publication of CN112433997A
Application granted
Publication of CN112433997B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/70Game security or game management aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiment of the application provides a data restoration method and a device, wherein the data restoration method comprises the following steps: receiving a data repairing instruction, wherein the data repairing instruction comprises data information to be repaired of a first data link, determining a data repairing node for repairing the first data link according to the data information to be repaired and a data processing period of a second data link with a synchronous processing task with the first data link, acquiring snapshot data corresponding to the data repairing node in the second data link, and repairing the data to be repaired in the first data link according to the snapshot data and a data repairing strategy of the first data link.

Description

Data restoration method and device
Technical Field
The embodiment of the application relates to the technical field of data processing, in particular to a data repairing method. One or more embodiments of the present application relate to a data repair apparatus, a computing device, and a computer-readable storage medium.
Background
With the development of internet technology, users generate large amounts of data while using the internet. To store and process this data conveniently, and to prevent user data from being lost or damaged when a physical server fails, user data needs redundant backup at the time it is stored. When user data is lost or damaged, it can then be repaired from the corresponding backup data, ensuring the security of the user's data.
For data repair, common industry practice at present is to repair data within a certain time range by rolling back the consumption offset of the log message middleware, but the time period that this method can cover is limited. In real-time computation, data anomalies arise from network faults, system vulnerabilities and the like, and whether the abnormal data can be repaired in a timely and effective manner often has a serious impact on the service processing strategy of the target service.
Disclosure of Invention
In view of the foregoing, embodiments of the present application provide a data repair method. One or more embodiments of the present application relate to a data repair device, a computing device, and a computer readable storage medium, so as to solve the technical defects of limited coverage time period capable of repair and low accuracy of repair results in the data repair method in the prior art.
According to a first aspect of embodiments of the present application, there is provided a data repair method, including:
receiving a data repair instruction, wherein the data repair instruction comprises data information to be repaired of a first data link;
determining a data repairing node for repairing the first data link according to the data information to be repaired and a data processing period of a second data link with a synchronous processing task with the first data link;
obtaining snapshot data corresponding to the data repairing node in the second data link, and repairing the data to be repaired in the first data link according to the snapshot data and the data repairing strategy of the first data link.
According to a second aspect of embodiments of the present application, there is provided a data repair apparatus, including:
the receiving module is configured to receive a data repair instruction, wherein the data repair instruction comprises data information to be repaired of a first data link;
a determining module configured to determine a data repair node for repairing the first data link according to the data information to be repaired and a data processing period of a second data link having a synchronous processing task with the first data link;
the repair module is configured to acquire snapshot data corresponding to the data repair node in the second data link, and repair the data to be repaired in the first data link according to the snapshot data and the data repair strategy of the first data link.
According to a third aspect of embodiments of the present application, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions, wherein the processor performs the steps of the data repair method when executing the computer-executable instructions.
According to a fourth aspect of embodiments of the present application, there is provided a computer readable storage medium storing computer executable instructions which, when executed by a processor, perform the steps of the data repair method.
An embodiment of the application realizes a data repairing method and a device, wherein the data repairing method comprises the steps of receiving a data repairing instruction, determining a data repairing node for repairing a first data link according to data information to be repaired contained in the data repairing instruction and a data processing period of a second data link with a synchronous processing task with the first data link, acquiring snapshot data corresponding to the data repairing node in the second data link, and repairing the data to be repaired in the first data link according to the snapshot data and a data repairing strategy of the first data link;
by creating the real-time stream link and the batch stream link with synchronous processing tasks, automatic data restoration is performed through the batch stream link and the real-time stream link, and accuracy of effect data of the game advertisement monitoring platform and timeliness of abnormal data restoration are improved.
Drawings
FIG. 1 is a flow chart of a method for data repair provided in one embodiment of the present application;
FIG. 2 is a schematic diagram of a repair system provided in one embodiment of the present application;
FIG. 3 is a schematic diagram of a data repair process provided in one embodiment of the present application;
FIG. 4 is a flowchart of a process of applying the data repair method to the field of game advertisement monitoring according to one embodiment of the present application;
FIG. 5 is a flowchart of a process for applying another method for repairing data to the field of game advertisement monitoring according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a data repair device according to an embodiment of the present application;
FIG. 7 is a block diagram of a computing device provided in one embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. This application is, however, susceptible of embodiment in many other ways than those herein described and similar generalizations can be made by those skilled in the art without departing from the spirit of the application and the application is therefore not limited to the specific embodiments disclosed below.
The terminology used in one or more embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of one or more embodiments of the application. As used in this application in one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present application refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of the present application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the present application. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon" or "in response to determining".
First, terms related to one or more embodiments of the present application will be explained.
Advertisement monitoring platform: a data monitoring platform that monitors and measures the effect of advertisement delivery, provides data support for marketing staff, and helps optimize delivery.
Data attribution: a set of rules or algorithms for determining which channels are credited with a conversion when a user arrives through multiple channels.
Real-time computation: processing of data in motion, that is, computing on data directly as it is generated or received.
Offline computation: computation in which all input data is known before the computation starts and the input data does not change.
Lambda architecture: a big data computing system architecture that supports both batch processing and real-time processing.
In the present application, a data repair method is provided. One or more embodiments of the present application relate to a data repair apparatus, a computing device, and a computer-readable storage medium, which are described in detail in the following embodiments.
The data restoration method provided by the embodiment of the application can be applied to any field needing to restore data, such as restoration of game data in the game field, restoration of advertisement click quantity in the advertisement field, restoration of voice dialogue data in the communication field, restoration of voice messages in the self-media field and the like; for easy understanding, the embodiment of the application will be described in detail by taking the application of the data repair method to repair advertisement data in the game field as an example, but is not limited to this.
Then, in the case that the data repair method is applied to the repair of game advertisement data in the game field, for example, the data repair instruction received in the data repair method can be understood as a game advertisement data repair instruction.
Referring to fig. 1, fig. 1 shows a flowchart of a data repair method according to an embodiment of the present application, including the following steps:
Step 102, receiving a data repair instruction, where the data repair instruction includes data information to be repaired of the first data link.
The data repair method is applied to a data repair system, a schematic diagram of which is shown in fig. 2. The repair system consists of five parts: a data warehouse, a log message middleware, an attribution calculation component, a data analysis system, and an external interface of the repair system (the repair interface). The data warehouse stores backup data; the repair interface is the entry and trigger point of the whole repair system; the log message middleware, the attribution calculation component and the data analysis system are components of the data monitoring platform, which, as the final product form, provides data report services.
In addition, the repair system is implemented with a big data processing model based on the lambda architecture. Through the layered design of a Batch Layer and a Speed Layer, the lambda architecture supports real-time services and batch processing services in one system at the same time, and the Serving Layer logically unifies the interfaces of the two data sources, so an application can be developed and deployed against a unified data view, achieving fusion of data and application.
Therefore, in the actual data monitoring process, two links, namely a real-time stream link and a batch processing link, can be created at the same time. The two links process the same data but with different processing periods. The real-time stream link can take the data arriving within 5 seconds as a data window and perform real-time data aggregation (the program performs calculations such as accumulation and averaging in memory on each incoming data item to generate advertisement data reports, such as click volume and conversion rate, with day/hour granularity in the time dimension and channel/sub-channel granularity in the business dimension) and attribution persistence (the aggregation results are written from memory into a database). The batch processing link can take the form of snapshot plus increment and run periodically by the hour: data snapshot s2 of the batch processing link is generated from snapshot s1 plus the incremental data of the period.
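The snapshot-plus-increment relationship between s1 and s2 can be illustrated with a minimal sketch. The event fields (`channel`, `topic`) and the function names are assumptions made for illustration; the application does not specify a data model.

```python
from collections import Counter

def aggregate_window(events):
    """Aggregate one data window into per-(channel, topic) counts (hypothetical model)."""
    counts = Counter()
    for event in events:
        counts[(event["channel"], event["topic"])] += 1
    return counts

def next_snapshot(s1, delta):
    """Batch link: snapshot s2 = snapshot s1 + the incremental data of the period."""
    s2 = Counter(s1)   # copy, so s1 stays available as a backup
    s2.update(delta)   # Counter.update adds counts rather than replacing them
    return s2

window = [
    {"channel": "ch_a", "topic": "click"},
    {"channel": "ch_a", "topic": "click"},
    {"channel": "ch_b", "topic": "activation"},
]
delta = aggregate_window(window)
s1 = Counter({("ch_a", "click"): 10})
s2 = next_snapshot(s1, delta)
```

Keeping s1 unchanged is a design choice of this sketch: each snapshot remains a usable recovery point for the other link.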
In addition, the data generated by the real-time stream link and the batch processing link are mutually backed up, and when any one link is abnormal, the other link can be used as a backup link to restore the data of the abnormal link.
Specifically, the first data link is a link with data exception, so the first data link may be a real-time streaming link or a batch processing link; under the condition that the real-time stream link is abnormal, the first data link is a real-time stream link, under the condition that the batch processing stream link is abnormal, the first data link is a batch processing stream link, and in practical application, the real-time stream link is a link for processing real-time data, and the batch processing stream link is a link for batch processing data.
In the embodiment of the application, the data monitoring platform monitors data abnormality, and under the condition of abnormality, takes a link with abnormality as a first data link and generates a data repair instruction of the first data link, which can be realized in the following manner:
performing anomaly monitoring on a business data report of a target business;
under the condition that the data in the business data report is abnormal, determining a starting time node and an ending time node of abnormal data, and determining a business theme corresponding to the abnormal data;
and generating the data repair instruction based on the starting time node, the ending time node and/or the service theme.
Specifically, the business data report includes business topics; taking the data report of the game advertisement monitoring platform as an example, the business topics include, but are not limited to, click, reservation, activation, registration, login and payment. Whether a data link is abnormal can be determined by performing anomaly monitoring on the business data report of the target business.
Under the condition that the data link is abnormal according to the data report, a starting time node and an ending time node of abnormal data can be determined, a service theme corresponding to the abnormal data is determined, a data repairing instruction is generated based on the starting time node, the ending time node and/or the service theme, and the generated data repairing instruction is sent to an external interface of a repairing system.
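As a rough sketch of how such an instruction might be assembled, the following assumes that anomaly monitoring yields a list of (timestamp, topic) pairs for the abnormal report rows. `RepairInstruction` and `build_repair_instruction` are hypothetical names that do not appear in the application.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class RepairInstruction:
    link: str                    # the abnormal (first) data link
    start: datetime              # starting time node of the abnormal data
    end: datetime                # ending time node of the abnormal data
    topic: Optional[str] = None  # business topic, if a single one is affected

def build_repair_instruction(link, anomalies):
    """anomalies: list of (timestamp, topic) pairs flagged as abnormal."""
    if not anomalies:
        return None  # nothing abnormal, so no instruction is generated
    times = [ts for ts, _ in anomalies]
    topics = {topic for _, topic in anomalies}
    # Assumption: record a topic only when exactly one topic is affected.
    topic = next(iter(topics)) if len(topics) == 1 else None
    return RepairInstruction(link, min(times), max(times), topic)

instruction = build_repair_instruction(
    "R1",
    [(datetime(2020, 8, 14, 22, 10), "click"),
     (datetime(2020, 8, 14, 22, 20), "click")],
)
```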
Step 104, determining a data repairing node for repairing the first data link according to the data information to be repaired and the data processing period of the second data link with the synchronous processing task with the first data link.
Specifically, as described above, in the actual data detection process, two links may be created simultaneously, that is, a real-time streaming link and a batch processing link, where the data processed by the two links are the same and the processing periods are different, so that the real-time streaming link and the batch processing link are said to have synchronous processing tasks, that is, the first data link and the second data link have synchronous processing tasks.
Because the data processing period of the first data link is different from the data processing period of the second data link, if the data in the first data link is to be repaired by using the data of the second data link as backup data under the condition that the first data link is abnormal, the data repairing node for repairing the first data link is determined according to the data information to be repaired contained in the data repairing instruction and the data processing period of the second data link.
In specific implementation, according to the data information to be repaired and the data processing period of the second data link having the synchronous processing task with the first data link, determining a data repairing node for repairing the first data link, namely, according to the ending time node of the data to be repaired contained in the data information to be repaired and the data processing period of the second data link, determining the data repairing node for repairing the first data link.
Further, the determining, according to the end time node of the data to be repaired included in the data information to be repaired and the data processing period of the second data link, a data repairing node for repairing the first data link may be specifically implemented by the following manner:
determining an operating state of the second data link at the end time node;
if the running state is running, determining the running ending time corresponding to the running period of the ending time node, and determining any time node which is greater than or equal to the running ending time as the data restoration node.
Specifically, if the running state of the second data link at the end time node is running, the running period to which the end time node belongs has not yet ended. The data repair node can therefore be determined by determining the running end time corresponding to the running period to which the end time node belongs, and taking any time node greater than or equal to that running end time as the data repair node.
In practical application, the data monitoring platform generates a data repairing instruction, and after the repairing system receives the data repairing instruction, the data repairing node for repairing the first data link can be determined according to the data information to be repaired and the data processing period of the second data link, namely, the data repairing node for repairing the first data link is determined according to the end time node of the data to be repaired in the data repairing instruction and the data processing period of the second data link.
A schematic diagram of the data repair process provided in this embodiment is shown in fig. 3. The first real-time stream link created in the data monitoring process is real-time stream link R1, and the batch stream link created is batch stream link O (also referred to as offline stream link O). The data in real-time stream link R1 is abnormal, and the data in batch stream link O is used as backup data to repair it. Real-time stream link R1 outputs an aggregation result every 5 seconds (the data window is 5 s); offline stream link O runs on the hour, outputs a calculation result (the data processing period is 1 hour) and generates an H+1 data snapshot. Under normal conditions the two links operate independently. At time t1, the data of real-time stream link R1 starts to become abnormal due to a network error or program defect, while offline stream link O computes normally and its data is not affected. At time t2, the fault in real-time stream link R1 is located and successfully fixed, and the real-time data added from then on is accurate.
Therefore, the starting time node of the data to be repaired in the first data link is determined to be t1, the ending time node is determined to be t2, the service theme can be determined according to the data report, and after the service theme is determined, the data repairing instruction is generated based on the starting time node, the ending time node and/or the service theme.
Under the condition that the repair system receives the data repair instruction, determining a data repair node according to an end time node t2 of data to be repaired and a data processing period of a batch flow link, as can be seen from fig. 3, the data processing period of the batch flow link O is respectively a first, a second, a third and an nth data processing period, and as the end time node of the data to be repaired of the first real-time flow link R1 is t2, and the running state of the batch flow link at the end time node t2 is running, determining the running end time corresponding to the running period to which the end time node t2 belongs in the batch flow link, and determining any time node greater than or equal to the running end time as the data repair node.
Since the running period to which the end time node t2 belongs in the batch flow link is Δd3, and the running end time corresponding to Δd3 is t3, any time node greater than or equal to t3 can be used as the data repair node (data recovery point), and in practical application, t3 is usually selected as the data repair node to ensure timeliness of data repair.
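The choice of t3 can be sketched as follows, assuming the batch link's data processing periods are aligned to midnight and half-open, so that an hourly period covers [22:00, 23:00). The function name `period_end` is hypothetical.

```python
from datetime import datetime, timedelta

def period_end(end_time, period=timedelta(hours=1)):
    """Return the running end time of the batch period that contains end_time.

    Assumption: periods are aligned to midnight and half-open [start, end).
    """
    midnight = end_time.replace(hour=0, minute=0, second=0, microsecond=0)
    completed = (end_time - midnight) // period  # timedelta // timedelta gives an int
    return midnight + (completed + 1) * period

t2 = datetime(2020, 8, 14, 22, 40)  # end time node of the data to be repaired
t3 = period_end(t2)                 # earliest usable data repair node
```

Any time node greater than or equal to `t3` would also qualify; the sketch returns the earliest one, matching the preference for timely repair stated above.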
Step 106, obtaining snapshot data corresponding to the data repair node in the second data link, and performing a repair operation on the data to be repaired in the first data link according to the snapshot data and the data repair strategy of the first data link.
Specifically, the snapshot mainly has the function of being capable of carrying out online data backup and recovery. When the storage device has application failure or file damage, quick data recovery can be performed, and the data is recovered to a state at a certain available time point. The snapshot has the other function of providing another data access channel for the storage user, and when the original data is subjected to online application processing, the user can access the snapshot data and can also use the snapshot to perform testing and other works.
Therefore, in the embodiment of the present application, when the first data link has data anomalies, the snapshot data of the data recovery node in the second data link may be obtained, so as to perform data recovery on the data with anomalies in the first data link based on the snapshot data.
In addition, different data links correspond to different data repair strategies. For example, the data repair strategy for the real-time stream link is to create a new real-time stream link and synchronize to it the snapshot data of the batch processing link at the data repair node together with the real-time data generated after the data repair node; the data repair strategy for the batch processing link is to synchronize the snapshot data of the real-time stream link at the data repair node to the batch processing link. Therefore, after the data repair instruction is received, the data to be repaired in the first data link needs to be repaired according to the snapshot data and the data repair strategy of the first data link.
In the implementation, when the first data link is a first real-time streaming link, the second data link is a batch streaming link, and the to-be-repaired data information includes an end time node of to-be-repaired data, according to the snapshot data and a data repair policy of the first data link, repair operation is performed on the to-be-repaired data in the first data link, which may be specifically implemented by:
creating a second real-time streaming link according to the data restoration strategy;
and synchronizing the snapshot data to the second real-time stream link, and synchronizing the data generated after the ending time node to the second real-time stream link so as to repair the data to be repaired in the first data link.
Further, creating a second real-time streaming link according to the data repair policy includes:
determining a computing component corresponding to a start time node according to the start time node of the data to be repaired contained in the data information to be repaired;
and newly adding a real-time computing consumer program in a message queue corresponding to an upstream component of the computing component so as to create the second real-time streaming link.
Specifically, because the first data link is a first real-time stream link, when the first real-time stream link has data abnormality, the first real-time stream link can be repaired by newly creating a second real-time stream link and synchronizing snapshot data of the data repairing node in the batch stream link to the second real-time stream link.
As shown in fig. 3, the newly created second real-time streaming link is a real-time streaming link R2, and the data repair node determined in the foregoing manner is t3, so that the real-time streaming link R2 is newly added at the time t3, real-time data is synchronously written into R2 from the time t3, after the snapshot data at the time t3 is generated, the generated snapshot data is written into the real-time streaming link R2, and the real-time streaming link R2 is used to replace the real-time streaming link R1, thereby realizing the repair operation of the real-time streaming link R1.
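The replacement of R1 by R2 can be sketched in memory as follows. The `RealTimeLink` class and the per-topic counters are illustrative assumptions; merging the snapshot into the counts that R2 has already accumulated is also an assumption about how the two data sources are combined.

```python
from collections import Counter

class RealTimeLink:
    """Hypothetical in-memory stand-in for a real-time stream link."""

    def __init__(self, name):
        self.name = name
        self.totals = Counter()

    def consume(self, event):
        # Real-time aggregation: accumulate each incoming event by topic.
        self.totals[event["topic"]] += 1

    def load_snapshot(self, snapshot):
        # Merge rather than overwrite, so events consumed since the
        # data repair node are preserved.
        self.totals.update(snapshot)

def repair_realtime_link(snapshot_at_node, events_after_node):
    r2 = RealTimeLink("R2")
    for event in events_after_node:     # real-time data written to R2 from t3 onward
        r2.consume(event)
    r2.load_snapshot(snapshot_at_node)  # batch snapshot generated for t3
    return r2                           # R2 then replaces R1

r2 = repair_realtime_link(
    Counter({"click": 100}),
    [{"topic": "click"}, {"topic": "click"}, {"topic": "activation"}],
)
```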
Taking as an example 10 minutes of advertisement click event data missing between the log message middleware and the attribution calculation component at 22:00 on August 14, 2020, a specific data repair process comprises the following steps:
Step 1: uploading the log files missing for hour 22 of August 14, 2020 to the corresponding date partition;
Step 2: creating a new repair task; in this example three parameters are set, start=2020081422, end=2020081423 and topic=click, where topic=click sets the event type to advertisement click (when this parameter is not specified, the repair system repairs data in the order click > reservation > activation > registration > payment, to ensure the correctness of the core attribution logic of the advertisement service);
Step 3: setting the data recovery time point to 2020081423, and starting double writing in the real-time computation;
Step 4: synchronizing the 2020081423 snapshot data of offline stream link O to real-time stream link R2, and checking the corresponding table;
Step 5: ending the double writing of the real-time computation, switching the system data source to real-time stream link R2 so that the corrected results are read and served, and completing the repair flow.
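The parameters of the repair task in step 2 can be sketched as follows. The function `make_repair_task` and the returned dictionary are hypothetical; only the parameter names, the `YYYYMMDDHH` values and the default topic order come from the example above.

```python
from datetime import datetime

# Default repair order used when no topic is given, as stated in step 2.
DEFAULT_TOPIC_ORDER = ["click", "reservation", "activation", "registration", "payment"]

def make_repair_task(start, end, topic=None):
    """Build a repair task from YYYYMMDDHH strings (hypothetical helper)."""
    fmt = "%Y%m%d%H"
    start_dt = datetime.strptime(start, fmt)
    end_dt = datetime.strptime(end, fmt)
    if end_dt <= start_dt:
        raise ValueError("end must be later than start")
    topics = [topic] if topic else list(DEFAULT_TOPIC_ORDER)
    return {"start": start_dt, "end": end_dt, "topics": topics}

task = make_repair_task("2020081422", "2020081423", topic="click")
all_topics_task = make_repair_task("2020081422", "2020081423")
```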
Aiming at the problem of data abnormality caused by network faults or program errors in the attribution process of the game advertisement monitoring platform data, the real-time stream link and the batch processing stream link are created, so that the automatic data restoration is carried out on the real-time stream link through the batch processing stream link, and the accuracy of the game advertisement monitoring platform effect data and the timeliness of abnormal data restoration are improved.
In addition, when the first data link is a batch stream link, the second data link is a real-time stream link, and the to-be-repaired data information includes an end time node of the to-be-repaired data, repairing the to-be-repaired data in the first data link according to the snapshot data and the data repair policy of the first data link means synchronizing the snapshot data to the batch stream link according to the data repair policy, so as to repair the to-be-repaired data in the batch stream link.
Specifically, because the first data link is a batch stream link, when the batch stream link has a data abnormality it can be repaired by synchronizing the snapshot data of the data repair node in the real-time stream link to the batch stream link.
Taking as an example the case where, starting at time t4, advertisement click event data flowing from the log message middleware to the attribution computing component is missing for a period of time, a specific data repair process includes the following steps:
Step 1: the real-time streaming link R1 outputs an aggregation result every 5 seconds. The offline flow link O runs on the hour, outputs the calculation result for one hour, and generates an H+1 data snapshot; the two links operate independently;
Step 2: at time t4, the data of the offline flow link O starts to be abnormal due to network errors or program defects, while the real-time flow link R2 computes normally and its data is not affected;
Step 3: the fault is located and successfully fixed at time t5, so newly added batch data is accurate, but the historical data of the batch processing flow link is abnormal for part of the time period, which makes the overall data incorrect;
Step 4: a time t6 after time t5 is selected as the fault recovery point, and the offline flow link O is overwritten with the data of the real-time flow link R2.
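Step 4 amounts to replacing the affected hourly results of link O with values derived from the real-time link. The sketch below is illustrative only. It assumes, hypothetically, that the real-time link's 5-second aggregates are available as `(timestamp, count)` pairs, that the batch results are held in a dictionary keyed by the hour, and that an hour with no real-time data counts as zero. None of these storage shapes or function names come from the application.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, Tuple


def roll_up_to_hours(aggregates: Iterable[Tuple[datetime, int]]) -> Dict[datetime, int]:
    """Sum fine-grained (e.g. 5-second) aggregates into one total per whole hour."""
    hourly: Dict[datetime, int] = defaultdict(int)
    for ts, count in aggregates:
        hourly[ts.replace(minute=0, second=0, microsecond=0)] += count
    return dict(hourly)


def overwrite_batch(batch: Dict[datetime, int],
                    realtime_hourly: Dict[datetime, int],
                    fault_start: datetime,
                    recovery_point: datetime) -> Dict[datetime, int]:
    """Return a copy of the batch results in which every hour that overlaps
    [fault_start, recovery_point) is replaced by the real-time total."""
    repaired = dict(batch)
    hour = fault_start.replace(minute=0, second=0, microsecond=0)
    while hour < recovery_point:
        repaired[hour] = realtime_hourly.get(hour, 0)  # assumption: no data means 0
        hour += timedelta(hours=1)
    return repaired


# Hypothetical timeline: fault starts at t4, recovery point t6 is chosen after the fix.
t4 = datetime(2020, 8, 14, 22, 10)
t6 = datetime(2020, 8, 15, 0, 0)
realtime = [
    (datetime(2020, 8, 14, 22, 0, 5), 3),
    (datetime(2020, 8, 14, 22, 30, 0), 4),
    (datetime(2020, 8, 14, 23, 15, 0), 5),
]
batch = {
    datetime(2020, 8, 14, 21): 10,  # before the fault, left untouched
    datetime(2020, 8, 14, 22): 1,   # abnormal
    datetime(2020, 8, 14, 23): 0,   # abnormal
}
repaired = overwrite_batch(batch, roll_up_to_hours(realtime), t4, t6)
```

Returning a repaired copy rather than mutating `batch` keeps the abnormal values available for comparison until the overwrite has been checked.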
By creating the real-time stream link and the batch stream link, the batch stream link is automatically repaired through the real-time stream link, so that the accuracy of the effect data of the game advertisement monitoring platform and the timeliness of abnormal data repair are improved.
In a specific implementation, when the data information to be repaired includes the data type of the data to be repaired, obtaining the snapshot data corresponding to the data repair node in the second data link means obtaining the snapshot data of the data repair node in the second data link that corresponds to that data type.
Alternatively, when the data information to be repaired does not include the data type of the data to be repaired, obtaining the snapshot data corresponding to the data repair node in the second data link means determining the service attribution logic of the service to which the data to be repaired belongs, determining from that logic the processing order of the service themes in the service, and obtaining, in that processing order, the snapshot data of each service theme at the data repair node in the second data link.
Specifically, if the service to which the data to be repaired belongs is a game advertisement data detection service, the attribution logic may be: an activation is matched to a click; a registration or a login is matched to an activation; a payment is matched to a login; and a reservation is matched to a click.
Attribution logic corresponding to different services can be determined according to actual requirements, and is not limited herein.
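One way to derive a processing order from attribution logic is a topological sort over the "is matched to" relation. The sketch below is an assumption about how such an order could be computed, not the method stated in the application. The dependency map is one reading of the example attribution logic above, and `processing_order` is a hypothetical helper. `graphlib.TopologicalSorter` is part of the Python standard library from version 3.9.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each topic maps to the set of topics it is matched against (its prerequisites).
# This mapping is an interpretation of the example attribution logic.
ATTRIBUTION_DEPENDS_ON = {
    "click": set(),
    "activation": {"click"},
    "reservation": {"click"},
    "registration": {"activation"},
    "login": {"activation"},
    "payment": {"login"},
}


def processing_order(depends_on):
    """Return one valid order in which every topic follows its prerequisites."""
    return list(TopologicalSorter(depends_on).static_order())


order = processing_order(ATTRIBUTION_DEPENDS_ON)
```

A topological order is generally not unique, so the result is only guaranteed to place each topic after the topic it is matched to; a fixed business order can be configured explicitly instead if a specific sequence is required.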
According to the embodiment of the application, a data repair instruction is received; a data repair node for repairing a first data link is determined according to the data information to be repaired of the first data link contained in the data repair instruction and the data processing period of a second data link that has a synchronous processing task with the first data link; snapshot data corresponding to the data repair node in the second data link is acquired; and the data to be repaired in the first data link is repaired according to the snapshot data and the data repair strategy of the first data link.
By creating the real-time stream link and the batch stream link with synchronous processing tasks, automatic data repair is performed between the batch stream link and the real-time stream link, which improves the accuracy of the effect data of the game advertisement monitoring platform and the timeliness of abnormal data repair.
Referring to fig. 4, the data repairing method is further described by taking as an example its application in the field of game advertisement monitoring, where the embodiment of the application addresses data abnormality caused by network faults or program errors in the data attribution process of the game advertisement monitoring platform.
Fig. 4 shows a process flow chart of a data repairing method applied to the field of game advertisement monitoring according to an embodiment of the present application, which specifically includes the following steps:
Step 402, a data repair instruction is received.
Specifically, the data repair instruction includes to-be-repaired data information of the first real-time streaming link; the to-be-repaired data information includes a start time node and an end time node of the to-be-repaired data and a data type of the to-be-repaired data, and the to-be-repaired data is determined from the data type to be the number of advertisement clicks.
Step 404, determining a data processing period of a batch streaming link having synchronous processing tasks with the first real-time streaming link.
Step 406, determining an operation state of the batch flow link at an end time node of the data to be repaired.
If the running state is running, step 408 is executed;
if the running state is ended, any time node greater than or equal to the end time node is determined as the data repair node.
Step 408, determining an operation ending time corresponding to an operation period to which the ending time node belongs in the batch processing flow link, and determining any time node greater than or equal to the operation ending time as a data repairing node.
Step 410, determining a data repair node for repairing the first real-time streaming link according to the data information to be repaired and the data processing period of the batch streaming link.
Step 412, determining a computing component corresponding to the start time node according to the start time node of the data to be repaired included in the data information of the data to be repaired.
Step 414, adding a real-time computing consumer program to the message queue corresponding to the upstream component of the computing component, so as to create a second real-time streaming link.
Step 416, obtaining snapshot data corresponding to the data type of the data repair node in the batch flow link.
Step 418, synchronizing the snapshot data to the second real-time stream link, and synchronizing data generated after the end time node to the second real-time stream link, so as to repair the data to be repaired in the first real-time stream link.
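The choice of data repair node in steps 406 and 408 can be sketched as follows. This is a simplified model under explicit assumptions: the batch link is taken to start a run at `run_start` and again every `period` thereafter, with each run lasting `run_duration`. These parameters and the function name `earliest_repair_node` are hypothetical. The function returns the earliest admissible node; per the steps above, any later time node is also acceptable.

```python
from datetime import datetime, timedelta


def earliest_repair_node(end_node: datetime,
                         run_start: datetime,
                         run_duration: timedelta,
                         period: timedelta) -> datetime:
    """Return the earliest admissible data repair node.

    Assumes runs start at run_start, run_start + period, ... and each run
    lasts run_duration, with run_duration <= period.
    """
    if end_node < run_start:
        return end_node  # no run has started yet, so the link is not running
    elapsed = end_node - run_start
    cycle_start = run_start + (elapsed // period) * period
    cycle_end = cycle_start + run_duration
    if end_node < cycle_end:
        return cycle_end  # link is running: wait until the current run ends
    return end_node       # link is idle: the end time node itself qualifies


# Hypothetical schedule: hourly runs starting on the hour, each taking 10 minutes.
first_run = datetime(2020, 8, 14, 0, 0)
ten_minutes = timedelta(minutes=10)
one_hour = timedelta(hours=1)

node_while_running = earliest_repair_node(
    datetime(2020, 8, 14, 23, 0), first_run, ten_minutes, one_hour)
node_while_idle = earliest_repair_node(
    datetime(2020, 8, 14, 23, 30), first_run, ten_minutes, one_hour)
```

Waiting for the current run to end means the snapshot taken at the repair node reflects a completed batch cycle rather than a partially written one.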
Aiming at the problem of data abnormality caused by network faults or program errors in the data attribution process of the game advertisement monitoring platform, the embodiment of the application automatically repairs the real-time streaming links through the batch streaming links by creating the real-time streaming links and the batch streaming links, thereby being beneficial to improving the accuracy of the effect data of the game advertisement monitoring platform and the timeliness of the abnormal data repair.
Referring to fig. 5, an application of the data repairing method provided in the embodiment of the present application in the field of game advertisement monitoring is taken as an example, and the data repairing method is further described. Fig. 5 shows a flowchart of a processing procedure of another data repair method applied to the field of game advertisement monitoring according to an embodiment of the present application, which specifically includes the following steps:
Step 502, a data repair instruction is received.
Specifically, the data repair instruction includes to-be-repaired data information of the batch processing flow link; the to-be-repaired data information includes a start time node, an end time node and a data type of the to-be-repaired data, and the to-be-repaired data is determined from the data type to be the number of advertisement clicks.
Step 504, determining a data processing period of a first real-time streaming link having synchronous processing tasks with the batch streaming link.
Step 506, determining an operation state of the first real-time streaming link at an end time node of the data to be repaired.
If the running state is running, step 508 is executed;
if the running state is ended, any time node greater than or equal to the end time node is determined as the data repair node.
Step 508, determining an operation ending time corresponding to an operation period to which the ending time node belongs in the first real-time streaming link, and determining any time node greater than or equal to the operation ending time as a data repair node.
Step 510, determining a data repairing node for repairing the batch processing flow link according to the data information to be repaired and the data processing period of the first real-time flow link.
Step 512, obtaining snapshot data corresponding to the data type of the data repair node in the first real-time streaming link.
Step 514, synchronizing the snapshot data to the batch processing flow link so as to repair the data to be repaired in the batch processing flow link.
Aiming at the problem of data abnormality caused by network faults or program errors in the attribution process of the game advertisement monitoring platform data, the embodiment of the application automatically repairs the batch processing flow link through the real-time flow link by creating the real-time flow link and the batch processing flow link, thereby being beneficial to improving the accuracy of the game advertisement monitoring platform effect data and the timeliness of abnormal data repair.
Corresponding to the method embodiment, the present application further provides an embodiment of a data repairing device, and fig. 6 shows a schematic structural diagram of the data repairing device according to one embodiment of the present application. As shown in fig. 6, the apparatus includes:
a receiving module 602 configured to receive a data repair instruction, where the data repair instruction includes data information to be repaired of a first data link;
a determining module 604 configured to determine a data repair node for repairing the first data link according to the data information to be repaired and a data processing period of a second data link having a synchronous processing task with the first data link;
a repair module 606 configured to obtain snapshot data corresponding to the data repair node in the second data link, and to perform a repair operation on the data to be repaired in the first data link according to the snapshot data and the data repair policy of the first data link.
Optionally, the determining module 604 includes:
a determining submodule configured to determine a data repair node for repairing the first data link according to the end time node of the data to be repaired contained in the data information to be repaired and the data processing period of the second data link.
Optionally, the determining submodule includes:
a first determining unit configured to determine an operation state of the second data link at the end time node;
a second determining unit configured to, if the operation state is running, determine the operation end time of the operation period to which the end time node belongs, and determine any time node greater than or equal to that operation end time as the data repair node.
Optionally, the first data link is a first real-time streaming link, the second data link is a batch streaming link, and the data information to be repaired includes an end time node of the data to be repaired;
Accordingly, the repair module 606 includes:
a creation sub-module configured to create a second real-time streaming link according to the data repair policy;
a first repair sub-module configured to synchronize the snapshot data to the second real-time streaming link, and to synchronize the data generated after the end time node to the second real-time streaming link, so as to repair the data to be repaired in the first data link.
Optionally, the creating sub-module includes:
a computing component determining unit configured to determine, according to the start time node of the data to be repaired contained in the data information to be repaired, the computing component corresponding to that start time node;
a creation unit configured to add a real-time computing consumer program to the message queue corresponding to an upstream component of the computing component, so as to create the second real-time streaming link.
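How adding a consumer to the upstream message queue yields a second real-time link can be illustrated with a toy in-memory queue. This is a sketch only: `MessageQueue`, `ClickCounter`, the `ad_id` field and the sample counts are all hypothetical, and a production system would use a real message broker rather than this class. The point shown is that a newly added consumer, seeded with the corrected snapshot, receives the same subsequent messages as the existing link.

```python
from collections import defaultdict
from typing import Callable, Dict, Optional


class MessageQueue:
    """Toy in-memory queue that delivers every message to every consumer."""

    def __init__(self) -> None:
        self._consumers: Dict[str, Callable[[dict], None]] = {}

    def add_consumer(self, name: str, handler: Callable[[dict], None]) -> None:
        self._consumers[name] = handler

    def publish(self, message: dict) -> None:
        for handler in self._consumers.values():
            handler(message)


class ClickCounter:
    """Stand-in for a real-time computing program: counts clicks per ad."""

    def __init__(self, snapshot: Optional[Dict[str, int]] = None) -> None:
        # Start from a snapshot if one is given, otherwise from empty state.
        self.counts: Dict[str, int] = defaultdict(int, snapshot or {})

    def __call__(self, message: dict) -> None:
        self.counts[message["ad_id"]] += 1


queue = MessageQueue()

# Existing link R1: its state is wrong because it missed clicks during the fault.
r1 = ClickCounter({"ad-1": 2})
queue.add_consumer("R1", r1)

# New link R2: seeded with the corrected snapshot taken from the batch link.
r2 = ClickCounter({"ad-1": 5})
queue.add_consumer("R2", r2)

# Data generated after the end time node reaches both links (double writing).
queue.publish({"ad_id": "ad-1"})
```

After the publish call, R2 holds the corrected total while R1 still carries its earlier error, which is why the data source is switched to the second link once it has caught up.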
Optionally, the first data link is a batch processing flow link, the second data link is a real-time flow link, and the data information to be repaired includes an end time node of the data to be repaired;
accordingly, the repair module 606 includes:
a second repair sub-module configured to synchronize the snapshot data to the batch processing flow link according to the data repair policy, so as to repair the data to be repaired in the batch processing flow link.
Optionally, the data information of the data to be repaired includes a data type of the data to be repaired;
accordingly, the repair module 606 includes:
a first acquisition sub-module configured to acquire the snapshot data of the data repair node in the second data link that corresponds to the data type.
Optionally, the repair module 606 includes:
a processing sequence determining sub-module configured to determine the service attribution logic of the service to which the data to be repaired belongs, and to determine, according to the service attribution logic, the processing sequence of the service themes in the service;
a second acquisition sub-module configured to acquire, in the processing sequence, the snapshot data of each service theme in the service at the data repair node in the second data link.
Optionally, the data repairing apparatus further includes:
a monitoring module configured to monitor the business data report of a target business for abnormalities;
a business topic determining module configured to, when data in the business data report is abnormal, determine a start time node and an end time node of the abnormal data and determine the business topic corresponding to the abnormal data;
an instruction generation module configured to generate the data repair instruction based on the start time node, the end time node, and/or the business topic.
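The three optional modules above can be sketched as a single function that scans an hourly report and emits a repair instruction. The application does not say how an abnormality is detected, so the comparison against a `baseline` with a relative `tolerance` is purely an assumption, as are the function names and the dictionary layout of the instruction. The exclusive `end` value follows the earlier example, in which a missing hour 22 is expressed as start=2020081422 and end=2020081423.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional

HOUR_FORMAT = "%Y%m%d%H"  # assumed key layout, e.g. "2020081422"


def next_hour(hour: str) -> str:
    """Return the hour key that follows the given one."""
    moment = datetime.strptime(hour, HOUR_FORMAT) + timedelta(hours=1)
    return moment.strftime(HOUR_FORMAT)


def build_repair_instruction(topic: str,
                             report: Dict[str, int],
                             baseline: Dict[str, int],
                             tolerance: float = 0.1) -> Optional[dict]:
    """Return a repair instruction covering the abnormal hours, or None.

    An hour is treated as abnormal when its reported value deviates from the
    baseline by more than `tolerance` (relative). This rule is hypothetical.
    """
    abnormal = [
        hour for hour in sorted(report)
        if abs(report[hour] - baseline.get(hour, 0))
        > tolerance * max(baseline.get(hour, 0), 1)
    ]
    if not abnormal:
        return None
    return {"start": abnormal[0], "end": next_hour(abnormal[-1]), "topic": topic}


# Hypothetical report in which hour 22 lost its click data.
report = {"2020081421": 100, "2020081422": 0, "2020081423": 98}
baseline = {"2020081421": 100, "2020081422": 100, "2020081423": 100}
instruction = build_repair_instruction("click", report, baseline)
```

The sketch reports one window from the first to the last abnormal hour; if abnormal hours are not contiguous, a real monitor might emit one instruction per gap instead.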
The above is a schematic scheme of a data repairing apparatus of the present embodiment. It should be noted that, the technical solution of the data repairing device and the technical solution of the data repairing method belong to the same conception, and details of the technical solution of the data repairing device which are not described in detail can be referred to the description of the technical solution of the data repairing method.
Fig. 7 illustrates a block diagram of a computing device 700 provided in accordance with one embodiment of the present application. The components of computing device 700 include, but are not limited to, memory 710 and processor 720. Processor 720 is coupled to memory 710 via bus 730, and database 750 is used to store data.
Computing device 700 also includes access device 740, which enables computing device 700 to communicate via one or more networks 760. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. The access device 740 may include one or more of any type of network interface, wired or wireless (e.g., a Network Interface Card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present application, the above-described components of computing device 700, as well as other components not shown in FIG. 7, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 7 is for exemplary purposes only and is not intended to limit the scope of the present application. Those skilled in the art may add or replace other components as desired.
Computing device 700 may be any type of stationary or mobile computing device including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 700 may also be a mobile or stationary server.
The processor 720 is configured to execute computer-executable instructions, where the processor is configured to implement the steps of the data repair method when executing the computer-executable instructions.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the data repairing method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the data repairing method.
An embodiment of the present application also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the data repair method.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the data repair method belong to the same concept, and details of the technical solution of the storage medium which are not described in detail can be referred to the description of the technical solution of the data repair method.
The foregoing describes specific embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, and so on. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer-readable medium may be added to or removed from as appropriate under the legislation and patent practice of a given jurisdiction; for example, in certain jurisdictions, under legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the embodiments are not limited by the order of actions described, as some steps may take other order or occur simultaneously in accordance with the embodiments. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the embodiments of the present application.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present application disclosed above are provided only to help explain the present application. The preferred embodiments are not exhaustive and do not limit the invention to the precise forms described. Obviously, many modifications and variations are possible in light of the teachings of the embodiments of the present application. These embodiments were chosen and described in order to best explain the principles of the embodiments and their practical application, and thereby to enable others skilled in the art to best understand and utilize the invention. This application is to be limited only by the claims and the full scope and equivalents thereof.

Claims (11)

1. A method of data repair, comprising:
receiving a data repair instruction, wherein the data repair instruction comprises to-be-repaired data information of a first data link, the first data link is the link, of a real-time stream link and a batch processing stream link, in which a data abnormality has occurred, and the to-be-repaired data information comprises an end time node of to-be-repaired data;
Determining a data repairing node for repairing the first data link according to the data information to be repaired and a data processing period of a second data link with a synchronous processing task with the first data link;
obtaining snapshot data corresponding to the data repairing node in the second data link, and repairing the data to be repaired in the first data link according to the snapshot data and a data repairing strategy of the first data link;
when the first data link is a first real-time streaming link, the repairing operation on the data to be repaired in the first data link according to the snapshot data and the data repairing policy of the first data link includes:
creating a second real-time streaming link according to the data repairing strategy;
and synchronizing the snapshot data to the second real-time stream link, and synchronizing the data generated after the ending time node to the second real-time stream link so as to repair the data to be repaired in the first data link.
2. The method for repairing data according to claim 1, wherein the determining a data repairing node for repairing the first data link according to the data information to be repaired and a data processing period of a second data link having a synchronous processing task with the first data link includes:
And determining a data repair node for repairing the first data link according to the end time node of the data to be repaired contained in the data information to be repaired and the data processing period of the second data link.
3. The method according to claim 2, wherein the determining a data repair node for repairing the first data link according to the end time node of the data to be repaired included in the data information to be repaired and the data processing period of the second data link includes:
determining an operating state of the second data link at the end time node;
if the running state is running, determining the running end time of the running period to which the end time node belongs, and determining any time node which is greater than or equal to the running end time as the data repair node.
4. The method of claim 1, wherein creating a second real-time streaming link according to the data repair policy comprises:
determining a computing component corresponding to a start time node according to the start time node of the data to be repaired contained in the data information to be repaired;
And newly adding a real-time computing consumer program in a message queue corresponding to an upstream component of the computing component so as to create the second real-time streaming link.
5. A data repair method according to any one of claims 1 to 3 wherein the first data link is a batch streaming link and the second data link is a real-time streaming link;
correspondingly, the repairing operation on the data to be repaired in the first data link according to the snapshot data and the data repairing policy of the first data link includes:
and synchronizing the snapshot data to the batch processing flow link according to the data repairing policy, so as to repair the data to be repaired in the batch processing flow link.
6. The method for repairing data according to claim 1, wherein the data information of the data to be repaired includes a data type of the data to be repaired;
correspondingly, the obtaining snapshot data corresponding to the data repair node in the second data link includes:
and obtaining snapshot data corresponding to the data type of the data repair node in the second data link.
7. The method for repairing data according to claim 1, wherein the obtaining snapshot data corresponding to the data repairing node in the second data link includes:
Determining service attribution logic of a service to which data to be repaired belongs, and determining a processing sequence corresponding to each service theme in the service according to the service attribution logic;
and obtaining snapshot data of each service theme in the service of the data repair node in the second data link according to the processing sequence.
8. The method of claim 1, further comprising, prior to receiving the data repair instruction:
performing anomaly monitoring on a business data report of a target business;
under the condition that the data in the business data report is abnormal, determining a starting time node and an ending time node of abnormal data, and determining a business theme corresponding to the abnormal data;
and generating the data repair instruction based on the starting time node, the ending time node and/or the service theme.
9. A data repair device, comprising:
a receiving module configured to receive a data repair instruction, wherein the data repair instruction comprises to-be-repaired data information of a first data link, the first data link is the link, of a real-time stream link and a batch processing stream link, in which a data abnormality has occurred, and the to-be-repaired data information comprises an end time node of to-be-repaired data;
A determining module configured to determine a data repair node for repairing the first data link according to the data information to be repaired and a data processing period of a second data link having a synchronous processing task with the first data link;
the repair module is configured to acquire snapshot data corresponding to the data repair node in the second data link, and repair the data to be repaired in the first data link according to the snapshot data and a data repair strategy of the first data link;
when the first data link is a first real-time streaming link, the repairing operation on the data to be repaired in the first data link according to the snapshot data and the data repairing policy of the first data link includes:
creating a second real-time streaming link according to the data repair strategy;
and synchronizing the snapshot data to the second real-time stream link, and synchronizing the data generated after the ending time node to the second real-time stream link so as to repair the data to be repaired in the first data link.
10. A computing device, comprising:
A memory and a processor;
the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions, wherein the processor, when executing the computer-executable instructions, performs the steps of the data repair method of any one of claims 1 to 8.
11. A computer-readable storage medium, characterized in that it stores computer instructions which, when executed by a processor, implement the steps of the data repair method of any one of claims 1 to 8.
CN202011308360.XA 2020-11-20 2020-11-20 Data restoration method and device Active CN112433997B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011308360.XA CN112433997B (en) 2020-11-20 2020-11-20 Data restoration method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011308360.XA CN112433997B (en) 2020-11-20 2020-11-20 Data restoration method and device

Publications (2)

Publication Number Publication Date
CN112433997A CN112433997A (en) 2021-03-02
CN112433997B true CN112433997B (en) 2023-07-04

Family

ID=74693009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011308360.XA Active CN112433997B (en) 2020-11-20 2020-11-20 Data restoration method and device

Country Status (1)

Country Link
CN (1) CN112433997B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7869350B1 (en) * 2003-01-15 2011-01-11 Cisco Technology, Inc. Method and apparatus for determining a data communication network repair strategy
CN101953124A (en) * 2008-02-15 2011-01-19 思科技术公司 Constructing repair paths around multiple non-available links in a data communications network
WO2015149358A1 (en) * 2014-04-04 2015-10-08 Telefonaktiebolaget L M Ericsson (Publ) Apparatus and method for establishing a repair path
CN109636388A (en) * 2018-12-07 2019-04-16 深圳市智税链科技有限公司 Data processing method, device, medium and electronic equipment in block chain network
CN110245154A (en) * 2019-05-20 2019-09-17 平安科技(深圳)有限公司 Multichannel links abnormality eliminating method and relevant device
CN110516928A (en) * 2019-08-09 2019-11-29 阿里巴巴集团控股有限公司 A kind of decision-making technique, device, equipment and the computer-readable medium of business special line

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7693043B2 (en) * 2005-07-22 2010-04-06 Cisco Technology, Inc. Method and apparatus for advertising repair capability
US7969929B2 (en) * 2007-05-15 2011-06-28 Broadway Corporation Transporting GSM packets over a discontinuous IP based network
JP2015049633A (en) * 2013-08-30 2015-03-16 富士通株式会社 Information processing apparatus, data repair program, and data repair method

Also Published As

Publication number Publication date
CN112433997A (en) 2021-03-02

Similar Documents

Publication Publication Date Title
US11368506B2 (en) Fault handling for computer nodes in stream computing system
EP2561444B1 (en) Automated recovery and escalation in complex distributed applications
CN103761309A (en) Operation data processing method and system
CN110795503A (en) Multi-cluster data synchronization method and related device of distributed storage system
CN110032444B (en) Distributed system and distributed task processing method
CN109634730A (en) Method for scheduling task, device, computer equipment and storage medium
CN112256523B (en) Service data processing method and device
CN116701043A (en) Heterogeneous computing system-oriented fault node switching method, device and equipment
CN116016531A (en) Batch shutdown processing method and device
CN113296991B (en) Abnormality detection method and device
CN112433997B (en) Data restoration method and device
CN105025179A (en) Method and system for monitoring service agents of call center
US8903774B2 (en) Techniques for leveraging replication to provide rolling point in time backup with simplified restoration through distributed transactional re-creation
CN110875832B (en) Abnormal service monitoring method, device and system and computer readable storage medium
CN116400987A (en) Continuous integration method, device, electronic equipment and storage medium
CN116414914A (en) Data synchronization method and device, processor and electronic equipment
CN114968947B (en) Fault file storage method and related device
CN115412592A (en) Service processing system and method
CN101894119B (en) Mass data storage system for monitoring
CN112905457A (en) Software testing method and device
CN115934428B (en) Main disaster recovery and backup switching method and device of MYSQL database and electronic equipment
CN109005059A (en) A kind of system and method for realizing Redis automated back-up
CN117667362B (en) Method, system, equipment and readable medium for scheduling process engine
CN117873737B (en) Numerical mode rolling operation method and device, storage medium and electronic equipment
US11656926B1 (en) Systems and methods for automatically applying configuration changes to computing clusters

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant