CN112433997A - Data restoration method and device - Google Patents

Info

Publication number
CN112433997A
Authority
CN
China
Prior art keywords
data
link
repair
node
repaired
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011308360.XA
Other languages
Chinese (zh)
Other versions
CN112433997B (en)
Inventor
王艺然
吴非
汝玉峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Bilibili Technology Co Ltd
Original Assignee
Shanghai Bilibili Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Bilibili Technology Co Ltd
Priority to CN202011308360.XA
Publication of CN112433997A
Application granted
Publication of CN112433997B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/70 Game security or game management aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiments of the application provide a data repair method and a data repair device. The data repair method comprises: receiving a data repair instruction that includes to-be-repaired data information of a first data link; determining a data repair node for repairing the first data link according to the to-be-repaired data information and the data processing cycle of a second data link that has a synchronous processing task with the first data link; acquiring snapshot data corresponding to the data repair node in the second data link; and performing a repair operation on the to-be-repaired data in the first data link according to the snapshot data and a data repair strategy of the first data link.

Description

Data restoration method and device
Technical Field
The embodiments of the application relate to the technical field of data processing, and in particular to a data repair method. One or more embodiments of the application also relate to a data repair apparatus, a computing device, and a computer-readable storage medium.
Background
With the development of internet technology, users generate large amounts of data while using the internet. So that this data can be stored and processed conveniently, and to prevent user data from being lost or damaged when a physical server fails, user data needs to be redundantly backed up when it is stored. When user data is lost or damaged, it can then be repaired from the corresponding backup data, which ensures the security of the user's data.
For data repair, a common practice in the industry is to replay data within a certain time range by rolling back the offset of a log message middleware. However, the time period that such repair can cover is limited. If abnormal data produced during real-time computation by a network fault or a system bug cannot be repaired promptly and effectively, it often has a serious impact on the service processing strategy of the target service.
Disclosure of Invention
In view of this, the embodiments of the application provide a data repair method. One or more embodiments of the application also relate to a data repair apparatus, a computing device, and a computer-readable storage medium, so as to overcome the technical defects of prior-art data repair methods, namely the limited time period that repair can cover and the low accuracy of repair results.
According to a first aspect of embodiments of the present application, there is provided a data recovery method, including:
receiving a data repair instruction, wherein the data repair instruction comprises to-be-repaired data information of a first data link;
determining a data repair node for repairing the first data link according to the data information to be repaired and the data processing cycle of a second data link having a synchronous processing task with the first data link;
and acquiring snapshot data corresponding to the data repair node in the second data link, and performing repair operation on the data to be repaired in the first data link according to the snapshot data and the data repair strategy of the first data link.
According to a second aspect of embodiments of the present application, there is provided a data repair apparatus, including:
a receiving module configured to receive a data repair instruction, wherein the data repair instruction comprises to-be-repaired data information of a first data link;
a determining module configured to determine a data repair node for repairing the first data link according to the to-be-repaired data information and the data processing cycle of a second data link having a synchronous processing task with the first data link;
and a repair module configured to acquire snapshot data corresponding to the data repair node in the second data link, and to perform a repair operation on the data to be repaired in the first data link according to the snapshot data and the data repair strategy of the first data link.
According to a third aspect of embodiments herein, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions, wherein the processor implements the steps of the data repair method when executing the computer-executable instructions.
According to a fourth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the data repair method.
An embodiment of the application provides a data repair method and device. The data repair method includes: receiving a data repair instruction; determining a data repair node for repairing a first data link according to the to-be-repaired data information contained in the data repair instruction and the data processing cycle of a second data link having a synchronous processing task with the first data link; acquiring snapshot data corresponding to the data repair node in the second data link; and performing a repair operation on the to-be-repaired data in the first data link according to the snapshot data and a data repair strategy of the first data link.
A real-time streaming link and a batch processing link that have synchronous processing tasks are created, so that data can be repaired automatically through the batch processing link and the real-time streaming link, which improves the accuracy of the effect data of the game advertisement monitoring platform and the timeliness of abnormal data repair.
Drawings
FIG. 1 is a flow chart of a data repair method provided by an embodiment of the present application;
FIG. 2 is a schematic view of a repair system provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a data repair process provided by an embodiment of the present application;
FIG. 4 is a flowchart of a processing procedure of applying the data recovery method to the field of game advertisement monitoring according to an embodiment of the present application;
FIG. 5 is a flow chart of another processing procedure of the data recovery method applied to the field of game advertisement monitoring according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a data recovery apparatus according to an embodiment of the present application;
FIG. 7 is a block diagram of a computing device according to an embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the application can be implemented in many ways other than those described herein, and those skilled in the art can make similar extensions without departing from the spirit of the application; the application is therefore not limited to the specific implementations disclosed below.
The terminology used in the one or more embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the present application. As used in one or more embodiments of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present application refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used in one or more embodiments of the present application to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first aspect may be termed a second aspect, and, similarly, a second aspect may be termed a first aspect, without departing from the scope of one or more embodiments of the present application. The word "if" as used herein may be interpreted as "at the time of ...", "when ...", or "in response to a determination", depending on the context.
First, the terms used in one or more embodiments of the present application are explained.
Advertisement monitoring platform: a data monitoring platform that monitors and measures the effect of advertisement placement, provides data support for marketers, and helps optimize placement.
Data attribution: a set of rules or algorithms for determining which channel a conversion is credited to when a user arrives through multiple channels.
Real-time computation: processing of data in motion, in which data is computed directly as it is generated or received.
Offline computation: computation in which all input data is known before the computation starts and does not change during it.
Lambda architecture: a big data computing system architecture that combines batch processing and stream processing.
In the present application, a data repair method is provided. One or more embodiments of the present application are also directed to a data recovery apparatus, a computing device, and a computer-readable storage medium, which are described in detail in the following embodiments one by one.
The data restoration method provided by the embodiment of the application can be applied to any field needing data restoration, such as restoration of game data in a game field, restoration of advertisement click volume in an advertisement field, restoration of voice conversation data in a communication field, restoration of voice messages in a self-media field, and the like; for convenience of understanding, the embodiment of the present application describes in detail an example in which the data restoring method is applied to restoring advertisement data in a game field, but is not limited to this.
Then, in the case that the data repair method is applied to repair of game advertisement data in the game field, for example, the data repair instruction received in the data repair method may be understood as a game advertisement data repair instruction.
Referring to fig. 1, fig. 1 shows a flowchart of a data repair method provided according to an embodiment of the present application, including the following steps:
Step 102, receiving a data repair instruction, where the data repair instruction includes to-be-repaired data information of a first data link.
The data repair method is applied to a data repair system, a schematic diagram of which is shown in FIG. 2. The repair system is composed of a data warehouse, a log message middleware, an attribution calculation component, a data analysis system, and an external interface of the repair system (the repair interface). The data warehouse stores backup data, and the repair interface is the entrance and trigger point of the whole repair system. The log message middleware, the attribution calculation component, and the data analysis system are components of the data monitoring platform and, as the final product form, provide data report services externally.
In addition, the repair system is implemented with a big data processing model based on the Lambda architecture. Through the layered design of a Batch Layer and a Speed Layer, the Lambda architecture supports real-time services and batch processing services in one system, and a Serving Layer logically unifies the interfaces of the two data sources, so that applications can be developed and deployed against a unified data view, realizing the fusion of data and application.
Therefore, in the actual data monitoring process, two links, namely a real-time streaming link and a batch processing link, can be created at the same time. The two links process the same data but have different processing periods. The real-time streaming link may use the data arriving within 5 seconds as a data window, aggregate the real-time data, and persist the attribution results. Aggregation means that the program performs calculations such as accumulation and averaging in memory on each arriving record to generate advertisement data reports, such as click rate and conversion rate, at day/hour granularity in the time dimension and channel/sub-channel granularity in the business dimension. Persistence means that the aggregation results are written from memory into the database. The batch processing link may take the form of a snapshot plus increments and run periodically by the hour: data snapshot s2 of the batch processing link is generated from snapshot s1 plus the incremental data of the period.
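The two processing styles described above can be illustrated with a minimal sketch. It assumes click events arrive as (timestamp in seconds, channel) pairs and that reports are simple per-channel counts; the function names and the sample data are hypothetical and are not taken from the application.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 5  # data window of the real-time streaming link


def aggregate_windows(events):
    """Group (timestamp_seconds, channel) events into 5-second windows.

    Returns {window_start: Counter({channel: clicks})}.
    """
    windows = defaultdict(Counter)
    for ts, channel in events:
        window_start = ts - ts % WINDOW_SECONDS
        windows[window_start][channel] += 1
    return dict(windows)


def next_snapshot(previous_snapshot, period_delta):
    """Batch link: snapshot s2 = snapshot s1 + incremental data of the period."""
    merged = Counter(previous_snapshot)
    merged.update(period_delta)  # Counter.update adds counts
    return dict(merged)


# Hypothetical sample data
events = [(0, "channel_a"), (3, "channel_b"), (4, "channel_a"), (7, "channel_a")]
windows = aggregate_windows(events)

s1 = {"channel_a": 100, "channel_b": 40}
delta = {"channel_a": 3, "channel_b": 1}
s2 = next_snapshot(s1, delta)
```

In this sketch both links see the same events; only the granularity at which results become available differs, which is what allows either link to serve as a backup for the other.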
In addition, the data generated by the real-time streaming link and the batch processing link are mutually backed up, and when any one link is abnormal, the other link can be used as a backup link to recover the data of the abnormal link.
Specifically, the first data link is the link in which a data abnormality occurs, so the first data link may be either the real-time streaming link or the batch processing link. In practical application, the real-time streaming link is a link that processes data in real time, and the batch processing link is a link that processes data in batches.
In the embodiment of the application, the data monitoring platform monitors data abnormality, and when the data abnormality occurs, the link with the abnormality is used as the first data link, and a data repair instruction of the first data link is generated, which can be specifically realized in the following manner:
monitoring the business data report of a target service for abnormalities;
when a data abnormality exists in the business data report, determining a start time node and an end time node of the abnormal data, and determining the business topic corresponding to the abnormal data;
generating the data repair instruction based on the start time node, the end time node, and/or the business topic.
Specifically, a business data report contains business topics. Taking the data report of the game advertisement monitoring platform as an example, the business topics include but are not limited to click, reservation, activation, registration, login, and payment. By monitoring the business data report of the target service for abnormalities, it can be determined whether a data link is abnormal.
When the data report shows that a data abnormality exists in a data link, the start time node and the end time node of the abnormal data can be determined, along with the business topic corresponding to the abnormal data. A data repair instruction is then generated based on the start time node, the end time node, and/or the business topic, and is sent to the external interface of the repair system.
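As an illustration of what such an instruction might carry, the following sketch builds one from a set of timestamps that a monitor has already flagged as abnormal. How abnormalities are detected is not specified in the text, so the flagged timestamps are taken as given; `RepairInstruction` and `build_repair_instruction` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class RepairInstruction:
    start: datetime              # start time node of the abnormal data
    end: datetime                # end time node of the abnormal data
    topic: Optional[str] = None  # business topic, e.g. "click"; may be omitted


def build_repair_instruction(abnormal_points, topic=None):
    """Derive the instruction from timestamps flagged as abnormal (assumed input)."""
    if not abnormal_points:
        raise ValueError("no abnormal data detected")
    return RepairInstruction(min(abnormal_points), max(abnormal_points), topic)


# Hypothetical example: click data flagged between 22:00 and 22:10
instruction = build_repair_instruction(
    [datetime(2020, 8, 14, 22, 10), datetime(2020, 8, 14, 22, 0)],
    topic="click",
)
```

The topic field is optional here because the text allows the instruction to be generated from the time nodes and/or the business topic.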
Step 104, determining a data repair node for repairing the first data link according to the to-be-repaired data information and the data processing cycle of the second data link having a synchronous processing task with the first data link.
Specifically, as described above, in the actual data monitoring process, two links, namely the real-time streaming link and the batch processing link, may be created at the same time. The two links process the same data but have different processing periods, so the real-time streaming link and the batch processing link are said to have a synchronous processing task; that is, the first data link and the second data link have a synchronous processing task.
Because the data processing cycle of the first data link is different from the data processing cycle of the second data link, when the first data link is abnormal, if the data in the first data link is to be repaired by using the data in the second data link as backup data, the data repair node for repairing the first data link needs to be determined according to the information of the data to be repaired contained in the data repair instruction and the data processing cycle of the second data link.
In specific implementation, the data repairing node for repairing the first data link is determined according to the data information to be repaired and the data processing cycle of the second data link having a synchronous processing task with the first data link, that is, the data repairing node for repairing the first data link is determined according to the end time node of the data to be repaired contained in the data information to be repaired and the data processing cycle of the second data link.
Further, the determining a data repair node for repairing the first data link according to the end time node of the data to be repaired contained in the data information to be repaired and the data processing cycle of the second data link may specifically be implemented in the following manner:
determining an operational status of the second data link at the end time node;
and if the operating state is running, determining the running end time of the operation cycle to which the end time node belongs, and determining any time node greater than or equal to the running end time as the data repair node.
Specifically, if the operation cycle of the second data link to which the end time node belongs is in the running state (that is, the second data link is running at the end time node), the current operation cycle has not yet ended. Therefore, the running end time of the operation cycle to which the end time node belongs can be determined, and any time node greater than or equal to that running end time can be determined as the data repair node.
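A minimal sketch of this selection rule follows. It assumes the second data link runs in fixed-length cycles aligned to a known origin, and it treats an end time that falls exactly on a cycle boundary as already complete; both are assumptions made for illustration, and `earliest_repair_node` is a hypothetical name.

```python
from datetime import datetime, timedelta


def earliest_repair_node(end_time, cycle=timedelta(hours=1),
                         origin=datetime(2020, 1, 1)):
    """Return the end of the cycle that contains end_time.

    Any time node greater than or equal to the returned value is a valid
    data repair node; the returned value is the earliest one.
    """
    elapsed = end_time - origin
    # Ceiling division on timedeltas: number of whole cycles needed to cover elapsed.
    cycles = -((-elapsed) // cycle)
    return origin + cycles * cycle


node = earliest_repair_node(datetime(2020, 8, 14, 22, 10))
```

Choosing the earliest valid node keeps the repair as timely as possible, while any later cycle boundary would also satisfy the rule.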
In practical application, the data monitoring platform generates a data repair instruction, and after the repair system receives the data repair instruction, the data repair node for repairing the first data link can be determined according to the information of the data to be repaired contained in the data repair instruction and the data processing cycle of the second data link, specifically, the data repair node for repairing the first data link is determined according to the end time node of the data to be repaired in the data repair instruction and the data processing cycle of the second data link.
FIG. 3 shows a schematic diagram of a data repair process provided in an embodiment of the present application. The first real-time streaming link created in the data monitoring process is real-time streaming link R1, and the batch processing streaming link created in the data monitoring process is batch processing streaming link O (also called offline streaming link O). Take as an example the case in which the data in real-time streaming link R1 is abnormal and the data of batch processing streaming link O is used as backup data to repair it. Real-time streaming link R1 outputs an aggregation result every 5 seconds (the data window is 5 s). Offline streaming link O runs on the hour, outputs the calculation result of the previous hour (the data processing cycle is 1 hour), and generates an H+1 data snapshot. Under normal conditions the two links run independently. At time t1, because of a network error or a program defect, the data of real-time streaming link R1 starts to be abnormal, while offline streaming link O computes normally and its data is not affected. At time t2, the fault in real-time streaming link R1 is located and successfully fixed, and real-time data added from then on is accurate.
Therefore, the starting time node and the ending time node of the data to be repaired in the first data link are determined as t1 and t2 respectively, the business theme can be determined according to the data report, and after the business theme is determined, the data repairing instruction is generated based on the starting time node, the ending time node and/or the business theme.
When the repair system receives the data repair instruction, it determines the data repair node according to the end time node t2 of the data to be repaired and the data processing cycle of the batch processing streaming link. As shown in FIG. 3, the data processing cycle of batch processing streaming link O is Δd, and Δd1, Δd2, Δd3, and Δdn are the first, second, third, and nth data processing cycles, respectively. The end time node of the data to be repaired in the first real-time streaming link R1 is t2, and the batch processing streaming link is running at end time node t2. It is therefore necessary to determine the running end time of the operation cycle to which end time node t2 belongs in the batch processing streaming link, and to determine any time node greater than or equal to that running end time as the data repair node.
Since the operation cycle to which end time node t2 belongs in the batch processing streaming link is Δd3, and the running end time of Δd3 is t3, any time node greater than or equal to t3 can be used as the data repair node (data recovery point). In practical applications, t3 is usually selected as the data repair node to ensure the timeliness of data repair.
Step 106, obtaining snapshot data corresponding to the data repair node in the second data link, and performing repair operation on the data to be repaired in the first data link according to the snapshot data and the data repair policy of the first data link.
Specifically, the snapshot mainly functions to enable online data backup and recovery. When the storage device has application failure or file damage, the data can be quickly recovered, and the data can be recovered to the state of an available time point. The snapshot has another function of providing another data access channel for the storage user, so that when the original data is subjected to online application processing, the user can access the snapshot data and can also utilize the snapshot to perform work such as testing.
Therefore, in the embodiment of the present application, when there is a data abnormality in the first data link, the snapshot data of the data recovery node in the second data link may be acquired, so as to perform data recovery on the data having the abnormality in the first data link based on the snapshot data.
In addition, different data links correspond to different data repair strategies, for example, the data repair strategy corresponding to the real-time streaming link is to create a new real-time streaming link again, and synchronize snapshot data of a data recovery node of the batch processing link and real-time data generated after the data recovery node to the new real-time streaming link; and the data recovery strategy of the batch processing link is to synchronize the snapshot data of the data recovery node of the real-time streaming link to the batch processing link, so after receiving the data recovery instruction, the data to be recovered in the first data link needs to be recovered according to the snapshot data and the data recovery strategy of the first data link.
In specific implementation, when the first data link is a first real-time streaming link, the second data link is a batch streaming link, and the information of the data to be repaired includes an end time node of the data to be repaired, the data to be repaired in the first data link is repaired according to the snapshot data and the data repair policy of the first data link, which may be specifically implemented in the following manner:
creating a second real-time streaming link according to the data repair strategy;
and synchronizing the snapshot data to the second real-time streaming link, and synchronizing the data generated after the end time node to the second real-time streaming link, so as to perform a repair operation on the data to be repaired in the first data link.
Further, creating a second real-time streaming link according to the data repair policy includes:
determining, according to the start time node of the data to be repaired contained in the to-be-repaired data information, the computing component corresponding to the start time node;
and adding a real-time computation consumer program to the message queue corresponding to the upstream component of that computing component, so as to create the second real-time streaming link.
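The idea of attaching an additional consumer to an upstream message queue can be sketched with an in-memory stand-in. It assumes a log-style queue in which every consumer keeps its own read offset, so that a new consumer does not disturb the existing one; the `MessageQueue` class and the consumer names are hypothetical and do not represent any particular middleware API.

```python
class MessageQueue:
    """Append-only log with one read offset per consumer (in-memory stand-in)."""

    def __init__(self):
        self.log = []
        self.offsets = {}

    def publish(self, message):
        self.log.append(message)

    def add_consumer(self, name, start_offset=None):
        # A new consumer starts at the end of the log unless an offset is given.
        self.offsets[name] = len(self.log) if start_offset is None else start_offset

    def poll(self, name):
        """Return all messages this consumer has not read yet."""
        start = self.offsets[name]
        self.offsets[name] = len(self.log)
        return self.log[start:]


queue = MessageQueue()
queue.add_consumer("R1")          # existing real-time consumer
queue.publish("click-1")
queue.publish("click-2")
first_batch_r1 = queue.poll("R1")

queue.add_consumer("R2")          # new consumer added for the repair
queue.publish("click-3")
batch_r2 = queue.poll("R2")
second_batch_r1 = queue.poll("R1")
```

Because the two consumers read independently, the original link keeps running while the new link is being prepared.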
Specifically, since the first data link is the first real-time streaming link, when the first real-time streaming link has data abnormality, the first real-time streaming link may be repaired by creating the second real-time streaming link and synchronizing snapshot data of the data repair node in the batch streaming link to the second real-time streaming link.
As shown in fig. 3, the newly created second real-time streaming link is a real-time streaming link R2, and the data repair node determined in the foregoing manner is t3, so that the real-time streaming link R2 is newly added at time t3, the real-time data is synchronously written into R2 from time t3, after the snapshot data at time t3 is generated, the generated snapshot data is written into the real-time streaming link R2, and the real-time streaming link R2 is used to replace the real-time streaming link R1, so that the repair operation on the real-time streaming link R1 can be realized.
Taking as an example the loss of 10 minutes of advertisement click event data between the log message middleware and the attribution calculation component at 22:00 on August 14, 2020, a specific data repair process comprises the following steps:
Step 1: uploading the missing log files for 22:00 on August 14, 2020 to the corresponding date partition;
Step 2: creating a new repair task and setting three parameters: the start time, the end time, and the event type. In this example, start is 2020081422, end is 2020081423, and topic is click, that is, the event type is set to advertisement click (when this parameter is not specified, the repair system repairs data in the order click > reservation > activation > registration > login > payment, so as to ensure the correctness of the core attribution logic of the advertisement service);
Step 3: setting the data recovery time point to 2020081423, at which point real-time computation starts double writing;
Step 4: synchronizing the 2020081423 snapshot data of offline streaming link O to real-time streaming link R2, and checking the corresponding table;
Step 5: ending the double writing in real-time computation, switching the system data source to real-time streaming link R2 so that the corrected results are read to provide service, and completing the repair process.
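The double-write and snapshot-synchronization steps above can be sketched as follows. The sketch assumes that each link holds cumulative per-topic counts and that the batch snapshot at the repair node is the trusted baseline; the `Link` class, the function name, and the numbers are hypothetical.

```python
from collections import Counter


class Link:
    """Hypothetical real-time link holding cumulative counts per topic."""

    def __init__(self, name, totals=None):
        self.name = name
        self.totals = Counter(totals or {})

    def consume(self, topic):
        self.totals[topic] += 1


def repair_realtime_link(r1, batch_snapshot, live_events):
    """Build replacement link R2 from the batch snapshot plus newer events."""
    r2 = Link("R2")
    # Double writing: events generated after the repair node go to both links.
    for topic in live_events:
        r1.consume(topic)
        r2.consume(topic)
    # Synchronize the snapshot taken at the repair node into R2.
    r2.totals.update(batch_snapshot)
    return r2  # the caller switches the data source from R1 to R2


r1 = Link("R1", {"click": 90, "login": 20})   # 10 clicks were lost
snapshot_at_node = {"click": 100, "login": 20}
r2 = repair_realtime_link(r1, snapshot_at_node, ["click", "click", "login"])
```

After the switch, readers see the corrected totals from R2, while R1 still carries the gap that originally triggered the repair.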
Aiming at the problem of data abnormity caused by network faults or program errors in the process of attributing the game advertisement monitoring platform data, the real-time streaming link and the batch processing streaming link are established, so that the automatic data restoration is performed on the real-time streaming link through the batch processing streaming link, and the accuracy of the game advertisement monitoring platform effect data and the timeliness of abnormal data restoration are improved.
In addition, when the first data link is a batch processing streaming link, the second data link is a real-time streaming link, and the to-be-repaired data information includes the end time node of the data to be repaired, performing the repair operation on the data to be repaired in the first data link according to the snapshot data and the data repair strategy of the first data link means synchronizing the snapshot data to the batch processing streaming link according to the data repair strategy, so as to repair the data to be repaired in the batch processing streaming link.
Specifically, since the first data link is the batch processing streaming link, when the batch processing streaming link has a data abnormality, it can be repaired by synchronizing the snapshot data of the data repair node in the real-time streaming link to the batch processing streaming link.
Taking as an example the case where, at time t4, a period of advertisement click event data is lost between the log message middleware and the computing component, the specific data repair process includes the following steps:
Step 1: the real-time stream link R1 outputs the aggregated result every 5 seconds. Each time the offline stream link O runs at its scheduled point, it outputs the calculation result of the previous hour and generates an H+1 data snapshot. The two links run independently;
Step 2: at time t4, due to a network error or a program defect, the data of the offline stream link O becomes abnormal; at this time, the real-time stream link R2 computes normally and its data is not affected;
Step 3: the fault is located and successfully repaired at time t5. Batch data added from then on is accurate, but the historical data of the batch processing stream link still makes the overall data wrong because of the data abnormality during part of the time period;
Step 4: a time t6 after time t5 is selected as the fault recovery point, and the offline stream link O is covered with the data of the real-time stream link R2.
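The covering operation in step 4 can be sketched as an overwrite of the affected time buckets. This is a minimal sketch under stated assumptions: both links are modeled as dictionaries keyed by hour (YYYYMMDDHH), the function name `cover_batch_with_snapshot` is hypothetical, and the counts are invented for illustration; the application does not specify the storage layout.

```python
from typing import Dict, Iterable


def cover_batch_with_snapshot(batch_table: Dict[str, int],
                              realtime_snapshot: Dict[str, int],
                              affected_hours: Iterable[str]) -> Dict[str, int]:
    """Return a copy of the batch table in which every affected hour is
    overwritten with the value from the real-time snapshot."""
    repaired = dict(batch_table)  # leave the original table untouched
    for hour in affected_hours:
        # A missing hour raises KeyError rather than repairing silently
        # from incomplete snapshot data.
        repaired[hour] = realtime_snapshot[hour]
    return repaired


# Illustrative values only: hour 2020081422 lost data in the batch link.
batch = {"2020081421": 100, "2020081422": 40, "2020081423": 120}
snapshot = {"2020081421": 100, "2020081422": 95, "2020081423": 120}
repaired = cover_batch_with_snapshot(batch, snapshot, ["2020081422"])
```

Only the hours named in `affected_hours` are replaced, so batch results outside the abnormal period are preserved.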
The real-time streaming link and the batch processing streaming link are created, so that the batch processing streaming link is subjected to automatic data restoration through the real-time streaming link, and the accuracy of the effect data of the game advertisement monitoring platform and the timeliness of abnormal data restoration are improved.
In specific implementation, when the data information to be repaired includes the data type of the data to be repaired, obtaining the snapshot data corresponding to the data repair node in the second data link means obtaining the snapshot data of the data repair node in the second data link that corresponds to the data type.
Alternatively, when the data information to be repaired does not include the data type of the data to be repaired, obtaining the snapshot data corresponding to the data repair node in the second data link means determining the service attribution logic of the service to which the data to be repaired belongs, determining the processing sequence corresponding to each service topic in the service according to the service attribution logic, and obtaining, according to the processing sequence, the snapshot data of each service topic in the service at the data repair node in the second data link.
Specifically, if the service to which the data to be repaired belongs is a game advertisement data monitoring service, the attribution logic may be: activation matches click; registration and login match activation; payment matches login; and reservation matches click.
The attribution logic corresponding to different services can be determined according to actual requirements, and is not limited herein.
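One way to derive a processing sequence from such attribution logic is to treat each "X matches Y" rule as a dependency of X on Y and order the topics so that every topic follows the topic it matches against. The sketch below is an assumption about how this could be done, not the method of the application; `ATTRIBUTION_DEPENDENCIES` and `processing_order` are hypothetical names, and ties are broken by declaration order.

```python
from typing import Dict, List

# Each topic lists the topics it is matched against, per the attribution
# logic above (activation matches click, payment matches login, and so on).
ATTRIBUTION_DEPENDENCIES: Dict[str, List[str]] = {
    "click": [],
    "reservation": ["click"],
    "activation": ["click"],
    "registration": ["activation"],
    "login": ["activation"],
    "payment": ["login"],
}


def processing_order(deps: Dict[str, List[str]]) -> List[str]:
    """Order topics so that each one comes after everything it depends on.
    Among topics that are ready, the one declared first is taken first."""
    order: List[str] = []
    done = set()
    remaining = list(deps)
    while remaining:
        ready = [t for t in remaining if all(d in done for d in deps[t])]
        if not ready:
            raise ValueError("cyclic or unknown dependency in attribution logic")
        topic = ready[0]
        order.append(topic)
        done.add(topic)
        remaining.remove(topic)
    return order


order = processing_order(ATTRIBUTION_DEPENDENCIES)
```

With the dependencies declared as above, the resulting sequence is click, reservation, activation, registration, login, payment, which agrees with the default repair order given earlier in the description.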
In the embodiment of the application, a data repair instruction containing the to-be-repaired data information of a first data link is received; a data repair node for repairing the first data link is determined according to the to-be-repaired data information and the data processing period of a second data link that has a synchronous processing task with the first data link; snapshot data corresponding to the data repair node in the second data link is obtained; and a repair operation is performed on the to-be-repaired data in the first data link according to the snapshot data and the data repair strategy of the first data link.
Aiming at the problem of data abnormality caused by network failure or program error in the attribution process of the game advertisement monitoring platform data, the embodiment of the application creates a real-time streaming link and a batch streaming link with synchronous processing tasks, so that automatic data repair is performed between the batch streaming link and the real-time streaming link, and the accuracy of the effect data of the game advertisement monitoring platform and the timeliness of abnormal data repair are improved.
Referring to fig. 4, the data recovery method provided in the embodiment of the present application is further described by taking an application of the data recovery method in the field of game advertisement monitoring as an example. Fig. 4 shows a flow chart of a processing procedure of applying the data recovery method provided by an embodiment of the present application to the field of game advertisement monitoring, which specifically includes the following steps:
Step 402, receiving a data repair instruction.
Specifically, the data repair instruction includes information of data to be repaired of the first real-time streaming link, the information of the data to be repaired includes a start time node and an end time node of the data to be repaired and a data type of the data to be repaired, and the data to be repaired is determined to be the number of advertisement clicks according to the data type.
Step 404, determining a data processing cycle of the batch streaming link having a synchronous processing task with the first real-time streaming link.
Step 406, determining the running state of the batch processing stream link at the end time node of the data to be repaired.
If the running state is running, step 408 is executed; if the running state is the run-ended state, any time node greater than or equal to the run end time node is determined as the data repair node.
Step 408, determining the operation ending time corresponding to the operation period to which the ending time node belongs in the batch processing stream link, and determining any time node greater than or equal to the operation ending time as the data repair node.
Step 410, determining a data repair node for repairing the first real-time streaming link according to the data information to be repaired and the data processing cycle of the batch processing streaming link.
Step 412, determining a computing component corresponding to the start time node according to the start time node of the data to be repaired contained in the data information to be repaired.
Step 414, adding a new real-time computation consumer program to the message queue corresponding to the upstream component of the computing component, so as to create a second real-time streaming link.
Step 416, obtaining snapshot data of the data repair node in the batch processing stream link and corresponding to the data type.
Step 418, synchronizing snapshot data to the second real-time streaming link, and synchronizing data generated after the end time node to the second real-time streaming link, so as to perform a repair operation on the data to be repaired in the first real-time streaming link.
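Steps 416 and 418 can be sketched as seeding the state of the second real-time streaming link with the batch snapshot and then applying only the events generated after the end time node. This is a simplified sketch under assumptions: state is modeled as a per-topic counter, events as `(timestamp, topic)` pairs, and an event stamped exactly at the end time node is treated as already covered by the snapshot. The function name and the sample values are hypothetical.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple

Event = Tuple[datetime, str]  # (event time, topic), a hypothetical shape


def seed_second_link(snapshot: Dict[str, int],
                     events: Iterable[Event],
                     end_node: datetime) -> Dict[str, int]:
    """Start from the snapshot counts, then count only events that were
    generated strictly after the end time node of the data to be repaired."""
    state = dict(snapshot)
    for event_time, topic in events:
        if event_time > end_node:
            state[topic] = state.get(topic, 0) + 1
    return state


end_node = datetime(2020, 8, 14, 23)
events = [
    (datetime(2020, 8, 14, 22, 30), "click"),    # covered by the snapshot
    (datetime(2020, 8, 14, 23, 0, 5), "click"),  # after the end time node
    (datetime(2020, 8, 14, 23, 1), "click"),     # after the end time node
]
state = seed_second_link({"click": 95}, events, end_node)
```

Because the snapshot replaces the abnormal history and only later events are replayed, the second link does not need to reprocess the period in which data was lost.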
According to the embodiment of the application, aiming at the problem of data abnormity caused by network faults or program errors in the process of attributing the game advertisement monitoring platform data, the real-time streaming link and the batch processing streaming link are created, so that the automatic data restoration is performed on the real-time streaming link through the batch processing streaming link, and the accuracy of the game advertisement monitoring platform effect data and the timeliness of abnormal data restoration are improved.
Referring to fig. 5, the data recovery method provided in the embodiment of the present application is further described by taking an application of the data recovery method in the field of game advertisement monitoring as an example. Fig. 5 is a flowchart illustrating a processing procedure of another data recovery method applied to the field of game advertisement monitoring according to an embodiment of the present application, and specifically includes the following steps:
Step 502, receiving a data repair instruction.
Specifically, the data repair instruction includes data information to be repaired of the batch processing streaming link, the data information to be repaired includes a start time node and an end time node of the data to be repaired and a data type of the data to be repaired, and the data to be repaired is determined to be the number of advertisement clicks according to the data type.
Step 504, determining a data processing cycle of a first real-time streaming link having a synchronous processing task with the batch processing streaming link.
Step 506, determining the running state of the first real-time streaming link at the end time node of the data to be repaired.
If the running state is running, step 508 is executed; if the running state is the run-ended state, any time node greater than or equal to the run end time node is determined as the data repair node.
Step 508, determining the operation ending time corresponding to the operation period to which the ending time node belongs in the first real-time streaming link, and determining any time node greater than or equal to the operation ending time as the data repair node.
Step 510, determining a data repair node for repairing the batch processing stream link according to the data information to be repaired and the data processing cycle of the first real-time stream link.
Step 512, obtaining snapshot data of the data repair node in the first real-time streaming link, which corresponds to the data type.
Step 514, synchronizing the snapshot data to the batch processing stream link, so as to perform a repair operation on the data to be repaired in the batch processing stream link.
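The determination of the data repair node in steps 504 to 510 can be sketched with simple cycle arithmetic. The sketch assumes that the second data link produces output at fixed intervals counted from a known anchor time, that an end time node falling inside a cycle corresponds to the "running" state, and that an end time node on a cycle boundary corresponds to the "run ended" state. These modeling choices and the function name are assumptions; the application only requires that the chosen node be greater than or equal to the relevant run end time.

```python
from datetime import datetime, timedelta


def earliest_repair_node(end_node: datetime,
                         cycle_anchor: datetime,
                         cycle: timedelta) -> datetime:
    """Return the earliest time node usable as the data repair node.

    If end_node falls inside a cycle (the link is still running), the
    result is the end of that cycle; if it falls exactly on a cycle
    boundary (the run has ended), end_node itself qualifies."""
    completed_cycles, remainder = divmod(end_node - cycle_anchor, cycle)
    if remainder == timedelta(0):
        return end_node
    return cycle_anchor + (completed_cycles + 1) * cycle


anchor = datetime(2020, 8, 14, 0, 0, 0)
# Real-time link with a 5-second output cycle, end time node mid-cycle.
node = earliest_repair_node(datetime(2020, 8, 14, 22, 59, 58), anchor,
                            timedelta(seconds=5))
```

Any later time node is equally valid as a repair node; the function returns only the earliest candidate. The same arithmetic applies to an hourly batch cycle by passing `timedelta(hours=1)`.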
According to the embodiment of the application, aiming at the problem of data abnormity caused by network faults or program errors in the process of attributing the game advertisement monitoring platform data, the real-time streaming link and the batch processing streaming link are created, so that the batch processing streaming link is subjected to automatic data restoration through the real-time streaming link, and the accuracy of the game advertisement monitoring platform effect data and the timeliness of abnormal data restoration are improved.
Corresponding to the above method embodiment, the present application further provides an embodiment of a data recovery device, and fig. 6 shows a schematic structural diagram of a data recovery device provided in an embodiment of the present application. As shown in fig. 6, the apparatus includes:
a receiving module 602, configured to receive a data repair instruction, where the data repair instruction includes information of data to be repaired of a first data link;
a determining module 604, configured to determine a data repair node for repairing the first data link according to the information of the data to be repaired and a data processing cycle of a second data link having a synchronous processing task with the first data link;
a repair module 606 configured to obtain snapshot data corresponding to the data repair node in the second data link, and perform a repair operation on data to be repaired in the first data link according to the snapshot data and the data repair policy of the first data link.
Optionally, the determining module 604 includes:
a determining submodule configured to determine a data repair node for repairing the first data link according to an end time node of the data to be repaired contained in the data information to be repaired and the data processing period of the second data link.
Optionally, the determining sub-module includes:
a first determining unit configured to determine an operation status of the second data link at the end time node;
a second determining unit configured to, if the operation status is running, determine an operation end time corresponding to the operation cycle to which the end time node belongs, and determine any time node greater than or equal to the operation end time as the data repair node.
Optionally, the first data link is a first real-time streaming link, the second data link is a batch streaming link, and the to-be-repaired data information includes an end time node of the to-be-repaired data;
accordingly, the repair module 606 includes:
a creation sub-module configured to create a second real-time streaming link according to the data repair policy;
a first repair submodule configured to synchronize the snapshot data to the second real-time streaming link and synchronize data generated after the end time node to the second real-time streaming link, so as to perform a repair operation on data to be repaired in the first data link.
Optionally, the creating sub-module includes:
a computing component determining unit configured to determine a computing component corresponding to an initial time node according to the initial time node of the data to be repaired contained in the data information to be repaired;
a creating unit configured to add a new real-time computation consumer program in a message queue corresponding to an upstream component of the computation components to create the second real-time streaming link.
Optionally, the first data link is a batch streaming link, the second data link is a real-time streaming link, and the information of the data to be repaired includes an end time node of the data to be repaired;
accordingly, the repair module 606 includes:
a second repair submodule configured to synchronize the snapshot data to the batch processing stream link according to the data repair policy, so as to perform a repair operation on the data to be repaired in the batch processing stream link.
Optionally, the information of the data to be repaired includes a data type of the data to be repaired;
accordingly, the repair module 606 includes:
a first obtaining sub-module configured to obtain snapshot data of the data repair node in the second data link, where the snapshot data corresponds to the data type.
Optionally, the repair module 606 includes:
a processing sequence determining submodule configured to determine the service attribution logic of the service to which the data to be repaired belongs, and determine the processing sequence corresponding to each service topic in the service according to the service attribution logic;
a second obtaining submodule configured to obtain, according to the processing sequence, snapshot data of each service topic in the service at the data repair node in the second data link.
Optionally, the data recovery apparatus further includes:
a monitoring module configured to monitor the business data report of the target business for abnormalities;
a business topic determining module configured to, when a data abnormality exists in the business data report, determine a start time node and an end time node of the abnormal data and determine a business topic corresponding to the abnormal data;
an instruction generation module configured to generate the data repair instruction based on the start time node, the end time node, and/or the business topic.
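The cooperation of the monitoring module, the business topic determining module, and the instruction generation module can be sketched as follows. The application does not state how an abnormality is detected, so the comparison against a reference report with a relative tolerance is purely an assumption made for illustration, as are the names `RepairInstruction` and `generate_instruction` and the sample counts.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RepairInstruction:
    """Hypothetical payload of a data repair instruction."""
    start: str  # start time node of the abnormal data (YYYYMMDDHH)
    end: str    # end time node of the abnormal data (YYYYMMDDHH)
    topic: str  # business topic of the abnormal data


def generate_instruction(topic: str,
                         report: Dict[str, int],
                         reference: Dict[str, int],
                         tolerance: float = 0.05) -> Optional[RepairInstruction]:
    """Flag hours whose reported value deviates from the reference by more
    than the tolerance, and wrap the first and last such hour in an
    instruction. Return None when the report shows no abnormality."""
    abnormal = sorted(
        hour for hour, expected in reference.items()
        if abs(report.get(hour, 0) - expected) > tolerance * max(expected, 1)
    )
    if not abnormal:
        return None
    return RepairInstruction(start=abnormal[0], end=abnormal[-1], topic=topic)


# Illustrative values only: the report undercounts clicks in hour 22.
report = {"2020081421": 100, "2020081422": 40, "2020081423": 118}
reference = {"2020081421": 100, "2020081422": 95, "2020081423": 120}
instruction = generate_instruction("click", report, reference)
```

In this sketch the small deviation in hour 2020081423 stays within the tolerance, so only hour 2020081422 is reported as abnormal.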
The foregoing is a schematic configuration of a data recovery apparatus according to the present embodiment. It should be noted that the technical solution of the data recovery apparatus and the technical solution of the data recovery method belong to the same concept, and details that are not described in detail in the technical solution of the data recovery apparatus can be referred to the description of the technical solution of the data recovery method.
FIG. 7 illustrates a block diagram of a computing device 700 provided according to an embodiment of the present application. The components of the computing device 700 include, but are not limited to, memory 710 and a processor 720. Processor 720 is coupled to memory 710 via bus 730, and database 750 is used to store data.
Computing device 700 also includes access device 740, access device 740 enabling computing device 700 to communicate via one or more networks 760. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. Access device 740 may include one or more of any type of network interface, e.g., a Network Interface Card (NIC), wired or wireless, such as an IEEE802.11 Wireless Local Area Network (WLAN) wireless interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the application, the above-described components of the computing device 700 and other components not shown in fig. 7 may also be connected to each other, for example, by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 7 is for purposes of example only and is not limiting as to the scope of the present application. Those skilled in the art may add or replace other components as desired.
Computing device 700 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 700 may also be a mobile or stationary server.
The memory 710 is configured to store computer-executable instructions, and the processor 720 is configured to execute the computer-executable instructions, wherein the processor 720 implements the steps of the data recovery method when executing the computer-executable instructions.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the data recovery method belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the data recovery method.
An embodiment of the present application also provides a computer-readable storage medium storing computer-executable instructions, which when executed by a processor, implement the steps of the data recovery method.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the data recovery method, and details that are not described in detail in the technical solution of the storage medium can be referred to the description of the technical solution of the data recovery method.
The foregoing description of specific embodiments of the present application has been presented. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present application embodiment is not limited by the described acts or sequences, because some steps may be performed in other sequences or simultaneously according to the present application embodiment. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that acts and modules referred to are not necessarily required to implement the embodiments of the application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present application disclosed above are intended only to aid in the explanation of the application. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments of the application and its practical application, to thereby enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.

Claims (12)

1. A method of data repair, comprising:
receiving a data repair instruction, wherein the data repair instruction comprises to-be-repaired data information of a first data link;
determining a data repair node for repairing the first data link according to the data information to be repaired and the data processing cycle of a second data link having a synchronous processing task with the first data link;
and acquiring snapshot data corresponding to the data repair node in the second data link, and performing repair operation on the data to be repaired in the first data link according to the snapshot data and the data repair strategy of the first data link.
2. The data repair method according to claim 1, wherein the determining, according to the data information to be repaired and a data processing cycle of a second data link having a synchronous processing task with the first data link, a data repair node for repairing the first data link includes:
and determining a data repair node for repairing the first data link according to an end time node of the data to be repaired contained in the data information to be repaired and the data processing period of the second data link.
3. The method according to claim 2, wherein the determining, according to the end time node of the data to be repaired included in the data information to be repaired and the data processing cycle of the second data link, a data repair node for repairing the first data link includes:
determining an operational status of the second data link at the end time node;
and if the running state is running, determining running end time corresponding to the running period to which the end time node belongs, and determining any time node which is greater than or equal to the running end time as the data repair node.
4. The data repair method according to any one of claims 1 to 3, wherein the first data link is a first real-time streaming link, the second data link is a batch streaming link, and the data information to be repaired includes an end time node of the data to be repaired;
correspondingly, the performing a repair operation on the data to be repaired in the first data link according to the snapshot data and the data repair policy of the first data link includes:
creating a second real-time streaming link according to the data repair strategy;
and synchronizing the snapshot data to the second real-time streaming link, and synchronizing the data generated after the end time node to the second real-time streaming link, so as to perform a repair operation on the data to be repaired in the first data link.
5. The data repair method of claim 4, wherein creating the second real-time streaming link according to the data repair policy comprises:
determining a computing component corresponding to the initial time node according to the initial time node of the data to be repaired contained in the data information to be repaired;
and adding a real-time computation consumer program in a message queue corresponding to an upstream component of the computation components to create the second real-time streaming link.
6. The data repair method according to any one of claims 1 to 3, wherein the first data link is a batch streaming link, the second data link is a real-time streaming link, and the data information to be repaired includes an end time node of the data to be repaired;
correspondingly, the performing a repair operation on the data to be repaired in the first data link according to the snapshot data and the data repair policy of the first data link includes:
and synchronizing the snapshot data to the batch processing flow link according to the data repair strategy so as to repair the data to be repaired in the batch processing flow link.
7. The data repair method according to claim 1, wherein the data information to be repaired includes a data type of the data to be repaired;
correspondingly, the obtaining of the snapshot data corresponding to the data repair node in the second data link includes:
and acquiring snapshot data, corresponding to the data type, of the data repair node in the second data link.
8. The method according to claim 1, wherein the obtaining snapshot data corresponding to the data repair node in the second data link includes:
determining service attribution logic of a service to which data to be repaired belongs, and determining a processing sequence corresponding to each service theme in the service according to the service attribution logic;
and acquiring snapshot data of each service theme in the service of the data repair node in the second data link according to the processing sequence.
9. The data repair method of claim 1, wherein, prior to receiving the data repair instruction, the method further comprises:
monitoring a business data report of a target business for abnormalities;
under the condition that a data abnormality exists in the business data report, determining a start time node and an end time node of abnormal data, and determining a business topic corresponding to the abnormal data;
generating the data repair instruction based on the start time node, the end time node, and/or the business topic.
10. A data repair apparatus, comprising:
a receiving module configured to receive a data repair instruction, wherein the data repair instruction comprises to-be-repaired data information of a first data link;
a determining module configured to determine a data repair node for repairing the first data link according to the to-be-repaired data information and a data processing cycle of a second data link having a synchronous processing task with the first data link;
and a repair module configured to acquire snapshot data corresponding to the data repair node in the second data link and perform a repair operation on the data to be repaired in the first data link according to the snapshot data and the data repair strategy of the first data link.
11. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions, wherein the processor, when executing the computer-executable instructions, performs the steps of the data repair method of any one of claims 1 to 9.
12. A computer-readable storage medium, characterized in that it stores computer instructions which, when executed by a processor, implement the steps of the data repair method of any one of claims 1 to 9.
CN202011308360.XA 2020-11-20 2020-11-20 Data restoration method and device Active CN112433997B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011308360.XA CN112433997B (en) 2020-11-20 2020-11-20 Data restoration method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011308360.XA CN112433997B (en) 2020-11-20 2020-11-20 Data restoration method and device

Publications (2)

Publication Number Publication Date
CN112433997A true CN112433997A (en) 2021-03-02
CN112433997B CN112433997B (en) 2023-07-04

Family

ID=74693009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011308360.XA Active CN112433997B (en) 2020-11-20 2020-11-20 Data restoration method and device

Country Status (1)

Country Link
CN (1) CN112433997B (en)

Also Published As

Publication number Publication date
CN112433997B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
US11368506B2 (en) Fault handling for computer nodes in stream computing system
US20180365264A1 (en) Telemetry system for a cloud synchronization system
CN110516971B (en) Anomaly detection method, device, medium and computing equipment
US20130007772A1 (en) Method and system for automated system migration
US11133989B2 (en) Automated remediation and repair for networked environments
US20100180155A1 (en) Dynamic testing of networks
US7555751B1 (en) Method and system for performing a live system upgrade
CN110032444B (en) Distributed system and distributed task processing method
CN108810127A (en) Disaster recovery method based on block chain and device
US11500763B1 (en) Distributed canary testing with test artifact caching
CN112527567A (en) System disaster tolerance method, device, equipment and storage medium
CN112256523A (en) Service data processing method and device
US20060282831A1 (en) Method and hardware node for customized upgrade control
Vizarreta et al. Dason: Dependability assessment framework for imperfect distributed sdn implementations
US9703693B1 (en) Regression testing system for software applications
CN111752545A (en) Stream computing method supporting data replay
CN111259066A (en) Server cluster data synchronization method and device
CN113227978A (en) Automatic anomaly detection in computer processing pipelines
Trivedi et al. Computing the number of calls dropped due to failures
CN113296991B (en) Abnormality detection method and device
CN112433997A (en) Data restoration method and device
CN108241543B (en) Method, service server and system for executing service operation breakpoint
CN105025179A (en) Method and system for monitoring service agents of call center
CN116400987A (en) Continuous integration method, device, electronic equipment and storage medium
CN116701043A (en) Heterogeneous computing system-oriented fault node switching method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant