CN111290881A - Data recovery method, device, equipment and storage medium - Google Patents


Info

Publication number
CN111290881A
Authority
CN
China
Prior art keywords
parallel
log
packet
logs
log packet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010071604.0A
Other languages
Chinese (zh)
Other versions
CN111290881B (en)
Inventor
王海龙
王蒙蒙
韩朱忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dameng Database Co Ltd
Original Assignee
Shanghai Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Dameng Database Co Ltd
Priority to CN202010071604.0A
Publication of CN111290881A
Application granted
Publication of CN111290881B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1438 Restarting or rejuvenating
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery

Abstract

The invention discloses a data recovery method, a data recovery device, data recovery equipment and a storage medium. The method comprises the following steps: reading parallel log packets from the online log file when the database is restarted after a fault has been eliminated; obtaining self-description information of each parallel log packet, wherein the self-description information comprises: the write start position of each parallel log within the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, the parallel log packet sequence number, and the length of the parallel log packet; and sorting the parallel logs according to the self-description information and performing data recovery sequentially according to the sorted parallel logs.

Description

Data recovery method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the field of databases, in particular to a data recovery method, a data recovery device, data recovery equipment and a storage medium.
Background
While a database system is running, it may encounter various faults, such as operating system faults or hardware faults. After the fault is eliminated and the database is restarted, the system can be restored to its state before the fault by means of the REDO log.
The REDO log records the modification operations that the database performs on data. Each REDO log generated by a data modification is identified by a new LSN (Log Sequence Number): the LSN is automatically incremented by 1 each time a REDO log is written, so each LSN value represents one database modification operation and the relative order of LSNs reflects the order in which the database was modified. When the REDO log is replayed, replay proceeds in ascending LSN order.
REDO logs generated by a database system are first placed in a log buffer and are written to the online log file when the log buffer is full or a transaction commits. Under high concurrency, having all sessions write logs into the same log buffer causes heavy contention. A common optimization is to use parallel logs: a separate log buffer is allocated for each working thread, the REDO logs generated by each working thread are written into that thread's own log buffer, and the buffers are finally merged and flushed to the online log. Because LSNs are globally unique, the REDO logs within any one log buffer are not guaranteed to have consecutive LSNs, so before the online log is written, the logs in all log buffers must be sorted by LSN. When the system is under heavy load and generates a large number of REDO logs, this sorting becomes a performance bottleneck.
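The effect described above can be illustrated with a minimal Python sketch. It is only a simulation under stated assumptions: the `LsnAllocator` class, the `simulate` function and the thread schedule are hypothetical names introduced for illustration and do not come from the patent. It shows that, with one global LSN counter and one buffer per working thread, each buffer holds increasing but non-consecutive LSNs, so a conventional design has to sort across buffers before flushing.

```python
import itertools
import threading


class LsnAllocator:
    """Hands out globally unique, increasing LSNs (hypothetical helper)."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            return next(self._counter)


def simulate(schedule, num_threads):
    """Append one LSN to the buffer of each thread id listed in `schedule`."""
    allocator = LsnAllocator()
    buffers = [[] for _ in range(num_threads)]
    for thread_id in schedule:
        buffers[thread_id].append(allocator.next())
    return buffers


# Assumed interleaving of three working threads generating REDO logs.
buffers = simulate([0, 1, 0, 2, 1, 0], num_threads=3)

# Each buffer is increasing but has gaps; the conventional approach
# sorts all buffers by LSN before writing the online log.
flushed = sorted(lsn for buf in buffers for lsn in buf)
```

In this sketch the sort in the last line is the step that runs on every flush in the conventional design, which is the cost the background section identifies.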
Disclosure of Invention
The embodiments of the invention provide a data recovery method, a data recovery device, data recovery equipment and a storage medium that avoid sorting parallel logs entirely while the system is running normally. The parallel logs within each parallel log packet are sorted, using the self-description information, only when the database restarts after a fault, which can effectively improve system performance in high-concurrency, high-load scenarios.
In a first aspect, an embodiment of the present invention provides a data recovery method, including:
reading parallel log packets from the online log file when the database is restarted after a fault has been eliminated;
obtaining self-description information of each parallel log packet, wherein the self-description information comprises: the write start position of each parallel log within the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, the parallel log packet sequence number, and the length of the parallel log packet;
and sorting the parallel logs according to the self-description information, and performing data recovery sequentially according to the sorted parallel logs.
Further, obtaining the self-description information of the parallel log packet includes:
obtaining the self-description information stored in the packet header of the parallel log packet.
Further, before the parallel log packet is read from the online log file when the database is restarted after the fault has been eliminated, the method further includes:
when any log packet buffer is full, creating a parallel log packet and allocating a parallel log packet sequence number to it;
copying the database logs in the log packet buffers into the parallel log packet in sequence;
counting and recording the self-description information of the parallel log packet;
and writing the self-description information into the packet header of the parallel log packet, and writing the parallel log packet into the online log file.
Further, before a parallel log packet is created when any log packet buffer is full, the method further includes:
writing the database logs generated by at least two working threads into log packet buffers, wherein each working thread is allocated one log packet buffer.
Further, the minimum log sequence value in the currently created parallel log packet is the maximum log sequence value in the previous parallel log packet plus one.
Further, sorting the parallel logs according to the self-description information and performing data recovery sequentially according to the sorted parallel logs includes:
reading out each parallel log in turn according to the self-description information;
sorting the parallel logs in ascending order of log sequence value;
and performing data recovery sequentially according to the sorted parallel logs.
Further, the parallel log packet sequence numbers increase in the order in which the parallel log packets are created.
In a second aspect, an embodiment of the present invention further provides a data recovery apparatus, where the apparatus includes:
a reading module, configured to read parallel log packets from the online log file when the database is restarted after a fault has been eliminated;
an obtaining module, configured to obtain self-description information of each parallel log packet, where the self-description information includes: the write start position of each parallel log within the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, the parallel log packet sequence number, and the length of the parallel log packet;
and a recovery module, configured to sort the parallel logs according to the self-description information and perform data recovery sequentially according to the sorted parallel logs.
Further, the obtaining module includes:
an information obtaining submodule, configured to obtain the self-description information stored in the packet header of the parallel log packet.
Further, the apparatus also includes:
a creating submodule, configured to create a parallel log packet when any log packet buffer is full and to allocate a parallel log packet sequence number to it;
a copying module, configured to copy the database logs in the log packet buffers into the parallel log packet in sequence;
a statistical module, configured to count and record the self-description information of the parallel log packet;
and a writing module, configured to write the self-description information into the packet header of the parallel log packet and to write the parallel log packet into the online log file.
In a third aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the data recovery method according to any one of the embodiments of the present invention.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the data recovery method according to any one of the embodiments of the present invention.
In the embodiments of the invention, when the database is restarted after a fault has been eliminated, parallel log packets are read from the online log file; self-description information of each parallel log packet is obtained, the self-description information comprising: the write start position of each parallel log within the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, the parallel log packet sequence number, and the length of the parallel log packet; and the parallel logs are sorted according to the self-description information and data recovery is performed sequentially according to the sorted parallel logs. Sorting of parallel logs during normal operation of the system is thereby avoided entirely; the parallel logs within each parallel log packet are sorted, using the self-description information, only when the database restarts after a fault, which can effectively improve system performance in high-concurrency, high-load scenarios.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a flow chart of a data recovery method according to a first embodiment of the present invention;
FIG. 2A is a flowchart of a data recovery method according to a second embodiment of the present invention;
FIG. 2B is a diagram of a parallel log packet format in a second embodiment of the invention;
FIG. 2C is a flow chart of parallel log packet generation in a second embodiment of the present invention;
FIG. 2D is a flowchart of parallel log packet replay in the second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a data recovery apparatus according to a third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a computer device in the fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Example one
Fig. 1 is a flowchart of a data recovery method according to the first embodiment of the present invention. This embodiment is applicable to data recovery scenarios. The method may be executed by the data recovery apparatus according to an embodiment of the present invention, and the apparatus may be implemented in software and/or hardware. As shown in fig. 1, the method specifically includes the following steps:
S110, when the database is restarted after the fault is eliminated, reading the parallel log packet from the online log file.
The online log file is generally created when the database is initialized and is reused cyclically; it stores the log packets generated during normal operation of the system.
Each parallel log packet stores at least two parallel logs, and only complete parallel log packets are read. It should be noted that, because of the fault, the last parallel log packet may not have been completely written to the online log file; when parallel log packets are read from the online log file, any incomplete parallel log packet is discarded and only the complete parallel log packets are read.
The fault may be an operating system fault, a hardware fault, or the like, which is not limited in this embodiment of the present invention.
Specifically, after the fault is eliminated and the database is restarted, the parallel log packets in the online log file are read.
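The handling of an incomplete trailing packet in S110 can be sketched as follows. This is a minimal illustration, not the patent's on-disk format: it assumes, hypothetically, that every packet begins with a 4-byte little-endian field holding the total packet length, and the names `make_packet` and `read_complete_packets` are introduced here for illustration only. A real implementation would probably also need a checksum or similar to detect a torn write whose length field happens to look valid.

```python
import struct

LEN_FMT = "<I"  # assumption: packet starts with its total length, 4 bytes
LEN_SIZE = struct.calcsize(LEN_FMT)


def make_packet(payload: bytes) -> bytes:
    """Build a packet: length prefix (counting itself) followed by payload."""
    return struct.pack(LEN_FMT, LEN_SIZE + len(payload)) + payload


def read_complete_packets(data: bytes) -> list:
    """Return every complete packet in `data`, discarding a truncated tail."""
    packets = []
    offset = 0
    while offset + LEN_SIZE <= len(data):
        (length,) = struct.unpack_from(LEN_FMT, data, offset)
        if length < LEN_SIZE or offset + length > len(data):
            break  # last packet was not completely written: discard it
        packets.append(data[offset:offset + length])
        offset += length
    return packets


# The second packet loses its last two bytes, as if the fault hit mid-write.
online_log = make_packet(b"abc") + make_packet(b"defgh")[:-2]
recovered = read_complete_packets(online_log)
```

Here `recovered` contains only the first packet, which mirrors the rule that incomplete parallel log packets are discarded before replay.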
S120, self-description information of the parallel log packet is obtained, wherein the self-description information comprises: the writing starting position of each path of parallel log in the parallel log packet, the number of the parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, the parallel log packet sequence number and the length of the parallel log packet.
The self-description information is obtained in two ways: by counting and recording values from the parallel logs stored in the parallel log packet, and by allocation when the parallel log packet is created. The write start position of each parallel log within the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum and maximum log sequence values in the parallel log packet, and the length of the parallel log packet are counted and recorded from the stored parallel logs; the parallel log packet sequence number is allocated when the parallel log packet is created.
N working threads correspond to N paths of parallel logs.
The log sequence value is globally unique.
Each parallel log packet has a parallel log packet sequence number that uniquely identifies it; the sequence numbers increase or decrease monotonically so that the order of the log packets can be identified.
Optionally, the parallel log packet sequence numbers increase in the order in which the parallel log packets are created.
For example, the first parallel log packet created may be given sequence number 001, the second 002, the third 003, and the fourth 004; alternatively, the sequence numbers may decrease, so that the first parallel log packet created is given sequence number 100000, the second 99999, and the third 99998.
Specifically, the writing initial position of each path of parallel log in the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, the parallel log packet sequence number and the length of the parallel log packet are obtained.
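As a rough illustration of the self-description information listed in S120, the header can be modelled as a small record that round-trips through a binary encoding. The field widths, byte order and field order below are assumptions made for this sketch; the patent names the fields but does not specify a layout, and `PacketHeader` is a hypothetical name.

```python
import struct
from dataclasses import dataclass
from typing import List

# Assumed fixed part: seq_no (u64), length (u32), min_lsn (u64),
# max_lsn (u64), number of parallel logs (u16); little-endian, unpadded.
FIXED_FMT = "<QIQQH"
FIXED_SIZE = struct.calcsize(FIXED_FMT)


@dataclass
class PacketHeader:
    seq_no: int               # parallel log packet sequence number
    length: int               # length of the parallel log packet
    min_lsn: int              # minimum log sequence value in the packet
    max_lsn: int              # maximum log sequence value in the packet
    start_offsets: List[int]  # write start position of each parallel log

    @property
    def log_count(self) -> int:
        """Number of parallel logs stored in the packet."""
        return len(self.start_offsets)

    def pack(self) -> bytes:
        fixed = struct.pack(FIXED_FMT, self.seq_no, self.length,
                            self.min_lsn, self.max_lsn, self.log_count)
        offsets = struct.pack(f"<{self.log_count}I", *self.start_offsets)
        return fixed + offsets

    @classmethod
    def unpack(cls, data: bytes) -> "PacketHeader":
        seq_no, length, min_lsn, max_lsn, count = struct.unpack_from(
            FIXED_FMT, data, 0)
        offsets = list(struct.unpack_from(f"<{count}I", data, FIXED_SIZE))
        return cls(seq_no, length, min_lsn, max_lsn, offsets)


header = PacketHeader(seq_no=7, length=128, min_lsn=101, max_lsn=140,
                      start_offsets=[42, 70, 100])
decoded = PacketHeader.unpack(header.pack())
```

Storing the number of parallel logs before the offset array lets a reader know how many offsets to decode, which is one plausible reason the count is part of the self-description information.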
And S130, sequencing the parallel logs according to the self-description information, and sequentially recovering data according to the sequenced parallel logs.
Specifically, each parallel log in the parallel log packet is read in turn according to the self-description information. Within each parallel log the LSNs are increasing but not necessarily consecutive. The parallel logs that have been read are merge-sorted in ascending LSN order (after merging, the LSNs are consecutive and increasing) and replayed in that order, thereby recovering the data.
Optionally, sorting the parallel logs according to the self-description information and performing data recovery sequentially according to the sorted parallel logs includes:
reading out each parallel log in turn according to the self-description information.
The parallel logs are stored in the parallel log packet.
Specifically, each parallel log is read in turn from the parallel log packet according to the self-description information; for example, if the parallel log packet contains M parallel logs, the M parallel logs are read from it one after another.
sorting the parallel logs in ascending order of log sequence value;
and performing data recovery sequentially according to the sorted parallel logs.
Specifically, parallel logs must be replayed in ascending LSN order, so they need to be sorted before replay. In this embodiment the REDO logs are stored in the form of parallel log packets: the parallel logs generated by each working thread are stored directly in the parallel log packet, and the parallel logs are not sorted relative to one another inside the packet. Within each parallel log in a packet the LSNs are increasing but not necessarily consecutive, while across parallel log packets the LSNs are guaranteed to increase consecutively. When REDO logs are replayed after a fault, only the parallel logs inside each parallel log packet need to be sorted and replayed; no log sorting is required while the system is running normally. In high-concurrency, high-load scenarios this can effectively improve system performance.
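Because every parallel log inside a packet is already in ascending LSN order, the sorting in S130 can be done as a k-way merge rather than a general sort. The sketch below uses Python's standard `heapq.merge` for this; the record shape `(lsn, operation)`, the function names and the `apply` callback are assumptions introduced for illustration, and the patent itself does not prescribe a particular merge algorithm.

```python
import heapq
from typing import Callable, Iterator, List, Tuple

Record = Tuple[int, str]  # (LSN, operation); hypothetical record shape


def merge_parallel_logs(parallel_logs: List[List[Record]]) -> Iterator[Record]:
    """Lazily merge already-ascending parallel logs into one LSN-ordered stream."""
    return heapq.merge(*parallel_logs, key=lambda record: record[0])


def replay_packet(parallel_logs: List[List[Record]],
                  apply: Callable[[int, str], None]) -> None:
    """Replay one packet's logs in ascending LSN order, checking for gaps."""
    expected = None
    for lsn, operation in merge_parallel_logs(parallel_logs):
        if expected is not None and lsn != expected:
            raise ValueError(f"LSN gap: expected {expected}, got {lsn}")
        apply(lsn, operation)
        expected = lsn + 1


# Three parallel logs from one packet: each ascending, none consecutive.
packet_logs = [
    [(1, "a"), (4, "d")],
    [(2, "b"), (3, "c"), (6, "f")],
    [(5, "e")],
]
applied = []
replay_packet(packet_logs, lambda lsn, op: applied.append(op))
```

The gap check relies on the property stated above that the LSNs are consecutive once the parallel logs of a packet have been merged.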
In the prior art, the logs in all log buffers must be sorted by LSN even while the system is running normally; when the system is under heavy load and generates a large number of REDO logs, this sorting becomes a performance bottleneck.
According to the technical solution of this embodiment, when the database is restarted after the fault is eliminated, parallel log packets are read from the online log file; self-description information of each parallel log packet is obtained, the self-description information comprising: the write start position of each parallel log within the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, the parallel log packet sequence number, and the length of the parallel log packet; and the parallel logs are sorted according to the self-description information and data recovery is performed sequentially according to the sorted parallel logs. Sorting of parallel logs during normal operation of the system is thereby avoided entirely; the parallel logs within each parallel log packet are sorted, using the self-description information, only when the database restarts after a fault, which can effectively improve system performance in high-concurrency, high-load scenarios.
Example two
Fig. 2A is a flowchart of a data recovery method in a second embodiment of the present invention. This embodiment is optimized on the basis of the foregoing embodiment. In this embodiment, obtaining the self-description information of the parallel log packet includes: obtaining the self-description information stored in the packet header of the parallel log packet. Before the parallel log packet is read from the online log file when the database is restarted after the fault is eliminated, the method further includes: when any log packet buffer is full, creating a parallel log packet and allocating a parallel log packet sequence number to it; copying the database logs in the log packet buffers into the parallel log packet in sequence; counting and recording the self-description information of the parallel log packet; and writing the self-description information into the packet header of the parallel log packet, and writing the parallel log packet into the online log file.
As shown in fig. 2A, the method of this embodiment specifically includes the following steps:
S210, when any log packet buffer is full, a parallel log packet is created and a parallel log packet sequence number is allocated to it.
Each working thread is allocated one log packet buffer. The number of working threads is configured by the user; N working threads correspond to N paths of parallel logs, and N log packet buffers are allocated accordingly. The REDO logs generated by each working thread are written into that thread's log packet buffer, and the LSNs within each log packet buffer are increasing but not necessarily consecutive.
Specifically, when the log packet buffer of any path is full, a new parallel log packet is created and a unique parallel log packet sequence number is allocated to it.
Optionally, before a parallel log packet is created when any log packet buffer is full, the method further includes:
writing the database logs generated by at least two working threads into log packet buffers, wherein each working thread is allocated one log packet buffer.
Optionally, the minimum log sequence value in the currently created parallel log packet is the maximum log sequence value in the previous parallel log packet plus one.
Specifically, the minimum LSN of each parallel log packet is the maximum LSN of the previous parallel log packet plus 1, which ensures that the LSNs increase consecutively from one parallel log packet to the next. Each parallel log packet has a parallel log packet sequence number that uniquely identifies it, and the sequence numbers increase in order so that the order of the parallel log packets can be identified.
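The continuity property between packets lends itself to a simple consistency check at recovery time. The following sketch is illustrative only: `PacketInfo` and `packets_are_contiguous` are hypothetical names, and the patent does not state that such a check is performed. It orders packets by sequence number and verifies that each packet's minimum LSN equals the previous packet's maximum LSN plus one.

```python
from typing import List, NamedTuple


class PacketInfo(NamedTuple):
    """Subset of the self-description information (hypothetical container)."""
    seq_no: int
    min_lsn: int
    max_lsn: int


def packets_are_contiguous(packets: List[PacketInfo]) -> bool:
    """True if min_lsn of every packet is max_lsn of its predecessor plus 1."""
    ordered = sorted(packets, key=lambda p: p.seq_no)
    return all(cur.min_lsn == prev.max_lsn + 1
               for prev, cur in zip(ordered, ordered[1:]))


good = [PacketInfo(2, 11, 25), PacketInfo(1, 1, 10), PacketInfo(3, 26, 40)]
bad = [PacketInfo(1, 1, 10), PacketInfo(2, 12, 25)]  # LSN 11 is missing

good_ok = packets_are_contiguous(good)
bad_ok = packets_are_contiguous(bad)
```

Because of this property, packets can be replayed one after another in sequence-number order, and sorting is only ever needed inside a single packet.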
S220, sequentially copying the database logs in the log packet cache area to the parallel log packets.
The database log is a REDO log.
Specifically, the database logs in the log packet buffers are copied in sequence into the newly created parallel log packet. For example, a log copy operation may be triggered that copies the REDO logs in the N log packet buffers into the newly created parallel log packet one buffer after another; the newly created parallel log packet then stores M segments of REDO logs, where M ≤ N, because a log packet buffer may be empty and have no logs to copy.
And S230, counting and recording self-description information of the parallel log packets.
Specifically, based on the REDO logs copied into the newly created parallel log packet, the following are counted: the write start position of each parallel log within the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, and the length of the parallel log packet. These values are recorded together with the parallel log packet sequence number.
S240, writing the self-description information into the packet head of the parallel log packet, and writing the parallel log packet into the online log file.
Specifically, as shown in fig. 2B, the writing start position of each path of parallel log in the packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, the parallel log packet sequence number, and the length of the parallel log packet are recorded in the packet header of the parallel log packet, and the parallel log packet is written in the online log file.
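Steps S210 to S240 can be sketched together as one packet-building routine. This is a simplified model under several assumptions that are not in the patent: records are `(lsn, payload)` tuples, each record is encoded as an 8-byte LSN plus a 4-byte payload length plus the payload, start offsets are measured from the beginning of the packet body, the recorded length covers the body only, and the header is kept as a dictionary instead of being serialized. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, bytes]  # (LSN, payload); hypothetical record shape


def encode_record(lsn: int, payload: bytes) -> bytes:
    """Assumed record encoding: 8-byte LSN, 4-byte payload length, payload."""
    return lsn.to_bytes(8, "little") + len(payload).to_bytes(4, "little") + payload


def build_parallel_log_packet(seq_no: int, buffers: List[List[Record]]) -> Dict:
    """Copy every non-empty buffer into a new packet and collect its statistics."""
    body = bytearray()
    start_offsets: List[int] = []
    lsns: List[int] = []
    for buf in buffers:
        if not buf:
            continue  # an empty log packet buffer is skipped, so M <= N
        start_offsets.append(len(body))  # write start position of this log
        for lsn, payload in buf:
            body += encode_record(lsn, payload)
            lsns.append(lsn)
        buf.clear()  # the buffer is emptied once its logs are copied
    if not lsns:
        raise ValueError("no REDO logs to copy")
    header = {
        "seq_no": seq_no,
        "length": len(body),  # body bytes only in this sketch
        "min_lsn": min(lsns),
        "max_lsn": max(lsns),
        "log_count": len(start_offsets),
        "start_offsets": start_offsets,
    }
    return {"header": header, "body": bytes(body)}


# N = 3 buffers, the middle one empty, so the packet stores M = 2 logs.
buffers = [[(1, b"a"), (3, b"bc")], [], [(2, b"d")]]
packet = build_parallel_log_packet(seq_no=5, buffers=buffers)
```

Note that the routine only appends buffers one after another and tracks running statistics; it never compares LSNs across buffers, which is the point of deferring the sort to recovery time.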
S250, obtaining self-description information stored in a header of the parallel log packet, where the self-description information includes: the writing starting position of each path of parallel log in the parallel log packet, the number of the parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, the parallel log packet sequence number and the length of the parallel log packet.
And S260, sequencing the parallel logs according to the self-description information, and sequentially recovering data according to the sequenced parallel logs.
In a specific example, as shown in fig. 2C, parallel log packets are generated from the N paths of parallel logs as follows. When the log packet buffer of any path is full, a log copy action is triggered. A new parallel log packet is created and allocated a unique parallel log packet sequence number. The REDO logs in the N log packet buffers are copied into the new parallel log packet in sequence and each log packet buffer is emptied; a log packet buffer that contains no REDO logs is skipped. During copying, the self-description information of the parallel log packet is counted and recorded, including the write start position of each parallel log within the parallel log packet, the number M (M ≤ N) of parallel logs actually stored in the parallel log packet, the minimum LSN and maximum LSN in the parallel log packet, and the length of the parallel log packet. After copying, the self-description information (the parallel log packet sequence number, the packet length, the minimum LSN, the maximum LSN, the number of parallel logs in the packet, and the start position of each parallel log within the packet) is written into the packet header, and the parallel log packet is written into the online log file.
The REDO logs are not used while the system is operating normally. They are needed only when the database restarts after a fault and the REDO logs must be replayed to restore the system to its state before the fault; at that point the REDO logs in each parallel log packet are sorted and replayed in ascending LSN order.
As shown in fig. 2D, the parallel log packet replay steps are as follows: a complete parallel log packet is read from the online log file; the number of parallel logs in the packet (M) and the start offset of each parallel log are taken from the packet header; the M parallel logs in the packet are read out in turn; and the M parallel logs are sorted in ascending LSN order and replayed sequentially.
According to this technical solution, the REDO logs are stored in self-describing log packets: the parallel logs generated by each working thread are stored directly in the log packet, the parallel logs are no longer sorted relative to one another, and the LSNs are guaranteed to increase consecutively from one log packet to the next. Sorting of parallel logs during normal operation of the system is thereby avoided entirely; the parallel logs within each parallel log packet are sorted, using the self-description information, only when the database restarts after a fault, which can effectively improve system performance in high-concurrency, high-load scenarios.
Example three
Fig. 3 is a schematic structural diagram of a data recovery apparatus according to a third embodiment of the present invention. This embodiment is applicable to data recovery scenarios. The apparatus may be implemented in software and/or hardware and may be integrated in any device that provides a data recovery function. As shown in fig. 3, the apparatus specifically includes: a reading module 310, an obtaining module 320, and a recovery module 330.
The reading module 310 is configured to read parallel log packets from the online log file when the database is restarted after the fault is eliminated;
the obtaining module 320 is configured to obtain self-description information of each parallel log packet, where the self-description information includes: the write start position of each parallel log within the parallel log packet, the number of parallel logs stored in the parallel log packet, the minimum log sequence value, the maximum log sequence value, the parallel log packet sequence number, and the length of the parallel log packet;
and the recovery module 330 is configured to sort the parallel logs according to the self-description information and perform data recovery sequentially according to the sorted parallel logs.
Optionally, the obtaining module includes:
an information obtaining submodule, configured to obtain the self-description information stored in the packet header of the parallel log packet.
Optionally, the apparatus further includes:
a creating module, configured to create a parallel log packet when any log packet cache area is full, and to assign a parallel log packet sequence number to the parallel log packet;
a copying module, configured to copy the database logs in the log packet cache areas into the parallel log packet in sequence;
a statistical module, configured to count and record the self-description information of the parallel log packet;
and a writing module, configured to write the self-description information into the packet header of the parallel log packet and to write the parallel log packet into the online log file.
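The packet creation steps listed above can be sketched as follows. This is a simplified illustration under stated assumptions: buffers are Python lists of `(lsn, payload)` pairs, start positions are record indices rather than byte offsets, and the final write to the online log file is omitted. `ParallelLogPacket`, `seal_packet` and `_packet_seq` are hypothetical names.

```python
from dataclasses import dataclass, field
from itertools import count

_packet_seq = count(1)  # hypothetical generator of packet sequence numbers


@dataclass
class ParallelLogPacket:
    seq_no: int                                   # parallel log packet sequence number
    offsets: list = field(default_factory=list)   # start position of each path
    num_paths: int = 0                            # number of parallel logs stored
    min_lsn: int = 0                              # minimum log sequence value
    max_lsn: int = 0                              # maximum log sequence value
    length: int = 0                               # packet length (in records here)
    body: list = field(default_factory=list)      # copied (lsn, payload) records


def seal_packet(buffers):
    """Create a packet from per-thread buffers without sorting across paths."""
    if not any(buffers):
        raise ValueError("no logs to seal")
    packet = ParallelLogPacket(seq_no=next(_packet_seq))
    for buf in buffers:
        if not buf:
            continue
        packet.offsets.append(len(packet.body))  # where this path starts
        packet.body.extend(buf)                  # copy in buffer order
        buf.clear()                              # buffer can be reused
    lsns = [lsn for lsn, _ in packet.body]
    packet.num_paths = len(packet.offsets)
    packet.min_lsn, packet.max_lsn = min(lsns), max(lsns)
    packet.length = len(packet.body)
    return packet
```

The point of the design is visible in `seal_packet`: the only per-record work at write time is a copy and a running minimum and maximum, while any ordering across paths is deferred to recovery.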
Optionally, the apparatus further includes:
a log writing module, configured to write the database logs generated by at least two working threads into log packet cache areas, where each working thread is allocated one log packet cache area.
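The per-thread cache areas can be illustrated with a small threading sketch. The source text does not say how log sequence values are allocated, so the lock-protected counter in `next_lsn` is purely an assumption, as are the names `worker` and `run_workers`.

```python
import threading
from itertools import count

_lsn_lock = threading.Lock()
_lsn_source = count(1)


def next_lsn():
    """Allocate a globally increasing LSN (assumed allocation scheme)."""
    with _lsn_lock:
        return next(_lsn_source)


def worker(buffer, n_logs, tag):
    """Append logs to this thread's private buffer; no cross-thread sorting."""
    for i in range(n_logs):
        buffer.append((next_lsn(), f"{tag}-{i}".encode()))


def run_workers(n_threads=2, n_logs=100):
    """Run the working threads, each with its own log packet cache area."""
    buffers = [[] for _ in range(n_threads)]
    threads = [
        threading.Thread(target=worker, args=(buf, n_logs, f"t{k}"))
        for k, buf in enumerate(buffers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffers
```

Because each thread writes only to its own buffer, the buffers themselves need no locking in this sketch; the LSN counter is the only shared state.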
Optionally, the recovery module is specifically configured to:
reading out each path of parallel logs in sequence according to the self-description information;
sorting the parallel logs in ascending order of log sequence value;
and performing data recovery in sequence according to the sorted parallel logs.
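If each path of parallel logs is already in ascending order of log sequence value (plausible when a working thread appends its logs in generation order, but an assumption not stated in the text), the sorting step can be done as a k-way merge rather than a full sort. The sketch below uses the standard-library `heapq.merge`; `merge_paths`, `recover` and the `apply` callback are hypothetical names.

```python
import heapq


def merge_paths(paths):
    """Lazily yield (lsn, payload) records from all paths in ascending LSN order.

    Assumes every individual path is already sorted by LSN.
    """
    return heapq.merge(*paths, key=lambda rec: rec[0])


def recover(paths, apply):
    """Replay every record in LSN order by calling apply(lsn, payload)."""
    replayed = 0
    for lsn, payload in merge_paths(paths):
        apply(lsn, payload)
        replayed += 1
    return replayed
```

When the per-path ordering assumption does not hold, a plain sort over all records by log sequence value gives the same replay order.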
The apparatus can execute the method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to that method.
According to the technical scheme of this embodiment, a parallel log packet is read from the online log file when the database is restarted after the fault is eliminated, and self-description information of the parallel log packet is obtained, where the self-description information includes: the write start position of each path of parallel logs in the parallel log packet, the number of parallel logs stored in the parallel log packet, a minimum log sequence value, a maximum log sequence value, a parallel log packet sequence number and the length of the parallel log packet. The parallel logs are sorted according to the self-description information, and data recovery is performed in sequence according to the sorted parallel logs. Parallel log sorting during normal operation of the system is thereby avoided entirely; the parallel logs in a parallel log packet are sorted, using the self-description information, only when the database is restarted after a fault, which can effectively improve system performance in high-concurrency, high-pressure operating scenarios.
EXAMPLE IV
Fig. 4 is a schematic structural diagram of a computer device in the fourth embodiment of the present invention. Fig. 4 shows a block diagram of an exemplary computer device 12 suitable for implementing embodiments of the present invention. The computer device 12 shown in Fig. 4 is only one example and should not impose any limitation on the functionality or scope of use of embodiments of the present invention.
As shown in FIG. 4, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, and commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. In the computer device 12 of the present embodiment, the display 24 is not provided as a separate body but is embedded in the mirror surface, and when the display surface of the display 24 is not displayed, the display surface of the display 24 and the mirror surface are visually integrated. Also, computer device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and performs data processing by running programs stored in the system memory 28, for example implementing the data recovery method provided by an embodiment of the present invention: reading a parallel log packet from the online log file when the database is restarted after the fault is eliminated; obtaining self-description information of the parallel log packet, wherein the self-description information comprises: the write start position of each path of parallel logs in the parallel log packet, the number of parallel logs stored in the parallel log packet, a minimum log sequence value, a maximum log sequence value, a parallel log packet sequence number and the length of the parallel log packet; and sorting the parallel logs according to the self-description information, and performing data recovery in sequence according to the sorted parallel logs.
EXAMPLE V
The fifth embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the data recovery method provided by any embodiment of the present invention: reading a parallel log packet from the online log file when the database is restarted after the fault is eliminated; obtaining self-description information of the parallel log packet, wherein the self-description information comprises: the write start position of each path of parallel logs in the parallel log packet, the number of parallel logs stored in the parallel log packet, a minimum log sequence value, a maximum log sequence value, a parallel log packet sequence number and the length of the parallel log packet; and sorting the parallel logs according to the self-description information, and performing data recovery in sequence according to the sorted parallel logs.
Any combination of one or more computer-readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for data recovery, comprising:
reading a parallel log packet from the online log file when the database is restarted after the fault is eliminated;
obtaining self-description information of the parallel log packet, wherein the self-description information comprises: the write start position of each path of parallel logs in the parallel log packet, the number of parallel logs stored in the parallel log packet, a minimum log sequence value, a maximum log sequence value, a parallel log packet sequence number and the length of the parallel log packet;
and sequencing the parallel logs according to the self-description information, and sequentially recovering data according to the sequenced parallel logs.
2. The method of claim 1, wherein obtaining self-description information for the parallel log packets comprises:
obtaining the self-description information stored in the packet header of the parallel log packet.
3. The method of claim 2, wherein before reading the parallel log packet from the online log file when the database is restarted after the fault is eliminated, the method further comprises:
when any log packet cache area is full, creating a parallel log packet and assigning a parallel log packet sequence number to the parallel log packet;
copying the database logs in the log packet cache areas into the parallel log packet in sequence;
counting and recording the self-description information of the parallel log packet;
and writing the self-description information into the packet header of the parallel log packet, and writing the parallel log packet into the online log file.
4. The method of claim 3, wherein before creating a parallel log packet when any log packet cache area is full, the method further comprises:
writing the database logs generated by at least two working threads into log packet cache areas, wherein each working thread is allocated one log packet cache area.
5. The method of claim 3, wherein a minimum log sequence value within a currently created parallel log packet is a maximum log sequence value within a previous parallel log packet plus one.
6. The method of claim 1, wherein sorting the parallel logs according to the self-description information and performing data recovery in sequence according to the sorted parallel logs comprises:
reading out each path of parallel logs in sequence according to the self-description information;
sorting the parallel logs in ascending order of log sequence value;
and performing data recovery in sequence according to the sorted parallel logs.
7. The method according to claim 1, wherein the parallel log packet sequence numbers are sequentially incremented according to the order of creation of the parallel log packets.
8. A data recovery apparatus, comprising:
the reading module is used for reading the parallel log packets from the online log file when the database is restarted after the fault is eliminated;
an obtaining module, configured to obtain self-description information of the parallel log packet, where the self-description information includes: the write start position of each path of parallel logs in the parallel log packet, the number of parallel logs stored in the parallel log packet, a minimum log sequence value, a maximum log sequence value, a parallel log packet sequence number and the length of the parallel log packet;
and the recovery module is used for sequencing the parallel logs according to the self-description information and sequentially recovering data according to the sequenced parallel logs.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1-7 when executing the program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202010071604.0A 2020-01-21 2020-01-21 Data recovery method, device, equipment and storage medium Active CN111290881B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010071604.0A CN111290881B (en) 2020-01-21 2020-01-21 Data recovery method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010071604.0A CN111290881B (en) 2020-01-21 2020-01-21 Data recovery method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111290881A (en) 2020-06-16
CN111290881B (en) 2023-09-19

Family

ID=71023428

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010071604.0A Active CN111290881B (en) 2020-01-21 2020-01-21 Data recovery method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111290881B (en)

Also Published As

Publication number Publication date
CN111290881B (en) 2023-09-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant