CN113868027B - Data snapshot method and device - Google Patents

Info

Publication number
CN113868027B
Authority
CN
China
Prior art keywords
time
snapshot
target system
nodes
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111447801.9A
Other languages
Chinese (zh)
Other versions
CN113868027A (en
Inventor
汪峰
黄岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunhe Enmo Beijing Information Technology Co ltd
Original Assignee
Yunhe Enmo Beijing Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunhe Enmo Beijing Information Technology Co ltd filed Critical Yunhe Enmo Beijing Information Technology Co ltd
Priority to CN202111447801.9A priority Critical patent/CN113868027B/en
Publication of CN113868027A publication Critical patent/CN113868027A/en
Application granted granted Critical
Publication of CN113868027B publication Critical patent/CN113868027B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore

Abstract

The invention discloses a data snapshot method and device. The method comprises: receiving a snapshot instruction for performing a data snapshot on a target system, wherein the snapshot instruction comprises the snapshot time of the snapshot; controlling a plurality of nodes of the target system to close their input/output ports before the snapshot time, wherein the clocks of the plurality of nodes are synchronized by the same time synchronization device; and, when the data on the target system is in a consistent state as of the snapshot time, performing the data snapshot on the target system in response to the snapshot instruction. The invention solves the technical problem in the prior art that, when a system snapshot is taken in a distributed system, the input/output ports of the nodes must be suspended, which causes delay and low snapshot efficiency.

Description

Data snapshot method and device
Technical Field
The invention relates to the field of data snapshot, in particular to a data snapshot method and device.
Background
A snapshot is an image of the data of a storage system at a certain point in time. The main function of a snapshot is online data backup and recovery: when an application failure or file corruption occurs on the storage device, data can be recovered quickly to its state at an available point in time. A snapshot also provides storage users with another data access channel, so that while the original data is undergoing online application processing, users can access the snapshot data and use the snapshot for work such as testing. For any storage system, whether high-end, mid-range, or low-end, snapshots become an indispensable function once the system is used online. A problem often encountered with snapshots is that the snapshot data is incomplete, so that data is lost after rolling back to the snapshot. This is in fact a snapshot data consistency problem.
Snapshot data consistency is mainly divided into crash consistency and application consistency. With crash consistency, the cached data of the file system is not stored in the snapshot when the snapshot is created, so the snapshot data is inconsistent with the data at the current point in time, and this inconsistency may cause data loss when the snapshot is used for rollback. This is similar to a sudden power failure: when the system loses power, data in memory has not yet been written to the persistent storage medium, which leaves the data in the storage system inconsistent. For example, in current file systems, metadata of a file, such as where on disk the data will be written, must be written before the actual data write takes place. When writing a block of data, the system allocates storage space for it on the disk, modifies the disk metadata to record which space is in use, and modifies the file metadata to record the location of the file data. The order of these two IO operations does not matter, but they must be performed as one atomic operation. If disk space is allocated but a power failure prevents it from being recorded in the file metadata, the allocated space cannot be used and is wasted, and space must be requested again when the data is actually written. Conversely, if the allocated location is recorded in the file metadata but the disk metadata is not modified, the same disk space will be allocated again when another file needs space, resulting in data loss. This is the data inconsistency that results from a crash.
Unlike crash consistency, application consistency means that the database or application files are first brought to a state in which they no longer change before the underlying snapshot is taken. For a database application, for example, data in the closed state is necessarily application consistent, because no new transaction will be opened. If the service allows the database to be closed for a long time, directly snapshotting the database files yields an application-consistent snapshot; if the service only allows the database to be closed briefly, reopening the database immediately after the data files have been snapshotted also yields an application-consistent snapshot.
Fig. 1 is a schematic diagram of recovering data according to a snapshot log in the prior art. As shown in Fig. 1, a file system addresses crash consistency by recording each step in a log: data is first written to the log, and is actually written to the persistent medium only after it has been committed. When a crash occurs, files can be recovered from the content recorded in the log: a redo operation is performed on data that was committed but had not yet been written to the hard disk, rewriting it, and an undo operation is performed on transactions that were only partially executed.
Snapshots differ from a file system crash in that no new IO is generated after a file system crashes, whereas new data is still being written while a snapshot is created. One solution is to suspend IO operations on the system (mainly write operations, since read operations do not cause data inconsistency in snapshot volumes) and, after snapshot creation is complete, to resume IO by consulting the operation sequence recorded in the log.
In a distributed environment, a plurality of nodes jointly form the storage system, so creating a snapshot is not an operation on a single storage node: all storage nodes must be notified to suspend IO. In the prior art, a scheduling center sends a suspend-IO command to the storage nodes, each node notifies the scheduling center that snapshot creation can start after it has completed the suspend-IO operation, and the storage nodes resume IO after the snapshot has been created.
In existing distributed storage systems, when the scheduling center notifies each node to create a snapshot, it must wait for each node's IO to be suspended successfully, and the system proceeds to create the snapshot only when all nodes have been suspended successfully. This may cause storage nodes to suspend IO for an excessively long time. While IO is suspended, the services the node provides externally are suspended as well, so the period should not be too long; an overly long IO suspension degrades the user's experience of the services provided by the node.
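The redo/undo recovery described above can be sketched as follows. This is a minimal illustration, not the mechanism of any particular file system: the `LogRecord` fields, the `recover` function, and the dictionary standing in for the disk are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class LogRecord:
    txn_id: int
    key: str
    old_value: Optional[str]  # value before the write (None if the key was new)
    new_value: str            # value the transaction intended to write


def recover(disk: Dict[str, str], log: List[LogRecord], committed: Set[int]) -> Dict[str, str]:
    """Return the disk contents after replaying the log following a crash."""
    state = dict(disk)
    # Redo: rewrite data of committed transactions in log order.
    for rec in log:
        if rec.txn_id in committed:
            state[rec.key] = rec.new_value
    # Undo: roll back partially executed transactions in reverse log order.
    for rec in reversed(log):
        if rec.txn_id not in committed:
            if rec.old_value is None:
                state.pop(rec.key, None)
            else:
                state[rec.key] = rec.old_value
    return state


# Transaction 1 committed but "c" never reached the disk; transaction 2 was cut off.
disk_after_crash = {"a": "1", "b": "partial"}
log = [
    LogRecord(1, "a", "0", "1"),
    LogRecord(1, "c", None, "3"),
    LogRecord(2, "b", "2", "partial"),
]
recovered = recover(disk_after_crash, log, committed={1})
print(recovered)
```

After recovery, the committed write to "c" is present and the uncommitted write to "b" has been rolled back to its previous value.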
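To make the cost of this handshake concrete, the following sketch models the prior-art flow under simplifying assumptions that are not in the original text: every node receives the suspend command at time 0, `flush_seconds` is a hypothetical per-node time needed to suspend IO and flush, and `snapshot_seconds` is a hypothetical snapshot duration.

```python
from typing import Dict, Tuple


def prior_art_pause(flush_seconds: Dict[str, float],
                    snapshot_seconds: float) -> Tuple[float, Dict[str, float]]:
    """Model the suspend, report, snapshot, resume handshake.

    Returns (pause, idle): the IO pause every node experiences, and for each
    node the time it stays suspended while only waiting for slower nodes.
    """
    # The scheduling center can start only after the slowest node has reported.
    snapshot_start = max(flush_seconds.values())
    pause = snapshot_start + snapshot_seconds
    idle = {node: snapshot_start - t for node, t in flush_seconds.items()}
    return pause, idle


pause, idle = prior_art_pause({"node-a": 0.5, "node-b": 2.0, "node-c": 1.0}, 0.25)
print(pause)
print(idle)
```

In this model the fast node `node-a` is ready after 0.5 s but stays suspended until the slowest node reports, which is the waiting the description identifies as the problem.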
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a data snapshot method and a data snapshot device, which are used for at least solving the technical problems of time delay and low snapshot efficiency caused by the fact that input and output ports of nodes are required to be completely suspended when system snapshot is carried out in a distributed system in the related art.
According to an aspect of an embodiment of the present invention, there is provided a data snapshot method, including: receiving a snapshot instruction for performing data snapshot on a target system, wherein the snapshot instruction comprises snapshot time of a snapshot; controlling a plurality of nodes of the target system to close an input/output port before the snapshot time, wherein clocks of the plurality of nodes of the target system adopt the same time synchronization equipment to perform time synchronization; and under the condition that the data on the target system is in a consistency state of the snapshot time, performing data snapshot on the target system in response to the snapshot instruction.
Optionally, receiving a snapshot instruction for performing a data snapshot on the target system includes: receiving a snapshot instruction sent by a system manager at a preset time before the snapshot time, wherein the preset time is determined by the system manager according to the snapshot time of the snapshot.
Optionally, controlling the plurality of nodes of the target system to close the input/output port before the snapshot time includes: controlling the plurality of nodes of the target system to close corresponding input and output ports at closing time, wherein the snapshot instruction further comprises the closing time of the nodes of the target system to close the corresponding input and output ports, and the closing time is earlier than the snapshot time and later than the preset time; and continuously writing the data needing to be written into the hard disk in the target system into the hard disk until the data needing to be written into the hard disk is written into the hard disk.
Optionally, before controlling the plurality of nodes of the target system to close the corresponding input/output ports at the closing time, the method further includes: synchronizing the time of a plurality of nodes by the same time synchronization device, wherein the time synchronization device comprises a time server, a network time service and a network time protocol; and detecting time errors of the synchronized nodes, and determining that the time synchronization of the nodes is completed under the condition that the time errors are smaller than a preset error threshold.
Optionally, synchronizing the time of the plurality of nodes through the same time synchronization device includes: controlling each node to call its corresponding network time service and send a message to the time server, wherein each node is configured with a corresponding network time service and the message comprises a first timestamp of the node; receiving a return message processed by the time server and recording a fourth timestamp at which the return message is received, wherein the return message comprises a second timestamp recorded when the time server received the message and a third timestamp, added after the message was processed, that marks the sending time of the return message; and synchronizing the time of the node with the time of the time server according to the first, second, third, and fourth timestamps.
Optionally, in a case that the data on the target system is in a consistency state of the snapshot time, after performing data snapshot on the target system in response to the snapshot instruction, the method includes: determining the restart time of the target system for starting the input/output port of the node according to the snapshot time, wherein the restart time is after the snapshot time; and starting input/output ports of a plurality of nodes of the target system at the restart time.
Optionally, the time synchronization device is a time service system, and the time service system includes at least one of the following: an atomic time system, a coordinated universal time system, a short-wave time service system, a long-wave time service system, a low-frequency time code time service system, a Beidou time service system, network time service, television time service, and broadcast time service.
According to another aspect of the embodiments of the present invention, there is also provided a data snapshot apparatus, including: a receiving module, configured to receive a snapshot instruction for performing a data snapshot on a target system, wherein the snapshot instruction comprises the snapshot time of the snapshot; a closing module, configured to control a plurality of nodes of the target system to close their input/output ports before the snapshot time, wherein the clocks of the plurality of nodes are synchronized using the same time synchronization device; and a snapshot module, configured to perform the data snapshot on the target system in response to the snapshot instruction when the data on the target system is in a consistent state as of the snapshot time.
According to another aspect of the embodiments of the present invention, there is also provided a processor, where the processor is configured to execute a program, where the program executes the data snapshot method described in any one of the above.
According to another aspect of the embodiments of the present invention, a computer storage medium is further provided, where the computer storage medium includes a stored program, and when the program runs, a device where the computer storage medium is located is controlled to execute any one of the above data snapshot methods.
In the embodiment of the invention, a snapshot instruction for performing a data snapshot on a target system is received, wherein the snapshot instruction comprises the snapshot time; a plurality of nodes of the target system are controlled to close their input/output ports before the snapshot time, wherein the clocks of the plurality of nodes are synchronized by the same time synchronization device; and, when the data on the target system is in a consistent state as of the snapshot time, the data snapshot is performed on the target system in response to the snapshot instruction. Because the same time synchronization device is used, the different nodes close their input/output ports simultaneously and the snapshot is then taken, which achieves the purpose of taking the system snapshot accurately and efficiently. This improves the efficiency and accuracy of the system snapshot and thereby solves the technical problem in the related art that, when a system snapshot is taken in a distributed system, the input/output ports of all nodes must be fully suspended, which prolongs the pause and lowers snapshot efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a diagram of prior art recovery of data from a snapshot log;
FIG. 2 is a flow diagram of a data snapshot method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an overall architecture according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a snapshot creation flow according to an embodiment of the invention;
FIG. 5 is a schematic diagram of a data snapshot apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Moreover, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In accordance with an embodiment of the present invention, there is provided a method embodiment of a data snapshot method, it should be noted that the steps illustrated in the flowchart of the figure may be performed in a computer system such as a set of computer-executable instructions, and that while a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than here.
Fig. 2 is a flowchart of a data snapshot method according to an embodiment of the present invention, as shown in fig. 2, the method includes the following steps:
step S202, receiving a snapshot instruction for performing data snapshot on a target system, wherein the snapshot instruction comprises snapshot time of the snapshot;
step S204, controlling a plurality of nodes of the target system to close the input/output port before the snapshot time, wherein clocks of the plurality of nodes of the target system adopt the same time synchronization equipment to carry out time synchronization;
and step S206, under the condition that the data on the target system is in the consistency state of the snapshot time, responding to the snapshot instruction to perform data snapshot on the target system.
Through the above steps, a snapshot instruction for performing a data snapshot on the target system is received, wherein the snapshot instruction comprises the snapshot time of the snapshot; a plurality of nodes of the target system are controlled to close their input/output ports before the snapshot time, wherein the clocks of the plurality of nodes are synchronized by the same time synchronization device; and, when the data on the target system is in a consistent state as of the snapshot time, the data snapshot is performed on the target system in response to the snapshot instruction. Because the same time synchronization device is used, the different nodes close their input/output ports simultaneously and the snapshot is then taken, which achieves the purpose of taking the system snapshot accurately and efficiently. This improves the efficiency and accuracy of the system snapshot and solves the technical problem in the related art that, when a system snapshot is taken in a distributed system, the input/output ports of all nodes must be fully suspended, which prolongs the pause and lowers snapshot efficiency.
The execution subject of the above steps may be a controller, a processor, or another device with data processing capability, such as a storage device. Through the above steps, a data snapshot is performed on the target system. The target system may be a data system for storing or processing data, such as a distributed system comprising a plurality of nodes for storing or processing data. In the prior art, when a snapshot is taken of a distributed system, a scheduling center needs to send a suspend-IO command to the storage nodes; after each node completes the suspend-IO operation, it notifies the scheduling center that the snapshot can be created, and after the snapshot is created the storage nodes resume IO operations.
When creating a snapshot, the storage system suspends the input and output operations of the nodes by closing the IO (input/output) ports on the nodes, thereby maintaining the consistency of the snapshot data. In the conventional flow, a node that receives the snapshot instruction suspends IO operations on the local machine, writes the contents of its buffer to the storage disk, and then notifies the scheduling center that the suspension of IO is complete. Because different nodes need different amounts of time, they report to the scheduling center at different times, so the scheduling center must wait until all nodes have reported success before it can start creating the snapshot. The invention mainly solves this problem and improves the efficiency of creating snapshots in existing distributed storage systems.
The snapshot instruction may be triggered automatically by a system task, for example a scheduled snapshot task. The snapshot instruction includes the snapshot time of the current snapshot. Because the system needs to prepare for the snapshot before the snapshot time, the snapshot instruction must be received earlier than the snapshot time; that is, when a system task triggers the snapshot instruction, the instruction must be sent before the snapshot time, so that the system can prepare for the snapshot according to the snapshot time in the instruction.
Specifically, after the snapshot instruction is received, the plurality of nodes of the target system are controlled, according to the snapshot time, to close their input/output ports before the snapshot time. The nodes are synchronized by a unified time synchronization device, which ensures that their clocks are consistent, so that the input/output ports of the nodes are closed simultaneously before the snapshot time and the input/output operations of the nodes are suspended. When the snapshot time is then reached, the system can be snapshotted effectively and accurately.
The above consistency state may be understood as that the data of the target system conforms to the definition of the data in the system, that is, the data of the system is in a normal state, including that the data being written has been written, the data being migrated has been migrated, the data being deleted has been deleted, and the like. The accuracy of the snapshot data can be guaranteed only when the data of the system is in a consistency state.
The snapshot of the system itself may be taken using a snapshot technique from the prior art.
Optionally, receiving a snapshot instruction for performing a data snapshot on the target system includes: receiving a snapshot instruction sent by the system manager at a preset time before the snapshot time, wherein the preset time is determined by the system manager according to the snapshot time of the snapshot.
The preset time is before the snapshot time. The interval between the preset time and the snapshot time does not exceed the interval between the snapshot times of two adjacent snapshots, and is usually less than half of that interval. At the same time, the interval between the preset time and the snapshot time must be long enough for the system to complete the operation of simultaneously closing the input/output ports of the plurality of nodes. In this way the system has enough time to prepare for the snapshot, which improves snapshot efficiency.
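The constraints on the preset time can be expressed as a small check. This is a sketch under stated assumptions: the function name, the parameter `prepare_seconds` (how long the system needs to close all ports), and the use of seconds as the unit are illustrative and not taken from the original.

```python
def check_preset_time(preset_time: float, snapshot_time: float,
                      snapshot_period: float, prepare_seconds: float) -> bool:
    """Validate the lead time between sending the instruction and the snapshot.

    Raises ValueError if a hard constraint is violated. Returns True when the
    lead time also satisfies the "usually less than half the period" guideline.
    """
    lead = snapshot_time - preset_time
    if lead <= 0:
        raise ValueError("preset time must be before the snapshot time")
    if lead > snapshot_period:
        raise ValueError("lead time exceeds the interval between adjacent snapshots")
    if lead < prepare_seconds:
        raise ValueError("lead time is too short to close the IO ports on all nodes")
    return lead < snapshot_period / 2


# Hourly snapshots, instruction sent 10 s ahead, 5 s needed to prepare.
print(check_preset_time(preset_time=990, snapshot_time=1000,
                        snapshot_period=3600, prepare_seconds=5))
```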
Optionally, the controlling the plurality of nodes of the target system to close the input/output port before the snapshot time includes: controlling a plurality of nodes of the target system to close corresponding input and output ports at closing time, wherein the snapshot instruction further comprises the closing time of the nodes of the target system to close the corresponding input and output ports, and the closing time is earlier than the snapshot time and later than the preset time; and continuously writing the data needing to be written into the hard disk in the target system into the hard disk until the data needing to be written into the hard disk is written into the hard disk.
In one embodiment, the snapshot instruction may further include a closing time at which the input/output ports of the plurality of nodes are closed, and the plurality of nodes may close their ports simultaneously at the closing time, relying on the same time synchronization device. The closing time is later than the preset time at which the snapshot instruction is sent and earlier than the snapshot time. In other embodiments, the preset time for sending the snapshot instruction may instead be determined from the closing time. This ensures that the three actions of sending the snapshot instruction, closing the input/output ports of the plurality of nodes, and taking the snapshot are executed in strict order, which guarantees the accuracy of the snapshot.
Optionally, before controlling the plurality of nodes of the target system to close the corresponding input/output ports at the closing time, the method further includes: synchronizing the time of the plurality of nodes by the same time synchronization device, wherein the time synchronization device comprises a time server, a network time service, and a network time protocol; and detecting the time errors of the synchronized nodes, and determining that the time synchronization of the nodes is complete when the time errors are smaller than a preset error threshold.
The time synchronization device may be a time server, for example an NTP time server. By configuring an NTP network time service on each node and using one NTP time server together with the NTP network time protocol, the time of each storage node is synchronized and the time of the different nodes is unified. This reduces the time error between storage nodes and ensures that the storage nodes suspend IO at the same time before the snapshot is created.
Optionally, synchronizing the time of the plurality of nodes through the same time synchronization device includes: controlling each node to call its corresponding network time service and send a message to the time server, wherein each node is configured with a corresponding network time service and the message comprises a first timestamp of the node; receiving a return message processed by the time server and recording a fourth timestamp at which the return message is received, wherein the return message comprises a second timestamp recorded when the time server received the message and a third timestamp, added after the message was processed, that marks the sending time of the return message; and synchronizing the time of the node with the time of the time server according to the first, second, third, and fourth timestamps.
Specifically, the node sends a message to the time server carrying the node's first timestamp. On receiving the message, the time server adds a second timestamp recording the time of receipt; after processing the message, it adds a third timestamp recording the time of return and sends the message back. On receiving the returned message, the client records a fourth timestamp. From the values of these four timestamps the node can synchronize its time with the server, and the error can be controlled within 1 ms. The IO suspension period of each node should be longer than the error of the NTP protocol, which ensures that all storage nodes have IO suspended at the same moment when the snapshot is created.
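The strict ordering of the three times can be enforced when the instruction is built. The following is a minimal sketch; the class name `SnapshotInstruction` and its field names are hypothetical, chosen only to mirror the terms used in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotInstruction:
    preset_time: float    # when the system manager sends the instruction
    close_time: float     # when every node closes its IO ports
    snapshot_time: float  # when the snapshot is taken

    def __post_init__(self) -> None:
        # Sending, closing, and snapshotting must happen in strict order.
        if not (self.preset_time < self.close_time < self.snapshot_time):
            raise ValueError("expected preset_time < close_time < snapshot_time")


instruction = SnapshotInstruction(preset_time=990.0, close_time=999.0, snapshot_time=1000.0)
print(instruction)
```

Rejecting a badly ordered instruction at construction time means no node ever receives a closing time that falls after the snapshot time.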
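The check that synchronization is complete can be sketched as a comparison of each node's measured clock error against the preset threshold. The function name, the dictionary of per-node offsets, and the 1 ms threshold used in the example are assumptions for illustration; how the offsets are measured is left open here.

```python
from typing import Dict


def clocks_synchronized(node_offsets: Dict[str, float], max_error: float) -> bool:
    """Return True when every node's clock error is below the threshold.

    node_offsets maps a node name to its measured offset from the time
    server, in seconds (positive or negative).
    """
    worst = max(abs(offset) for offset in node_offsets.values())
    return worst < max_error


offsets = {"node-a": 0.0002, "node-b": -0.0004, "node-c": 0.0001}
print(clocks_synchronized(offsets, max_error=0.001))
```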
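The text does not spell out how the four timestamps are combined. The sketch below uses the standard NTP clock-offset and round-trip-delay formulas, which is an assumption about what the embodiment intends; the function and variable names are illustrative.

```python
from typing import Tuple


def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Standard NTP calculation from the four timestamps.

    t1: node clock when the request is sent
    t2: server clock when the request is received
    t3: server clock when the reply is sent
    t4: node clock when the reply is received
    Returns (offset, delay), where offset is the amount to add to the node
    clock to match the server, and delay is the network round-trip time.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Node clock is 5 s behind the server; each network leg takes 0.25 s and the
# server spends 0.5 s processing the message.
offset, delay = ntp_offset_and_delay(t1=100.0, t2=105.25, t3=105.75, t4=101.0)
print(offset, delay)
```

The offset estimate is exact only when the two network legs take equal time; any asymmetry between them appears as residual error, which is why the suspension window is sized to exceed the protocol error.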
Optionally, in a case that the data on the target system is in a consistency state of the snapshot time, after performing data snapshot on the target system in response to the snapshot instruction, the method includes: determining the restart time of the target system for starting the input/output port of the node according to the snapshot time, wherein the restart time is after the snapshot time; and starting input and output ports of a plurality of nodes of the target system at the restarting time.
After the snapshot is finished, the input/output ports of the plurality of nodes are opened together at the restart time, so that the input/output operations of the plurality of nodes resume.
Optionally, the time synchronization device is a time service system, and the time service system includes at least one of the following: an atomic time system, a coordinated universal time system, a short-wave time service system, a long-wave time service system, a low-frequency time code time service system, a Beidou time service system, network time service, television time service, and broadcast time service.
It should be noted that the present application also provides an alternative implementation, and the details of the implementation are described below.
This embodiment provides a method for ensuring snapshot consistency, which aims to solve the problem that system efficiency is reduced in order to maintain data consistency when a snapshot is created in existing distributed storage. When an existing storage system creates a snapshot, it maintains the consistency of the snapshot data by suspending IO operations on the nodes: a node that receives the snapshot instruction suspends IO operations on the local machine, writes the contents of its buffer to the storage disk, and then notifies the scheduling center that the suspension of IO is complete. Because different nodes need different amounts of time, they report to the scheduling center at different times, so the scheduling center must wait until all nodes have reported success before it can start creating the snapshot. This embodiment mainly solves this problem and improves the efficiency of creating snapshots in existing distributed storage systems.
The technical scheme of the embodiment has the following key points:
notifying the nodes in advance to suspend IO;
suspending the IO operations on the nodes to ensure consistency of the snapshot;
unifying the time of the storage nodes by using a time synchronization protocol or other means;
adjusting the IO suspension time to reduce the effect of clock errors, thereby improving node performance and snapshot efficiency.
Fig. 3 is a schematic diagram of the overall architecture according to an embodiment of the present invention. As shown in Fig. 3, the scheduling center sends the snapshot creation command to the nodes in advance. After receiving the command, the nodes uniformly enter the suspended-IO state before the appointed time point: they suspend data write operations, write the data that the storage system still has to write to the hard disk as quickly as possible, and stop accepting all new writes, so that the data on the storage system is in a consistent state. After a certain time, the storage system resumes work.
A distributed system differs from a centralized system in that its nodes do not keep time with the same set of hardware; differences in hardware, temperature, age, and so on cause the node clocks to drift apart. For example, when the nodes are told to create a snapshot at ten o'clock, the clock of a node B may be off. The difference is usually small, but in systems with stricter requirements the error can make the snapshot data inconsistent: some records hold the data as of the snapshot time, while others have already been overwritten by new data.
One solution is to use the NTP protocol to unify the time of the different nodes and so reduce the time error between storage nodes. An NTP service can be configured on each node, and an NTP time server is used to synchronize the time of each storage node. A storage node sends the server a message carrying the node's own timestamp. On receiving it, the server adds a receive timestamp, processes the message, adds a send timestamp, and returns the message. The node records a further timestamp when it receives the returned message, and from the values of these four timestamps it can synchronize its own time with the server's, which can keep the error within 1 ms. The IO suspension time of each node should exceed the error of the NTP protocol, which ensures that all storage nodes are in the suspended-IO state at the same moment when the snapshot is created.
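As a rough illustration of how the four timestamps yield a clock correction, the sketch below applies the standard NTP clock-offset and round-trip-delay formulas. The function name `ntp_offset_and_delay` and the sample values are hypothetical and do not come from the patent.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Return (offset, delay) in the same unit as the inputs.

    t1: node send time (node clock)
    t2: server receive time (server clock)
    t3: server send time (server clock)
    t4: node receive time (node clock)
    """
    # Offset: how far the node clock lags (+) or leads (-) the server clock,
    # assuming the network delay is the same in both directions.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    # Delay: round-trip network time, excluding server processing time.
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical sample in milliseconds: the node clock is 5 ms behind the
# server, each network leg takes 2 ms, and the server needs 1 ms to process.
offset, delay = ntp_offset_and_delay(t1=1000, t2=1007, t3=1008, t4=1005)
```

With these sample values the node would add 5 ms to its clock. The symmetric-delay assumption is what limits the achievable accuracy, which is why the text asks for an error range in addition to the time itself.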
Another method is to implement an NTP-like protocol in-house, or to use a higher-precision time service system, such as a cesium atomic clock, GPS, or the Beidou time service system.
In summary, these timing systems need to output two results: a relatively accurate time and a time error range.
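One way these two outputs could be used is to derive the closing time and the restart time from the snapshot time. The sketch below is an assumption about how such a calculation might look, not the patent's own formula: `plan_window`, `SnapshotWindow`, and the `flush_budget_ms` parameter are hypothetical names, and snapshot creation itself is treated as instantaneous.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotWindow:
    close_time: int     # when each node suspends IO, by its own synchronized clock
    snapshot_time: int  # agreed snapshot instant
    restart_time: int   # when each node resumes IO


def plan_window(snapshot_time_ms: int, max_clock_error_ms: int,
                flush_budget_ms: int) -> SnapshotWindow:
    """Plan the IO suspension window, with all values in milliseconds.

    max_clock_error_ms bounds how far any node clock may deviate from true
    time; flush_budget_ms is an assumed allowance for flushing buffered writes.
    """
    if max_clock_error_ms < 0 or flush_budget_ms < 0:
        raise ValueError("error bound and flush budget must be non-negative")
    # Even the slowest clock must have suspended IO and flushed its buffers
    # before the true snapshot instant.
    close_time = snapshot_time_ms - max_clock_error_ms - flush_budget_ms
    # Even the fastest clock must not resume IO before the true snapshot instant.
    restart_time = snapshot_time_ms + max_clock_error_ms
    return SnapshotWindow(close_time, snapshot_time_ms, restart_time)


# Hypothetical values: snapshot at 10:00:00.000, 1 ms clock error, 50 ms flush.
window = plan_window(snapshot_time_ms=36_000_000, max_clock_error_ms=1,
                     flush_budget_ms=50)
```

A smaller error range gives a shorter suspension window, which is the sense in which a more precise time service improves node performance.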
Fig. 4 is a schematic diagram of a snapshot creation process according to an embodiment of the present invention. As shown in Fig. 4, the scheduling center (manager) first sends a command to create a snapshot at a given time to all storage nodes (FrontEnd). When the appointed time point is reached, all FrontEnd storage nodes suspend IO in unison and, once the suspended-IO state is set, return a message to the manager. If the nodes report success, the snapshot can be created; otherwise, snapshot creation is cancelled. When the snapshot is created, information about the snapshot, such as its version number, is written to the metadata server (MDS), and an undo or redo operation is then performed according to the log records to ensure the crash consistency of the snapshot.
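The create-or-cancel decision in this flow can be sketched with in-memory objects. This is a minimal illustration under stated assumptions: `StorageNode`, `create_snapshot`, and the dictionary standing in for the MDS are hypothetical, and the timed trigger, networking, and undo/redo log replay are left out.

```python
class StorageNode:
    """Hypothetical stand-in for a FrontEnd storage node."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.io_suspended = False
        self.pending_writes = []  # buffered data not yet on disk
        self.disk = []            # data already persisted

    def suspend_io(self):
        """Suspend IO, flush buffered writes, and report success."""
        if not self.healthy:
            return False
        self.io_suspended = True
        self.disk.extend(self.pending_writes)
        self.pending_writes.clear()
        return True

    def resume_io(self):
        self.io_suspended = False


def create_snapshot(nodes, version, metadata_store):
    """Create a snapshot only if every node confirms that IO is suspended."""
    try:
        acks = [node.suspend_io() for node in nodes]
        if not all(acks):
            return False  # at least one node failed, so the snapshot is cancelled
        # Record the snapshot under its version number; storing the disk
        # contents here is a simplification of writing metadata to the MDS.
        metadata_store[version] = {node.name: list(node.disk) for node in nodes}
        return True
    finally:
        # IO is resumed whether the snapshot was created or cancelled.
        for node in nodes:
            node.resume_io()


nodes = [StorageNode("fe-1"), StorageNode("fe-2")]
nodes[0].pending_writes.append("block-a")
mds = {}
created = create_snapshot(nodes, version=1, metadata_store=mds)
cancelled = create_snapshot([StorageNode("fe-3", healthy=False)], 2, mds)
```

In this sketch the first call creates version 1 and the second call is cancelled because the only node fails to suspend IO, so nothing is recorded for version 2.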
Fig. 5 is a schematic diagram of a data snapshot apparatus according to an embodiment of the present invention, and as shown in fig. 5, according to another aspect of the embodiment of the present invention, there is also provided a data snapshot apparatus, including: a receiving module 52, a closing module 54 and a snapshot module 56, which are described in detail below.
A receiving module 52, configured to receive a snapshot instruction for performing a data snapshot on a target system, where the snapshot instruction includes a snapshot time of the snapshot; a closing module 54, connected to the receiving module 52, configured to control multiple nodes of the target system to close an input/output port before the snapshot time, where clocks of the multiple nodes of the target system use the same time synchronization device for time synchronization; and a snapshot module 56, connected to the closing module 54, configured to perform data snapshot on the target system in response to the snapshot instruction when the data on the target system is in a consistency state of the snapshot time.
With this apparatus, a snapshot instruction for performing a data snapshot on the target system is received, the snapshot instruction including the snapshot time; a plurality of nodes of the target system are controlled to close their input/output ports before the snapshot time, the clocks of the nodes being synchronized by the same time synchronization device; and the data snapshot is performed on the target system in response to the snapshot instruction when the data on the target system is in a consistent state as of the snapshot time. Because the nodes use the same time synchronization device, they close their input/output ports at the same time before the snapshot is taken. This makes the system snapshot accurate and efficient, and solves the technical problem in the related art that, when a system snapshot is taken in a distributed system, the snapshot cannot proceed until the input/output ports of all nodes have been suspended, which prolongs the operation and lowers snapshot efficiency.
According to another aspect of the embodiments of the present invention, there is further provided a processor, where the processor is configured to execute a program, where the program executes the data snapshot method in any one of the above descriptions.
According to another aspect of the embodiments of the present invention, there is also provided a computer storage medium, where the computer storage medium includes a stored program, and when the program runs, the apparatus on which the computer storage medium is located is controlled to execute the data snapshot method of any one of the above.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (6)

1. A method for data snapshot, comprising:
receiving a snapshot instruction for performing data snapshot on a target system, wherein the snapshot instruction comprises snapshot time of a snapshot;
controlling a plurality of nodes of the target system to close an input/output port before the snapshot time, wherein clocks of the plurality of nodes of the target system adopt the same time synchronization equipment to perform time synchronization;
under the condition that the data on the target system is in a consistency state of the snapshot time, responding to the snapshot instruction to carry out data snapshot on the target system;
the step of receiving a snapshot instruction for performing data snapshot on the target system comprises the following steps: receiving a snapshot instruction sent by a system manager at a preset time before the snapshot time, wherein the preset time is determined by the system manager according to the snapshot time of the snapshot;
wherein controlling the plurality of nodes of the target system to turn off an input output port before the snapshot time comprises: controlling a plurality of nodes of the target system to close corresponding input and output ports at closing time, wherein the snapshot instruction further comprises the closing time for closing the corresponding input and output ports by the nodes of the target system, and the closing time is earlier than the snapshot time and later than the preset time; continuously writing the data needing to be written into the hard disk in the target system into the hard disk until the data needing to be written into the hard disk is written into the hard disk;
before controlling the plurality of nodes of the target system to close the corresponding input/output ports at the closing time, the method further comprises: synchronizing the time of the plurality of nodes through the same time synchronization device, wherein the time synchronization device comprises a time server, a network time service and a network time protocol; detecting time errors of the synchronized nodes, and determining that the time synchronization of the nodes is completed under the condition that the time errors are smaller than a preset error threshold;
under the condition that the data on the target system is in a consistency state of the snapshot time, after the data snapshot is carried out on the target system in response to the snapshot instruction, the method comprises the following steps:
determining the restart time of the target system for starting the input/output port of the node according to the snapshot time, wherein the restart time is after the snapshot time;
and starting input/output ports of a plurality of nodes of the target system at the restart time.
2. The method of claim 1, wherein synchronizing the time of the plurality of nodes via the same time synchronization device comprises:
controlling the node to call a corresponding network time service and send a message to the time server, wherein each node is configured with the corresponding network time service, and the message comprises a first timestamp of the node;
receiving a return message processed by the time server, and marking a fourth timestamp of the received return message, wherein the return message comprises a second timestamp marked when the time server receives the message, and a third timestamp added after the message is processed and used for marking the sending time of the return message;
synchronizing the time of the node with the time of the time server according to the first time stamp, the second time stamp, the third time stamp and the fourth time stamp.
3. The method according to claim 1, wherein the time synchronization device is a time service system, the time service system comprising at least one of:
an atomic time system, a coordinated universal time system, a short-wave time service system, a long-wave time service system, a low-frequency time code time service system, a Beidou time service system, network time service, television time service, and broadcast time service.
4. A data snapshot apparatus, comprising:
a receiving module, configured to receive a snapshot instruction for performing a data snapshot on a target system, wherein the snapshot instruction comprises a snapshot time of the snapshot;
a closing module, configured to control multiple nodes of the target system to close an input/output port before the snapshot time, where clocks of the multiple nodes of the target system are synchronized with time using a same time synchronization device;
the snapshot module is used for responding to the snapshot instruction to carry out data snapshot on the target system under the condition that the data on the target system is in the consistency state of the snapshot time;
wherein the receiving module comprises: a receiving submodule, configured to receive a snapshot instruction sent by a system manager at a preset time before the snapshot time, wherein the preset time is determined by the system manager according to the snapshot time of the snapshot;
wherein the shutdown module comprises: the control submodule is used for controlling the plurality of nodes of the target system to close corresponding input and output ports at closing time, wherein the snapshot instruction further comprises the closing time for closing the corresponding input and output ports by the nodes of the target system, and the closing time is earlier than the snapshot time and later than the preset time; the writing submodule is used for continuously writing the data needing to be written into the hard disk in the target system into the hard disk until the data needing to be written into the hard disk is written into the hard disk;
a synchronization module, configured to synchronize, by using the same time synchronization device, times of multiple nodes before controlling the multiple nodes of the target system to close corresponding input/output ports at the closing time, where the time synchronization device includes a time server, a network time service, and a network time protocol; the processing module is used for detecting time errors of the synchronized nodes and determining that the time synchronization of the nodes is finished under the condition that the time errors are smaller than a preset error threshold;
wherein, in the case that the data on the target system is in a consistency state of the snapshot time, after performing data snapshot on the target system in response to the snapshot instruction, the apparatus is further configured to: determining the restart time of the target system for starting the input/output port of the node according to the snapshot time, wherein the restart time is after the snapshot time; and starting input and output ports of a plurality of nodes of the target system at the restart time.
5. A processor, configured to run a program, wherein the program when running performs the data snapshot method of any one of claims 1 to 3.
6. A computer storage medium, comprising a stored program, wherein when the program runs, a device on which the computer storage medium is located is controlled to execute the data snapshot method according to any one of claims 1 to 3.
CN202111447801.9A 2021-12-01 2021-12-01 Data snapshot method and device Active CN113868027B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111447801.9A CN113868027B (en) 2021-12-01 2021-12-01 Data snapshot method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111447801.9A CN113868027B (en) 2021-12-01 2021-12-01 Data snapshot method and device

Publications (2)

Publication Number Publication Date
CN113868027A CN113868027A (en) 2021-12-31
CN113868027B true CN113868027B (en) 2022-12-23

Family

ID=78985490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111447801.9A Active CN113868027B (en) 2021-12-01 2021-12-01 Data snapshot method and device

Country Status (1)

Country Link
CN (1) CN113868027B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102495982A (en) * 2011-11-30 2012-06-13 成都七巧软件有限责任公司 Process threading-based copy-protection system and copy-protection storage medium
WO2020069654A1 (en) * 2018-10-01 2020-04-09 Huawei Technologies Co., Ltd. Method of handling snapshot creation request and storage device thereof

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7475098B2 (en) * 2002-03-19 2009-01-06 Network Appliance, Inc. System and method for managing a plurality of snapshots
CN106919471B (en) * 2015-12-25 2020-03-20 中国电信股份有限公司 Method and system for snapshot creation
CN107402848A (en) * 2017-07-31 2017-11-28 郑州云海信息技术有限公司 A kind of implementation method of snapshot data uniformity
CN111506453B (en) * 2019-01-31 2023-06-16 阿里巴巴集团控股有限公司 Disk snapshot creation method, device, system and storage medium
CN111522689B (en) * 2019-02-01 2022-04-29 阿里巴巴集团控股有限公司 Global snapshot method, device, electronic equipment and computer-readable storage medium
CN111984472B (en) * 2020-08-27 2022-08-02 苏州浪潮智能科技有限公司 Data snapshot method, device and related equipment

Also Published As

Publication number Publication date
CN113868027A (en) 2021-12-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant