CN116991636B - Data incremental backup method, system and storage medium based on distributed storage - Google Patents

Data incremental backup method, system and storage medium based on distributed storage

Publication number
CN116991636B
Authority
CN
China
Prior art keywords
data
storage
incremental
determining
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311248908.XA
Other languages
Chinese (zh)
Other versions
CN116991636A (en
Inventor
周高登
陈立军
钟楷锋
黄轩辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Barda Technology Co ltd
Original Assignee
Wuhan Barda Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Barda Technology Co ltd
Priority to CN202311248908.XA
Publication of CN116991636A
Application granted
Publication of CN116991636B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1451 Management of the data involved in backup or backup restore by selection of backup contents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1461 Backup scheduling policy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data incremental backup method, system and storage medium based on distributed storage. The method comprises: acquiring incremental data to be backed up in a target virtual machine; determining the data loss risk rate of each preset storage device in the distributed storage system; determining, based on the data loss risk rate, a plurality of first storage devices matched with the incremental data among the plurality of preset storage devices; and backing up the incremental data to the plurality of first storage devices. Because the first storage devices are screened out of the preset storage devices of the distributed storage system according to the data loss risk rate before the incremental data is backed up to them, the incremental data is not backed up to preset storage devices with a larger data loss risk, which improves the data security of the incremental backup.

Description

Data incremental backup method, system and storage medium based on distributed storage
Technical Field
The application relates to the technical field of incremental backup, in particular to a data incremental backup method, system and storage medium based on distributed storage.
Background
The distributed storage system includes a plurality of independent storage devices arranged in a distributed fashion. When performing incremental backup on a virtual machine, incremental data in the virtual machine is generally split into a plurality of data blocks, and stored in different storage devices in a distributed manner.
However, due to the difference of different storage devices in the distributed storage system, the risk of data loss of part of the storage devices is larger, and the data security of incremental backup is reduced.
Disclosure of Invention
The embodiment of the application provides a data incremental backup method, a system and a storage medium based on distributed storage, aiming at improving the data security of incremental backup.
In one aspect, the present application provides a method for incremental backup of data based on distributed storage, the method comprising:
acquiring incremental data to be backed up in a target virtual machine;
determining the data loss risk rate of each preset storage device in the distributed storage system;
determining a plurality of first storage devices matched with the incremental data in a plurality of preset storage devices based on the data loss risk rate;
and backing up the incremental data to the plurality of first storage devices.
In some embodiments, the determining the data loss risk rate of each preset storage device in the distributed storage system includes:
acquiring the historical data loss times of each preset storage device;
acquiring accumulated operation time of each preset storage device;
determining the historical data loss frequency of each preset storage device based on the historical data loss times and the accumulated operation time length;
and determining the data loss risk rate of each preset storage device based on the historical data loss frequency.
In some embodiments, the determining the data loss risk rate of each of the preset storage devices based on the historical data loss frequency includes:
acquiring equipment performance parameters of each preset storage equipment;
determining a device performance score of each preset storage device based on the device performance parameters;
determining a frequency correction coefficient of each preset storage device based on the device performance score;
and correcting the historical data loss frequency by adopting the frequency correction coefficient to obtain the data loss risk rate of each preset storage device.
In some embodiments, the determining the frequency correction coefficient of each preset storage device based on the device performance score includes:
acquiring the used storage capacity of each preset storage device;
acquiring the ratio between the used storage capacity of each preset storage device and the maximum storage capacity of the preset storage device;
determining a storage capacity score of each preset storage device based on the used storage capacity and the ratio;
and determining the frequency correction coefficient of each preset storage device based on the storage capacity score and the device performance score.
In some embodiments, the determining, based on the data loss risk rate, a plurality of first storage devices matching the incremental data from a plurality of preset storage devices includes:
acquiring the occupied storage capacity of the incremental data;
determining a plurality of second storage devices in a plurality of preset storage devices based on the occupied storage capacity;
and determining a plurality of first storage devices matched with the incremental data in a plurality of second storage devices based on the data loss risk rate.
In some embodiments, the determining, based on the data loss risk rate, a plurality of first storage devices that match the incremental data among a plurality of second storage devices includes:
determining a data loss risk level of each second storage device based on the data loss risk rate;
determining the data importance level of the incremental data;
determining a plurality of third storage devices of data loss risk levels matched with the data importance levels in the plurality of second storage devices;
and determining a plurality of first storage devices in the plurality of third storage devices.
In some embodiments, the determining the data importance level of the incremental data includes:
obtaining a virtual machine level of the target virtual machine;
acquiring historical use frequency of the target virtual machine in a preset historical time period;
acquiring the incremental backup frequency of the target virtual machine;
determining the data importance of the incremental data based on the virtual machine level, the historical use frequency and the incremental backup frequency;
and determining the data importance level of the incremental data based on the data importance.
In some embodiments, the determining, among the plurality of third storage devices, the plurality of first storage devices includes:
acquiring the number of data blocks into which the incremental data is divided when the incremental data is backed up;
sorting the plurality of third storage devices from large to small according to the percentage of remaining storage space to obtain a device ranking;
and taking the third storage devices that rank highest in the device ranking, in a number matching the number of data blocks, as the plurality of first storage devices.
In another aspect, the present application further provides a distributed storage-based data incremental backup system, where the distributed storage-based data incremental backup system includes:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the steps in any of the distributed storage based incremental backup methods of data.
In another aspect, the present application also provides a computer readable storage medium having stored thereon a computer program to be loaded by a processor to perform the steps of any of the distributed storage based incremental backup methods of data.
In the present application, incremental data to be backed up in a target virtual machine is acquired; the data loss risk rate of each preset storage device in the distributed storage system is determined; a plurality of first storage devices matched with the incremental data are determined among the plurality of preset storage devices based on the data loss risk rate; and the incremental data is backed up to the plurality of first storage devices. Because the first storage devices are screened out of the preset storage devices of the distributed storage system according to the data loss risk rate before the incremental data is backed up to them, the incremental data is not backed up to preset storage devices with a larger data loss risk, which improves the data security of the incremental backup.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of one embodiment of a distributed storage based incremental backup method for data provided in an embodiment of the present application;
FIG. 2 is a flow chart of another embodiment of a distributed storage based incremental backup method for data provided in an embodiment of the present application;
FIG. 3 is a flow chart of yet another embodiment of a distributed storage based incremental backup method provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of one embodiment of a distributed storage based incremental backup system provided in an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
In the description of the present application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more of the described features. In the description of the present application, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In this application, the term "exemplary" is used to mean "serving as an example, instance, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for purposes of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known structures and processes have not been shown in detail to avoid obscuring the description of the present application with unnecessary detail. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
The embodiment of the application provides a data incremental backup method, a system and a storage medium based on distributed storage, and the method, the system and the storage medium are respectively described in detail below.
Referring to FIG. 1, FIG. 1 is a flowchart illustrating an embodiment of a distributed storage-based incremental backup method according to an embodiment of the present application. The incremental backup method based on the distributed storage can comprise the following steps:
101. Acquiring incremental data to be backed up in a target virtual machine;
In this embodiment, the target virtual machine is a virtual machine that needs to be backed up. Backing up the target virtual machine allows its data to be recovered from the backup after the target virtual machine fails, which ensures that the target virtual machine can continue to operate normally. When backing up the target virtual machine, a full backup of all data in the target virtual machine is performed first, and incremental backups are then performed periodically. The incremental data is the data that needs to be backed up during an incremental backup of the target virtual machine, and may be generated from the data that has changed in the target virtual machine since its last backup.
102. Determining the data loss risk rate of each preset storage device in the distributed storage system;
In this embodiment, the distributed storage system includes a plurality of preset storage devices, each of which may be used to store backup data. Each preset storage device in the distributed storage system is generally an existing storage device, and different existing storage devices often differ in hardware and software, so the data loss risk differs from one preset storage device to another. The data loss risk rate of each preset storage device in the distributed storage system therefore needs to be calculated.
In some embodiments of the present application, determining the data loss risk rate of each preset storage device in the distributed storage system may include: acquiring the historical data loss times of each preset storage device; summing the historical data loss times of all preset storage devices in the distributed storage system to obtain the total historical data loss times; and determining the proportion of the total historical data loss times accounted for by each preset storage device, and taking that proportion as the data loss risk rate of the preset storage device.
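The share-of-total calculation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the device identifiers and the handling of the all-zero case are assumptions made here.

```python
def loss_share_risk_rates(loss_counts: dict[str, int]) -> dict[str, float]:
    """Risk rate of a device = its historical loss count / total loss count."""
    total = sum(loss_counts.values())
    if total == 0:
        # Assumption: if no device has ever lost data, every rate is 0.
        return {dev: 0.0 for dev in loss_counts}
    return {dev: count / total for dev, count in loss_counts.items()}


# Hypothetical loss counts for three preset storage devices.
rates = loss_share_risk_rates({"dev-a": 1, "dev-b": 3, "dev-c": 0})
```

Note that rates computed this way are relative to the other devices in the same system and sum to 1 whenever at least one loss has been recorded.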
103. Determining a plurality of first storage devices matched with the incremental data in a plurality of preset storage devices based on the data loss risk rate;
In some embodiments of the present application, determining, based on the data loss risk rate, a plurality of first storage devices matched with the incremental data among the plurality of preset storage devices may include: determining the data importance of the incremental data, where the data importance is calculated as described for the embodiment shown in FIG. 3; determining a data loss risk rate threshold corresponding to the data importance, where the correspondence between the data loss risk rate threshold and the data importance is preset and, in general, the data loss risk rate threshold is positively related to the data importance; and taking the preset storage devices whose data loss risk rate is smaller than the data loss risk rate threshold as the first storage devices.
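The final filtering step of this embodiment can be sketched as below. The threshold is passed in directly because the text only says that its correspondence with the data importance is preset; the function name and the sample values are assumptions.

```python
def select_by_threshold(risk_rates: dict[str, float], threshold: float) -> list[str]:
    """Keep the devices whose data loss risk rate is strictly below the threshold."""
    return [dev for dev, rate in risk_rates.items() if rate < threshold]


# Hypothetical risk rates and a hypothetical preset threshold.
first_devices = select_by_threshold({"dev-a": 0.1, "dev-b": 0.5, "dev-c": 0.3}, 0.3)
```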
104. The incremental data is backed up to a plurality of first storage devices.
In this embodiment, the set of preset storage devices in the distributed storage system forms a network disk, and the storage spaces of different preset storage devices form different partitions of that network disk. The partitions to which the incremental data should be backed up can therefore be selected, so that the incremental data is backed up in a distributed manner to specific storage devices. Backing up the incremental data to the plurality of first storage devices may include: dividing the incremental data into a plurality of data blocks; and storing the plurality of data blocks in a distributed manner in the partitions where the plurality of first storage devices are located.
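A minimal sketch of the split-and-distribute step follows. The text does not say how blocks are assigned to partitions, so the round-robin placement, the fixed block size and all names here are assumptions for illustration only.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Divide the incremental data into fixed-size blocks (the last may be shorter)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def assign_blocks(blocks: list[bytes], partitions: list[str]) -> dict[str, list[bytes]]:
    """Assumed placement policy: hand blocks to partitions in round-robin order."""
    placement: dict[str, list[bytes]] = {p: [] for p in partitions}
    for index, block in enumerate(blocks):
        placement[partitions[index % len(partitions)]].append(block)
    return placement


blocks = split_into_blocks(b"abcdefghij", 4)
placement = assign_blocks(blocks, ["part-1", "part-2"])
```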
In the data incremental backup method based on distributed storage described above, the incremental data is acquired, the plurality of first storage devices matched with the incremental data are screened out of the plurality of preset storage devices of the distributed storage system based on the data loss risk rate, and the incremental data is then backed up to the plurality of first storage devices. The incremental data is thus not backed up to preset storage devices with a larger data loss risk, which improves the data security of the incremental backup.
As shown in fig. 2, on the basis of the embodiment shown in fig. 1, determining a data loss risk rate of each preset storage device in the distributed storage system may include:
201. Acquiring the historical data loss times of each preset storage device;
In this embodiment, the historical data loss times represent the cumulative number of times the preset storage device has lost stored data between the time point of its first run and the current time point. The historical data loss times can be extracted from the running log of the preset storage device: when the preset storage device detects a data loss, a corresponding data loss record is automatically generated and stored in its running log.
202. Acquiring the accumulated operation time length of each preset storage device;
In this embodiment, the accumulated operation duration represents the time difference between the current time point and the time point of the first run of the preset storage device. The accumulated operation duration can also be extracted from the running log of the preset storage device. It should be noted that, since each preset storage device in the distributed storage system is generally an existing storage device and different existing storage devices often first ran at different time points, the accumulated operation duration may also differ from one preset storage device to another.
203. Determining the historical data loss frequency of each preset storage device based on the historical data loss times and the accumulated operation duration;
In this embodiment, for each preset storage device, the ratio of the historical data loss times of the preset storage device to its accumulated operation duration may be used as the historical data loss frequency of the preset storage device.
204. And determining the data loss risk rate of each preset storage device based on the historical data loss frequency.
In some embodiments of the present application, the historical data loss frequency of the preset storage device may be directly used as the data loss risk rate of the preset storage device.
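Steps 201 to 204, in the variant where the historical data loss frequency is used directly as the risk rate, reduce to a single division. The sketch below assumes the duration is measured in hours; the unit, the function name and the sample figures are not from the text.

```python
def historical_loss_frequency(loss_count: int, runtime_hours: float) -> float:
    """Historical data loss frequency = loss count / accumulated operation duration."""
    if runtime_hours <= 0:
        raise ValueError("accumulated operation duration must be positive")
    return loss_count / runtime_hours


# Hypothetical device: 4 recorded losses over 8000 hours of operation.
risk_rate = historical_loss_frequency(4, 8000.0)
```

Unlike the share-of-total rate, this value depends only on the device itself, so a device that has run longer with the same number of losses gets a lower rate.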
In some embodiments of the present application, determining the data loss risk rate of each preset storage device based on the historical data loss frequency may include: acquiring device performance parameters of each preset storage device, where the device performance parameters may include device hardware performance parameters (for example, the data read-write capability value of the preset storage device) and device software performance parameters (for example, the network bandwidth value of the preset storage device); determining the device performance score of each preset storage device based on the device performance parameters; determining a frequency correction coefficient of each preset storage device based on the device performance score, where, in general, the frequency correction coefficient of a preset storage device is positively correlated with its device performance score; and correcting the historical data loss frequency with the frequency correction coefficient to obtain the data loss risk rate of each preset storage device, where the product of the frequency correction coefficient and the historical data loss frequency can be used as the data loss risk rate of the preset storage device, and the larger the frequency correction coefficient, the smaller the corrected historical data loss frequency. It will be appreciated that the higher the device performance of a preset storage device, the lower the risk that it will lose data.
In a further embodiment, taking as an example the case where the device performance parameters include a device hardware performance parameter and a device software performance parameter, determining a device performance score for each preset storage device based on the device performance parameters may include: normalizing the device hardware performance parameters of the plurality of preset storage devices to obtain normalized device hardware performance parameters; normalizing the device software performance parameters of the plurality of preset storage devices to obtain normalized device software performance parameters; summing the normalized device hardware performance parameter and the normalized device software performance parameter of the same preset storage device to obtain the normalized device performance parameter of that preset storage device; and normalizing the normalized device performance parameters of the plurality of preset storage devices again, the result of which is the device performance score of each preset storage device.
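The normalize, sum, normalize-again procedure can be sketched as follows. The text only says "data normalization processing"; min-max scaling to [0, 1] is an assumption made here, as are the function names and sample values.

```python
def min_max(values: list[float]) -> list[float]:
    """Assumed normalization: min-max scaling to [0, 1]; all-equal inputs map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def performance_scores(hardware: list[float], software: list[float]) -> list[float]:
    """Normalize each parameter across devices, sum per device, normalize again."""
    hw = min_max(hardware)
    sw = min_max(software)
    combined = [h + s for h, s in zip(hw, sw)]
    return min_max(combined)


# Hypothetical read-write capability and network bandwidth values for three devices.
scores = performance_scores([100.0, 200.0, 300.0], [10.0, 30.0, 20.0])
```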
In a further embodiment, determining the frequency correction coefficient for each preset storage device based on the device performance score may include: acquiring the used storage capacity of each preset storage device; acquiring the ratio between the used storage capacity of each preset storage device and the maximum storage capacity of that preset storage device, that is, the proportion of the device's capacity that is already in use; determining a storage capacity score of each preset storage device based on the used storage capacity and the ratio; and determining the frequency correction coefficient of each preset storage device based on the storage capacity score and the device performance score, where, in general, the frequency correction coefficient of a preset storage device is inversely related to its storage capacity score. It will be appreciated that the more data a preset storage device stores, the greater the risk that it will lose data.
In a further embodiment, determining the storage capacity score for each preset storage device based on the used storage capacity and the ratio may include: normalizing the used storage capacities of the plurality of preset storage devices to obtain normalized used storage capacities; normalizing the ratios of the plurality of preset storage devices to obtain normalized ratios; and performing a weighted summation of the normalized used storage capacity and the normalized ratio of the same preset storage device to obtain the storage capacity score of that preset storage device, thereby obtaining the storage capacity score of each preset storage device. In the weighted summation, the weight of the normalized used storage capacity is generally smaller than the weight of the normalized ratio.
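The storage capacity score can be sketched as a weighted sum of two normalized quantities. Min-max scaling and the weights 0.4 and 0.6 are assumptions; the text only requires the weight of the used capacity to be the smaller of the two.

```python
def min_max(values: list[float]) -> list[float]:
    """Assumed normalization: min-max scaling to [0, 1]; all-equal inputs map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def storage_capacity_scores(
    used: list[float],
    maximum: list[float],
    w_used: float = 0.4,   # hypothetical weight, smaller than w_ratio as the text suggests
    w_ratio: float = 0.6,  # hypothetical weight
) -> list[float]:
    """Weighted sum of normalized used capacity and normalized used/maximum ratio."""
    ratios = [u / m for u, m in zip(used, maximum)]
    norm_used = min_max(used)
    norm_ratio = min_max(ratios)
    return [w_used * a + w_ratio * b for a, b in zip(norm_used, norm_ratio)]


# Hypothetical used and maximum capacities (same unit) for three devices.
scores = storage_capacity_scores([100.0, 400.0, 250.0], [1000.0, 500.0, 1000.0])
```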
It can be seen that the scheme in this embodiment determines a more accurate data loss risk rate, which further improves the data security of the incremental backup.
As shown in fig. 3, based on the embodiment shown in fig. 1 or fig. 2, determining, based on the data loss risk rate, a plurality of first storage devices matching with the incremental data from a plurality of preset storage devices may include:
301. Acquiring the occupied storage capacity of the incremental data;
In this embodiment, the occupied storage capacity of the incremental data is the data size of the incremental data.
302. Determining a plurality of second storage devices from a plurality of preset storage devices based on the occupied storage capacity;
In some embodiments of the present application, determining, based on the occupied storage capacity, a plurality of second storage devices from a plurality of preset storage devices may include: determining, based on the occupied storage capacity, the redundant storage capacity that should remain in a preset storage device, where the redundant storage capacity is greater than or equal to the occupied storage capacity; for example, the redundant storage capacity can be equal to the product of the occupied storage capacity and a preset multiple, the preset multiple being a positive integer; and determining, among the plurality of preset storage devices, the preset storage devices whose remaining storage capacity is larger than the redundant storage capacity, and taking them as the plurality of second storage devices, so as to facilitate storage of the incremental data.
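The capacity pre-filter of step 302 can be sketched as follows, using the example rule from the text (redundant capacity = occupied capacity × preset multiple). The function name, the multiple of 2 and the sample capacities are assumptions.

```python
def select_second_devices(
    remaining: dict[str, int], occupied: int, multiple: int = 2
) -> list[str]:
    """Keep devices whose remaining capacity exceeds occupied * multiple."""
    if multiple < 1:
        raise ValueError("the preset multiple must be a positive integer")
    redundant = occupied * multiple
    return [dev for dev, free in remaining.items() if free > redundant]


# Hypothetical remaining capacities (in MB) and a 100 MB increment.
second_devices = select_second_devices({"dev-a": 500, "dev-b": 200, "dev-c": 201}, 100)
```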
303. And determining a plurality of first storage devices matched with the incremental data in the plurality of second storage devices based on the data loss risk rate.
In some embodiments of the present application, determining, from among the plurality of second storage devices, a plurality of first storage devices that match the incremental data based on the data loss risk rate may include: determining a data loss risk level of each second storage device based on the data loss risk rate, where the data loss risk level may include a first level, a second level, a third level and the like; determining the data importance level of the incremental data, where the data importance level may include a first level, a second level, a third level and the like; determining, among the plurality of second storage devices, a plurality of third storage devices whose data loss risk level matches the data importance level, where a data importance level and a data loss risk level are judged to match when they are the same (for example, a first-level data importance level matches a first-level data loss risk level, and a second-level data importance level matches a second-level data loss risk level); and determining a plurality of first storage devices among the plurality of third storage devices.
In a further embodiment, determining the data loss risk level for each second storage device based on the data loss risk rate may include: acquiring a plurality of preset risk rate intervals and the preset risk level corresponding to each risk rate interval, where the preset risk levels may include a first level, a second level, a third level and the like; determining the risk rate interval in which the data loss risk rate of each second storage device falls; and taking the preset risk level corresponding to that risk rate interval as the data loss risk level of the second storage device.
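The interval lookup can be sketched with a sorted list of interval boundaries. The boundary values, the half-open interval convention and the level labels below are placeholders; the text only says the intervals and their levels are preset.

```python
import bisect


def risk_level(rate: float, boundaries: list[float], levels: list[str]) -> str:
    """Map a risk rate to the level of the preset interval it falls in.

    boundaries must be sorted ascending and len(levels) == len(boundaries) + 1;
    interval i covers [boundaries[i-1], boundaries[i]).
    """
    if len(levels) != len(boundaries) + 1:
        raise ValueError("need exactly one more level than boundaries")
    return levels[bisect.bisect_right(boundaries, rate)]


# Hypothetical preset intervals: [0, 0.001), [0.001, 0.01), [0.01, +inf).
BOUNDARIES = [0.001, 0.01]
LEVELS = ["first", "second", "third"]
level = risk_level(0.0005, BOUNDARIES, LEVELS)
```

The same lookup could be reused to turn a data importance value into a data importance level, given its own preset intervals.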
In a further embodiment, determining the data importance level of the incremental data may include: obtaining the virtual machine level of the target virtual machine, wherein the virtual machine level is set manually when the target virtual machine is created and may include common, important, core, and so on; acquiring the historical use frequency of the target virtual machine in a preset historical time period, wherein each time a user uses the target virtual machine counts as one use and the number of uses is recorded, and the historical use frequency is the ratio of the accumulated number of uses within the preset historical time period to the duration of that period; acquiring the incremental backup frequency of the target virtual machine, wherein, because incremental backups of the target virtual machine are generally executed on a schedule, the period of the scheduled execution gives the incremental backup frequency; determining the data importance of the incremental data based on the virtual machine level, the historical use frequency and the incremental backup frequency, wherein a preset value corresponding to the virtual machine level is obtained (generally, the preset value for the common level is smaller than the preset value for the important level, which in turn is smaller than the preset value for the core level), and the preset value, the historical use frequency and the incremental backup frequency are weighted and summed, the result of the weighted summation being the data importance of the incremental data; in the weighted summation, the weight of the preset value is generally larger than the weight of the historical use frequency, and the weight of the historical use frequency is generally larger than the weight of the incremental backup frequency; and determining the data importance level of the incremental data based on the data importance, wherein the data importance level is determined in the same manner as the data loss risk level, which is not repeated here.
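A minimal sketch of the weighted summation follows. The preset values in `LEVEL_VALUES` and the three weights are hypothetical; the description only constrains their ordering (common < important < core, and level weight > use-frequency weight > backup-frequency weight). The sketch also assumes the two frequencies have already been brought to comparable scales, which the description does not address.

```python
# Hypothetical preset values and weights (assumptions, not from the patent).
LEVEL_VALUES = {"common": 1.0, "important": 2.0, "core": 3.0}
W_LEVEL, W_USAGE, W_BACKUP = 0.5, 0.3, 0.2


def historical_use_frequency(use_count: int, period_hours: float) -> float:
    """Accumulated number of uses divided by the length of the period."""
    if period_hours <= 0:
        raise ValueError("period must be positive")
    return use_count / period_hours


def data_importance(vm_level: str, use_frequency: float,
                    backup_frequency: float) -> float:
    """Weighted sum of the level's preset value and the two frequencies."""
    return (W_LEVEL * LEVEL_VALUES[vm_level]
            + W_USAGE * use_frequency
            + W_BACKUP * backup_frequency)
```

The resulting importance could then be mapped to an importance level with the same kind of interval lookup used for the data loss risk level.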
In a further embodiment, determining the plurality of first storage devices among the plurality of third storage devices may include: obtaining the number of data blocks into which the incremental data is divided when it is backed up, wherein the incremental data is generally divided according to a preset data block size, so that the number of data blocks can be obtained by rounding up the ratio of the storage capacity occupied by the incremental data to the preset data block size; sorting the plurality of third storage devices in descending order of the percentage of remaining storage space to obtain a device ranking; and taking the top-ranked third storage devices in the device ranking, in a number equal to the number of data blocks, as the plurality of first storage devices. This selects more suitable first storage devices and further improves the data security of the incremental backup.
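The block count and device ranking above can be illustrated with the short sketch below. The `Device` type and the function names are hypothetical. The description does not say what happens when there are fewer third storage devices than data blocks; this sketch simply returns all candidates in that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical record for a third storage device."""
    name: str
    total_bytes: int
    free_bytes: int

    @property
    def free_percent(self) -> float:
        return 100.0 * self.free_bytes / self.total_bytes


def block_count(data_bytes: int, block_bytes: int) -> int:
    """Round up data_bytes / block_bytes using integer arithmetic."""
    if block_bytes <= 0:
        raise ValueError("block size must be positive")
    return (data_bytes + block_bytes - 1) // block_bytes


def pick_first_devices(candidates: list[Device], n_blocks: int) -> list[Device]:
    """Rank by remaining-space percentage (descending) and keep the top n_blocks."""
    ranked = sorted(candidates, key=lambda d: d.free_percent, reverse=True)
    return ranked[:n_blocks]
```

Note that the ranking uses the percentage of remaining space, not the absolute free capacity, so a small, mostly empty device can outrank a large, half-full one.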
In addition to the above description of the incremental backup method for data based on distributed storage, the embodiment of the present application further provides an incremental backup system for data based on distributed storage, where the incremental backup system for data based on distributed storage includes:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to perform the steps of any of the distributed storage based incremental backup method embodiments described above.
FIG. 4 shows a schematic structural diagram of a distributed storage-based data incremental backup system according to an embodiment of the present application. Specifically:
the distributed storage-based data incremental backup system may include a processor 401 having one or more processing cores, a storage unit 402 comprising one or more computer-readable storage media, a power supply 403, an input unit 404, and other components. Those skilled in the art will appreciate that the structure shown in FIG. 4 does not limit the distributed storage-based data incremental backup system, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components. Wherein:
the processor 401 is a control center of the distributed storage-based data incremental backup system, connects respective portions of the entire distributed storage-based data incremental backup system using various interfaces and lines, and performs various functions and processes of the distributed storage-based data incremental backup system by running or executing software programs and/or modules stored in the storage unit 402 and calling data stored in the storage unit 402, thereby performing overall monitoring of the distributed storage-based data incremental backup system. Optionally, processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor and a modem processor, wherein the application processor mainly processes an operating system, a user interface, an application program, etc., and the modem processor mainly processes wireless communication. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The storage unit 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by running the software programs and modules stored in the storage unit 402. The storage unit 402 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created through use of the distributed storage-based data incremental backup system, and the like. In addition, the storage unit 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the storage unit 402 may also include a memory controller to provide the processor 401 with access to the storage unit 402.
The distributed storage-based data incremental backup system further includes a power supply 403 for supplying power to each component. Preferably, the power supply 403 may be logically connected to the processor 401 through a power management system, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management system. The power supply 403 may also include one or more of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
The distributed storage-based incremental data backup system may also include an input unit 404, where the input unit 404 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the distributed storage-based data incremental backup system may further include a display unit and the like, which are not described herein. Specifically, in the embodiment of the present application, the processor 401 in the distributed storage-based data incremental backup system loads the executable files corresponding to the processes of one or more application programs into the storage unit 402 according to the following instructions, and the processor 401 runs the application programs stored in the storage unit 402, thereby implementing the following functions:
acquiring incremental data to be backed up in a target virtual machine; determining the data loss risk rate of each preset storage device in the distributed storage system; determining a plurality of first storage devices matched with the incremental data in a plurality of preset storage devices based on the data loss risk rate; the incremental data is backed up to a plurality of first storage devices.
According to the data incremental backup method based on distributed storage, the incremental data is acquired, the plurality of first storage devices matched with the incremental data are screened out from the plurality of preset storage devices of the distributed storage system based on the data loss risk rate, and then the incremental data is backed up to the plurality of first storage devices, so that the incremental data is prevented from being backed up to the preset storage devices with larger data loss risk, and the data safety of the incremental backup is improved.
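The four steps can be tied together in a compact sketch. Everything here is illustrative: the function names, the single `threshold` parameter, and the one-block-per-device assignment are simplifying assumptions (the embodiments above select devices through capacity, risk level and remaining-space ranking rather than a single cutoff), and no real storage API is involved.

```python
def select_low_risk(risk_rates: dict[str, float], threshold: float) -> list[str]:
    """Devices whose risk rate is below the threshold, lowest risk first."""
    eligible = (dev for dev, rate in risk_rates.items() if rate < threshold)
    return sorted(eligible, key=risk_rates.get)


def split_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Divide the incremental data into blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def plan_backup(data: bytes, block_size: int,
                risk_rates: dict[str, float], threshold: float) -> dict[str, bytes]:
    """Assign one data block to each selected first storage device."""
    blocks = split_blocks(data, block_size)
    devices = select_low_risk(risk_rates, threshold)
    if len(devices) < len(blocks):
        raise RuntimeError("not enough low-risk devices for all data blocks")
    # zip stops after the last block, so surplus devices are left unused.
    return dict(zip(devices, blocks))
```

In this sketch a device whose risk rate is at or above the threshold never receives a block, which mirrors the stated goal of keeping incremental data off devices with a larger data loss risk.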
To this end, embodiments of the present application provide a computer-readable storage medium, which may include: read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and the like. The computer-readable storage medium stores a plurality of instructions that can be loaded by a processor to perform the steps of any of the distributed storage-based data incremental backup methods provided in the embodiments of the present application. For example, the instructions may perform the following steps:
acquiring incremental data to be backed up in a target virtual machine; determining the data loss risk rate of each preset storage device in the distributed storage system; determining a plurality of first storage devices matched with the incremental data in a plurality of preset storage devices based on the data loss risk rate; the incremental data is backed up to a plurality of first storage devices.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The foregoing describes in detail the distributed storage-based data incremental backup method, system and storage medium provided in the embodiments of the present application. Specific examples are used herein to illustrate the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method and core ideas of the present application. Meanwhile, those skilled in the art may, in light of the ideas of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this description should not be construed as limiting the present application.

Claims (7)

1. A distributed storage-based incremental backup method for data, the method comprising:
acquiring incremental data to be backed up in a target virtual machine;
determining the data loss risk rate of each preset storage device in the distributed storage system;
determining a plurality of first storage devices matched with the incremental data in a plurality of preset storage devices based on the data loss risk rate;
backing up the incremental data to the plurality of first storage devices;
the determining, based on the data loss risk rate, a plurality of first storage devices matched with the incremental data in a plurality of preset storage devices includes: acquiring the occupied storage capacity of the incremental data; determining a plurality of second storage devices in a plurality of preset storage devices based on the occupied storage capacity; determining a data loss risk level of each second storage device based on the data loss risk rate; determining the data importance level of the incremental data; determining a plurality of third storage devices of data loss risk levels matched with the data importance levels in the plurality of second storage devices; acquiring the number of data blocks into which the incremental data is divided when the incremental data is backed up; sorting the plurality of third storage devices from large to small according to the percentage of the remaining storage space to obtain device ranks; a plurality of third storage devices which are ranked in the device ranking and matched with the number of the data blocks are used as a plurality of first storage devices;
the data loss risk rate of the first storage device is smaller than a data risk loss rate threshold, and the data risk loss rate threshold corresponds to the data importance of the incremental data.
2. The incremental backup method of data based on distributed storage according to claim 1, wherein determining the risk of data loss of each preset storage device in the distributed storage system comprises:
acquiring the historical data loss times of each preset storage device;
acquiring accumulated operation time of each preset storage device;
determining the historical data loss frequency of each preset storage device based on the historical data loss times and the accumulated operation time length;
and determining the data loss risk rate of each preset storage device based on the historical data loss frequency.
3. The incremental backup method of data based on distributed storage of claim 2 wherein determining a risk of data loss for each of the predetermined storage devices based on the historical data loss frequency comprises:
acquiring equipment performance parameters of each preset storage equipment;
determining a device performance score of each preset storage device based on the device performance parameters;
determining a frequency correction coefficient of each preset storage device based on the device performance score;
and correcting the historical data loss frequency by adopting the frequency correction coefficient to obtain the data loss risk rate of each preset storage device.
4. The incremental backup method of data based on distributed storage of claim 3 wherein determining the frequency correction factor for each of the predetermined storage devices based on the device performance scores comprises:
acquiring the used storage capacity of each preset storage device;
acquiring the ratio between the used storage capacity of each preset storage device and the maximum storage capacity of the preset storage device;
determining a storage capacity score of each preset storage device based on the used storage capacity and the ratio;
and determining the frequency correction coefficient of each preset storage device based on the storage capacity score and the device performance score.
5. The incremental backup method of data based on distributed storage of claim 1 wherein the determining the level of data importance of the incremental data comprises:
obtaining a virtual machine level of the target virtual machine;
acquiring historical use frequency of the target virtual machine in a preset historical time period;
acquiring the incremental backup frequency of the target virtual machine;
determining the data importance of the incremental data based on the virtual machine level, the historical use frequency and the incremental backup frequency;
and determining the data importance level of the incremental data based on the data importance.
6. A distributed storage-based data incremental backup system, the distributed storage-based data incremental backup system comprising:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and are configured to be executed by the processor to implement the steps in the distributed storage based incremental backup method of any one of claims 1 to 5.
7. A computer readable storage medium having stored thereon a computer program, the computer program being loaded by a processor to perform the steps of the distributed storage based incremental backup method of any one of claims 1 to 5.
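As an illustration of the risk rate computation recited in claims 2 to 4, the sketch below divides the historical loss count by the accumulated operation time and then applies a correction coefficient. The claims do not give a formula for the coefficient; the form used here (scores in the range 0 to 1, higher meaning healthier, averaged and subtracted from 2) is purely an assumption, as are all function names.

```python
def loss_frequency(loss_count: int, runtime_hours: float) -> float:
    """Historical data loss count divided by accumulated operation time."""
    if runtime_hours <= 0:
        raise ValueError("accumulated operation time must be positive")
    return loss_count / runtime_hours


def correction_coefficient(performance_score: float, capacity_score: float) -> float:
    """Assumed form: both scores lie in [0, 1]; healthier devices get a
    coefficient closer to 1, weaker devices a coefficient closer to 2."""
    for score in (performance_score, capacity_score):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
    return 2.0 - (performance_score + capacity_score) / 2.0


def data_loss_risk_rate(loss_count: int, runtime_hours: float,
                        performance_score: float, capacity_score: float) -> float:
    """Correct the historical loss frequency with the coefficient."""
    return (loss_frequency(loss_count, runtime_hours)
            * correction_coefficient(performance_score, capacity_score))
```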
CN202311248908.XA 2023-09-26 2023-09-26 Data incremental backup method, system and storage medium based on distributed storage Active CN116991636B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311248908.XA CN116991636B (en) 2023-09-26 2023-09-26 Data incremental backup method, system and storage medium based on distributed storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311248908.XA CN116991636B (en) 2023-09-26 2023-09-26 Data incremental backup method, system and storage medium based on distributed storage

Publications (2)

Publication Number Publication Date
CN116991636A (en) 2023-11-03
CN116991636B (en) 2024-01-19

Family

ID=88525173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311248908.XA Active CN116991636B (en) 2023-09-26 2023-09-26 Data incremental backup method, system and storage medium based on distributed storage

Country Status (1)

Country Link
CN (1) CN116991636B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007249760A (en) * 2006-03-17 2007-09-27 Nec Corp Distributed mutual backup system, information processor, distributed mutual backup method and program
CN112612645A (en) * 2020-12-24 2021-04-06 深圳市科力锐科技有限公司 Backup standard reaching rate determining method, equipment, storage medium and device
CN115712549A (en) * 2022-11-22 2023-02-24 阿里巴巴(中国)有限公司 Performance evaluation method, device and storage medium
CN116700620A (en) * 2023-06-15 2023-09-05 新华三云计算技术有限公司 Data storage method, device, equipment and storage medium
CN116755939A (en) * 2023-08-14 2023-09-15 北京泰利思诺信息技术股份有限公司 Intelligent data backup task planning method and system based on system resources

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10496492B2 (en) * 2018-02-02 2019-12-03 EMC IP Holding Company LLC Virtual machine backup with efficient checkpoint handling based on a consistent state of the virtual machine of history data and a backup type of a current consistent state of the virtual machine
CN110413216B (en) * 2018-04-28 2023-07-18 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for managing a storage system
US11809280B2 (en) * 2021-03-05 2023-11-07 EMC IP Holding Company LLC Synchronizing expirations for incremental backup data stored on a cloud-based object storage

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
面向大数据的并行数据分布式备份存储仿真;姚迎乐等;计算机仿真(08);全文 *

Also Published As

Publication number Publication date
CN116991636A (en) 2023-11-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant