WO2019232993A1 - Adaptive data recovery flow control method and apparatus, electronic device and storage medium - Google Patents

Adaptive data recovery flow control method and apparatus, electronic device and storage medium

Info

Publication number
WO2019232993A1
WO2019232993A1 PCT/CN2018/108128 CN2018108128W WO2019232993A1 WO 2019232993 A1 WO2019232993 A1 WO 2019232993A1 CN 2018108128 W CN2018108128 W CN 2018108128W WO 2019232993 A1 WO2019232993 A1 WO 2019232993A1
Authority
WO
WIPO (PCT)
Prior art keywords
statistical period
flow control
data block
control threshold
load category
Prior art date
Application number
PCT/CN2018/108128
Other languages
French (fr)
Chinese (zh)
Inventor
陈学伟
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019232993A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0613 Improving I/O performance in relation to throughput
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device

Definitions

  • the present application relates to the field of computer technology, and in particular, to an adaptive data recovery flow control method, device, electronic device, and storage medium.
  • a common data redundancy strategy is to store multiple copies of data on different physical nodes. When some copies are damaged, the damaged copies can be repaired based on the intact copies.
  • a first aspect of the present application provides an adaptive data recovery flow control method, where the method includes:
  • steps d)-f) are repeatedly performed until a recovery operation is performed on data in all statistical periods of the failed storage node.
  • a second aspect of the present application provides an adaptive data recovery flow control device, where the device includes:
  • a synchronization module configured to regularly synchronize information of each storage node in the distributed storage system;
  • a detection module configured to detect whether a storage node has failed;
  • an obtaining module configured to obtain a storage list of the failed storage node when the detection module detects that a storage node has failed;
  • an identification module configured to identify the IO load category of the user application in the previous statistical period;
  • a calculation module configured to calculate a flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period; and
  • a recovery module configured to perform a recovery operation on the data in the current statistical period of the failed storage node according to the storage list and the flow control threshold corresponding to the current statistical period.
  • a third aspect of the present application provides an electronic device.
  • the electronic device includes a processor and a memory, where the memory is configured to store at least one instruction, and the processor is configured to execute the at least one instruction to implement the following steps:
  • steps d)-f) are repeatedly performed until a recovery operation is performed on data in all statistical periods of the failed storage node.
  • a fourth aspect of the present application provides a non-volatile readable storage medium. At least one instruction is stored on the non-volatile readable storage medium, and when the at least one instruction is executed by a processor, the following steps are implemented:
  • steps d)-f) are repeatedly performed until a recovery operation is performed on data in all statistical periods of the failed storage node.
  • the adaptive data recovery flow control method, device, electronic device and storage medium described in the present application can divide a recovery period into multiple statistical periods. In each statistical period, the flow control threshold of the current statistical period is dynamically adjusted according to the IO load category of the user application in the previous statistical period, and the data in the current statistical period is recovered according to the adjusted flow control threshold.
  • when the IO load of user applications in the previous statistical period is high, the flow control threshold for fault recovery in the current statistical period is reduced, so as to reduce the intensity of fault recovery and protect the business IO load.
  • FIG. 1 is a flowchart of an adaptive data recovery flow control method provided in Embodiment 1 of the present application.
  • FIG. 2 is a functional block diagram of an adaptive data recovery flow control device provided in Embodiment 2 of the present application.
  • FIG. 3 is a schematic diagram of an electronic device according to a third embodiment of the present application.
  • the adaptive data recovery flow control method in the embodiment of the present application is applied to one or more electronic devices.
  • the adaptive data recovery flow control method can also be applied to a hardware environment composed of an electronic device and a server connected to the electronic device through a network.
  • the network includes, but is not limited to: a wide area network, a metropolitan area network, or a local area network.
  • the adaptive data recovery flow control method in the embodiment of the present application may be executed by a server or an electronic device; it may also be executed jointly by the server and the electronic device.
  • the adaptive data recovery flow control function provided by the method of the present application can be integrated directly on the electronic device, or run on the electronic device as a client.
  • the method provided in this application can also run on a device such as a server in the form of a Software Development Kit (SDK), which provides an interface to the adaptive data recovery flow control function; the electronic device or other devices can implement the function of adaptively controlling data recovery through the provided interface.
  • FIG. 1 is a flowchart of an adaptive data recovery flow control method provided in Embodiment 1 of the present application. According to different requirements, the execution order in this flowchart can be changed, and some steps can be omitted.
  • the distributed storage system (hereinafter referred to as a storage system) adopts a cluster storage method for distributed data storage.
  • distributed storage is a data storage technology that uses, over the network, the remaining disk space on each storage system in the cluster and integrates these scattered storage resources into a single virtual storage device, so that data is stored across the nodes of the cluster.
  • each storage node described in this application is a sub storage system in the cluster.
  • the storage node may be a storage server, a computer, or a storage device.
  • synchronizing the information of each storage node in the distributed storage system may include: 1) a storage center in the storage system synchronizes the information of each storage node; or 2) in a decentralized method, any one storage node in the storage system initiates the information synchronization of each storage node.
  • the synchronized information of each storage node may include, but is not limited to, CPU, memory, free disk space, and the list of stored files.
  • the storage file list records information such as the name, size, and location of data stored in each storage node.
  • the failure of a storage node may be that any one or more storage nodes in the storage system cannot start, are powered off, or are disconnected from the network, or that disks in any one or more storage nodes have failed. Therefore, detecting whether a storage node has failed includes: detecting whether any one or more storage nodes in the storage system have failed to start, are powered off, or are disconnected from the network, or whether disks in any one or more storage nodes in the storage system have failed.
  • when any storage node in the storage system fails, for example it fails to start, is powered off, or is disconnected from the network, the failed storage node loses its connection to the other storage nodes and/or the storage center, so the other storage nodes and/or the storage center can detect that a storage node has failed.
  • when a disk in a storage node fails, the synchronization information sent by that storage node to the other storage nodes and/or the storage center includes the failure information of the disk, so the other storage nodes and/or the storage center can likewise detect that a storage node has failed.
  • when it is detected that a storage node has failed, step S13 is performed; when no storage node failure is detected, step S12 continues.
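  • the detection rule above can be sketched as follows. This is only a minimal illustration under stated assumptions: `NodeStatus`, `reachable`, `disk_failed`, and `find_failed_nodes` are hypothetical names that do not appear in the application, and the sketch assumes that each synchronization round records whether a node answered and whether it reported a disk failure.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    """Hypothetical per-node record built from the latest synchronization round."""
    node_id: str
    reachable: bool    # False if the node did not answer (cannot start, powered off, network down)
    disk_failed: bool  # True if the node's own synchronization message reported a failed disk


def find_failed_nodes(statuses):
    """Return the ids of nodes treated as failed: unreachable, or reporting a disk failure."""
    return [s.node_id for s in statuses if not s.reachable or s.disk_failed]


statuses = [
    NodeStatus("node-a", reachable=True, disk_failed=False),
    NodeStatus("node-b", reachable=False, disk_failed=False),
    NodeStatus("node-c", reachable=True, disk_failed=True),
]
failed = find_failed_nodes(statuses)
```

An empty result corresponds to continuing with step S12; a non-empty result corresponds to moving on to step S13.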
  • obtaining the storage list of the storage node that has failed includes obtaining information such as the name, size, and location of data stored in the storage node that has failed.
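  • one possible in-memory shape for such a storage list is sketched below. The record fields follow the name, size, and location mentioned above; the class name `StoredItem`, the function `storage_list_for`, the byte unit, and the path-like location format are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredItem:
    """Hypothetical entry of a storage list: one piece of data held by a node."""
    name: str
    size_bytes: int  # unit assumed for the example
    location: str    # format assumed, e.g. a path on the node


def storage_list_for(node_id, synced_lists):
    """Return a copy of the last synchronized storage list of a node (empty if unknown)."""
    return list(synced_lists.get(node_id, []))


synced_lists = {
    "node-b": [
        StoredItem("block-0001", 4096, "/disk0/block-0001"),
        StoredItem("block-0002", 8192, "/disk1/block-0002"),
    ],
}
to_recover = storage_list_for("node-b", synced_lists)
total_bytes = sum(item.size_bytes for item in to_recover)
```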
  • a recovery period may include multiple statistical periods, and a statistical period may be a preset time period. For example, a statistical period is set to 1 second.
  • the IO load category includes: a high load category, a normal load category, and a low load category.
  • identifying the IO load category of the user application in the previous statistical period may include:
  • the average data block size of the IO in the last statistical period may be calculated by using an arithmetic average algorithm, a geometric mean algorithm, or a root mean square algorithm, where N is the number of data blocks of IO and S_i is the data block size of the i-th IO:
  • arithmetic mean: $\bar{S} = \frac{1}{N}\sum_{i=1}^{N} S_i$
  • geometric mean: $\bar{S} = \left(\prod_{i=1}^{N} S_i\right)^{1/N}$
  • root mean square: $\bar{S} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} S_i^{2}}$
  • for example, assume that the data block sizes of ten IOs in the previous statistical period are 2M, 1M, 3M, 0.5M, 10M, 4M, 0.1M, 1.2M, 5M, and 8M; the arithmetic mean is then 34.8M / 10 = 3.48M.
  • the transmission delay refers to the time a node needs to push a data block onto the transmission medium when transmitting data, that is, the time from when the sending station starts sending a data frame until the data frame has been completely sent.
  • the transmission delay of the data block may be obtained from a load measurement tool or a performance monitoring tool installed in each storage node.
  • the average data block delay of the IO in the last statistical period may also be calculated by using an arithmetic average algorithm, a geometric mean algorithm, or a root mean square algorithm. For example, assuming that the transmission delays of ten IOs in the previous statistical period are 1s, 0.8s, 1.5s, 0.4s, 5s, 2s, 0.02s, 0.6s, 3s, and 4.5s, the average IO data block delay in the previous statistical period calculated using the arithmetic mean algorithm is: (1 + 0.8 + 1.5 + 0.4 + 5 + 2 + 0.02 + 0.6 + 3 + 4.5) / 10 = 1.882s.
  • the same algorithm is used for both averages: if the average data block size of the IO in the previous statistical period is calculated using the arithmetic average algorithm, the average data block delay of the IO in the previous statistical period is also calculated using the arithmetic average algorithm; if the average data block size is calculated using the geometric mean algorithm, the average data block delay is also calculated using the geometric mean algorithm; and if the average data block size is calculated using the root mean square algorithm, the average data block delay is also calculated using the root mean square algorithm.
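  • the three averaging options, and the rule that size and delay use the same one, can be checked with a short sketch. The function and variable names (`period_averages`, `AVERAGERS`, and so on) are hypothetical; the sample values are the ten block sizes and ten delays from the examples above.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # Computed as exp(mean(log v)); requires strictly positive values.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def root_mean_square(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


AVERAGERS = {
    "arithmetic": arithmetic_mean,
    "geometric": geometric_mean,
    "rms": root_mean_square,
}


def period_averages(sizes, delays, method="arithmetic"):
    """Average block size and block delay with the SAME method, as the text requires."""
    average = AVERAGERS[method]
    return average(sizes), average(delays)


sizes = [2, 1, 3, 0.5, 10, 4, 0.1, 1.2, 5, 8]            # in the "M" units of the example
delays = [1, 0.8, 1.5, 0.4, 5, 2, 0.02, 0.6, 3, 4.5]     # seconds
avg_size, avg_delay = period_averages(sizes, delays, "arithmetic")
```

With the arithmetic mean this gives an average size of about 3.48M and an average delay of about 1.882s. For positive inputs the geometric mean is never larger, and the root mean square never smaller, than the arithmetic mean, so the choice of method shifts the values that are later fed to the classifier.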
  • the reference value of the IO data block size and the reference value of the corresponding data block delay may be preset by an administrator of the storage system according to experience. For example, if experience shows that the delay is smallest when a 4K data block is transmitted and can reach 50ms in the ideal state, the reference value of the IO data block size can be set to 4K and the corresponding data block delay reference value can be set to 50ms.
  • assume that the average data block size of the IO in the previous statistical period is X, the average data block delay is Y, the reference value of the data block size is M, and the reference value of the corresponding data block delay is N; the calculation formula of the IO load intensity in the previous statistical period is:
  • the load classification model includes, but is not limited to, a Support Vector Machine (SVM) model.
  • the average data block size of the IO in the last statistical period, the average data block delay of the IO in the last statistical period, and the IO load intensity in the last statistical period are used as the input of the load classification model, which computes and outputs the IO load category in the previous statistical period.
  • the training process of the load classification model includes:
  • training samples in the training sets of different load categories are distributed to different folders. For example, training samples of high load category are distributed to the first folder, training samples of normal load category are distributed to the second folder, and training samples of low load category are distributed to the third folder.
  • training samples of a first preset ratio (for example, 70%)
  • training samples of a second preset ratio (for example, 30%)
  • if the accuracy rate is greater than or equal to a preset accuracy rate, training ends and the trained load classification model is used as a classifier to identify the IO load category in the current statistical period; if the accuracy rate is less than the preset accuracy rate, the number of positive samples and the number of negative samples are increased and the load classification model is retrained until the accuracy rate is greater than or equal to the preset accuracy rate.
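  • the train, validate, and retrain loop can be sketched as below. Several things here are assumptions and not taken from the application: the first and second preset ratios are read as a training/validation split, a nearest-centroid classifier stands in for the SVM so that the sketch needs only the standard library, the samples are synthetic toy data with a single feature (a load intensity value), and all function names are hypothetical.

```python
import random


def make_toy_samples(n, seed=1):
    """Synthetic, labeled samples: one feature per sample (toy data, not measurements)."""
    rng = random.Random(seed)
    centers = {"high": 9.0, "normal": 5.0, "low": 1.0}
    labels = ("high", "normal", "low")
    samples = []
    for i in range(n):
        label = labels[i % 3]
        samples.append(([centers[label] + rng.uniform(-1.0, 1.0)], label))
    return samples


def split_samples(samples, train_ratio=0.7, seed=0):
    """Shuffle, then split into a training part and a validation part (assumed reading)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def train_centroids(train):
    """Stand-in classifier (NOT an SVM): the mean feature vector of each load category."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the category whose centroid is closest to the feature vector."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def accuracy(centroids, validation):
    hits = sum(predict(centroids, f) == label for f, label in validation)
    return hits / len(validation)


def train_until_accurate(make_samples, target=0.9, start=30, max_rounds=5):
    """Retrain with more samples until validation accuracy reaches the preset target."""
    n = start
    for _ in range(max_rounds):
        train, validation = split_samples(make_samples(n))
        model = train_centroids(train)
        acc = accuracy(model, validation)
        if acc >= target:
            break
        n *= 2  # increase the number of samples and retrain
    return model, acc


model, acc = train_until_accurate(make_toy_samples)
```

Swapping in a real SVM would only change `train_centroids` and `predict`; the split, the accuracy check, and the retraining loop would stay the same.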
  • flow control refers to traffic flow control. There are two methods for implementing flow control: one is to implement flow control based on source address, destination address, source port, destination port, and protocol type through the QoS module of routers and switches; the other is to implement application-based flow control using dedicated flow control equipment.
  • each statistical period in the recovery period can correspond to a flow control threshold, and the flow control threshold corresponding to each statistical period is dynamically adjusted.
  • the flow control threshold corresponding to the current statistical period can be calculated from the IO load category in the previous statistical period, and the flow control threshold corresponding to the next statistical period can be calculated from the IO load category in the current statistical period.
  • the flow control threshold corresponding to the first statistical period in the recovery period is a preset flow control threshold, which can be set in advance by the administrator of the storage system based on experience. That is, the preset flow control threshold is used as the flow control threshold of the first statistical period in the recovery period; the flow control threshold corresponding to the second statistical period is calculated according to the IO load category in the first statistical period; the flow control threshold corresponding to the third statistical period is calculated according to the IO load category in the second statistical period; and so on.
  • calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period may include:
  • when the IO load category in the previous statistical period is the high load category, the flow control threshold is reduced by a first preset amplitude, so that the data of the failed storage node is recovered with a lower flow control threshold in the current statistical period. This slows data recovery to preserve efficient access for user applications.
  • the first preset amplitude may be 1/2 of the flow control threshold corresponding to the previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1/2 of the flow control threshold corresponding to the previous statistical period, and the flow control threshold corresponding to the next statistical period is 1/2 of the flow control threshold corresponding to the current statistical period.
  • when the IO load category in the previous statistical period is the low load category, the flow control threshold is increased by a second preset amplitude, so that the data of the failed storage node is recovered with a higher flow control threshold in the current statistical period. This speeds up data recovery.
  • the second preset amplitude may be 1.5 times the flow control threshold corresponding to the previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1.5 times the flow control threshold corresponding to the previous statistical period, and the flow control threshold corresponding to the next statistical period is 1.5 times the flow control threshold corresponding to the current statistical period.
  • when the IO load category in the previous statistical period is the normal load category, the flow control threshold corresponding to the previous statistical period is used as the flow control threshold corresponding to the current statistical period.
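  • the threshold update rule can be sketched as a small function. The factors 1/2 and 1.5 are the example amplitudes given above; the function names, the category strings, and the initial value of 100.0 (in unspecified units) are assumptions for illustration.

```python
HIGH, NORMAL, LOW = "high", "normal", "low"


def next_threshold(previous_threshold, previous_load, decrease=0.5, increase=1.5):
    """Flow control threshold for the current period from the previous period's load category."""
    if previous_load == HIGH:
        return previous_threshold * decrease   # slow recovery down to protect user IO
    if previous_load == LOW:
        return previous_threshold * increase   # speed recovery up while user IO is light
    if previous_load == NORMAL:
        return previous_threshold              # keep the threshold unchanged
    raise ValueError(f"unknown load category: {previous_load!r}")


def threshold_schedule(preset_threshold, load_categories):
    """thresholds[0] is the preset value; thresholds[k] is derived from the load in period k-1."""
    thresholds = [preset_threshold]
    for load in load_categories:
        thresholds.append(next_threshold(thresholds[-1], load))
    return thresholds


schedule = threshold_schedule(100.0, [HIGH, HIGH, NORMAL, LOW])
```

For the load sequence high, high, normal, low, the schedule is 100, 50, 25, 25, 37.5: two high-load periods halve the threshold twice, a normal period holds it, and a low-load period raises it by half.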
  • when it is determined that a recovery operation has been performed on the data in all statistical periods of the failed storage node, the process ends; when it is determined that a recovery operation has not yet been performed on the data in all statistical periods of the failed storage node, the process returns to step S14 described above.
  • in summary, the adaptive data recovery flow control method described in this application periodically synchronizes information of each storage node in a distributed storage system; when a failure of a storage node is detected, acquires the storage list of the failed storage node; identifies the IO load category of the user application in the previous statistical period; calculates the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period; and, according to the storage list and the flow control threshold corresponding to the current statistical period, performs a recovery operation on the data in the current statistical period of the failed storage node, until a recovery operation has been performed on the data in all statistical periods of the failed storage node.
  • this application can divide a recovery period into multiple statistical periods. In each statistical period, the flow control threshold of the current statistical period is dynamically adjusted according to the IO load category of the user application in the previous statistical period, and the data in the current statistical period is recovered according to the adjusted flow control threshold.
  • when the IO load of user applications in the previous statistical period is high, the flow control threshold for fault recovery in the current statistical period is reduced, so as to reduce the intensity of fault recovery and protect the business IO load.
  • when the IO load of user applications is low, the flow control threshold for fault recovery in the current statistical period is increased, so as to raise the fault recovery intensity and return the distributed storage system to a healthy state as soon as possible. That is, this application can improve the data recovery efficiency of a large-scale distributed storage system and reduce the risk of data loss, while avoiding a significant impact on the performance of normal input and output services, and has a good flow control effect.
  • in addition, the flow control threshold of the current statistical period is adjusted automatically according to the IO load category of the user application in the previous statistical period, without manual adjustment by the administrator, which reduces the administrator's workload and avoids inaccurate adjustment caused by subjective factors. The threshold can also be adjusted dynamically as the distributed storage system and its hardware facilities change, and therefore has high reliability.
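  • the overall loop over statistical periods can be sketched as follows. This is a minimal sketch under assumptions that the application does not state: the flow control threshold is interpreted as the maximum amount of data recovered in one statistical period, items are recovered whole, and at least one item is recovered per period so the loop always terminates. The names `run_recovery`, `adjust`, and `load_of_period` are hypothetical.

```python
def adjust(threshold, load):
    """Halve on high load, multiply by 1.5 on low load, keep unchanged otherwise."""
    return {"high": threshold * 0.5, "low": threshold * 1.5}.get(load, threshold)


def run_recovery(storage_list, preset_threshold, load_of_period):
    """Recover (name, size) items period by period; return (period, threshold, names) records.

    load_of_period(k) returns the IO load category observed in statistical period k.
    """
    pending = list(storage_list)
    threshold = preset_threshold  # the first period uses the preset threshold
    log = []
    period = 0
    while pending:
        if period > 0:
            # Later periods derive their threshold from the previous period's load.
            threshold = adjust(threshold, load_of_period(period - 1))
        budget = threshold
        recovered = []
        # Take items while they fit the budget; always take at least one to make progress.
        while pending and (pending[0][1] <= budget or not recovered):
            name, size = pending.pop(0)
            budget -= size
            recovered.append(name)
        log.append((period, threshold, recovered))
        period += 1
    return log


loads = ["high", "low"]
log = run_recovery(
    [("a", 40), ("b", 40), ("c", 40), ("d", 40)],
    preset_threshold=100.0,
    load_of_period=lambda k: loads[k] if k < len(loads) else "normal",
)
```

In this run the first period recovers two items under the preset threshold of 100, the high load seen in that period halves the threshold to 50 so only one item is recovered next, and the following low-load observation raises it to 75 for the last item.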
  • FIG. 2 is a functional module diagram of a preferred embodiment of an adaptive data recovery flow control device of the present application.
  • the adaptive data recovery flow control device 20 (hereinafter referred to as "data recovery flow control device 20") runs in an electronic device.
  • the data recovery flow control device 20 may include a plurality of functional modules composed of program code segments.
  • the program code of each program segment in the data recovery flow control device 20 may be stored in a memory and executed by at least one processor to perform the adaptive data recovery flow control method (see FIG. 1 and the related description for details).
  • the data recovery flow control device 20 of the electronic device may be divided into a plurality of functional modules according to functions performed by the device.
  • the functional modules may include a synchronization module 201, a detection module 202, an acquisition module 203, an identification module 204, a training module 205, a calculation module 206, a recovery module 207, and a judgment module 208.
  • the module referred to in the present application refers to a series of computer-readable instruction segments capable of being executed by at least one processor and capable of performing fixed functions, which are stored in a memory. In some embodiments, functions of each module will be described in detail in subsequent embodiments.
  • the synchronization module 201 is configured to periodically synchronize information of each storage node in the distributed storage system.
  • the distributed storage system (hereinafter referred to as a storage system) adopts a cluster storage method for distributed data storage.
  • distributed storage is a data storage technology that uses, over the network, the remaining disk space on each storage system in the cluster and integrates these scattered storage resources into a single virtual storage device, so that data is stored across the nodes of the cluster.
  • each storage node described in this application is a sub storage system in the cluster.
  • the storage node may be a storage server, a computer, or a storage device.
  • the synchronization module 201 synchronizing the information of each storage node in the distributed storage system may include: 1) a storage center in the storage system synchronizes the information of each storage node; or 2) in a decentralized method, any one storage node in the storage system initiates the information synchronization of each storage node.
  • the synchronized information of each storage node may include, but is not limited to, CPU, memory, free disk space, and the list of stored files.
  • the storage file list records information such as the name, size, and location of data stored in each storage node.
  • the detection module 202 is configured to detect whether a storage node has failed.
  • the failure of a storage node may be that any one or more storage nodes in the storage system cannot start, are powered off, or are disconnected from the network, or that disks in any one or more storage nodes have failed. Therefore, the detection module 202 detecting whether a storage node has failed includes: detecting whether any one or more storage nodes in the storage system have failed to start, are powered off, or are disconnected from the network, or whether disks in any one or more storage nodes in the storage system have failed.
  • when any storage node in the storage system fails, for example it fails to start, is powered off, or is disconnected from the network, the failed storage node loses its connection to the other storage nodes and/or the storage center, so the other storage nodes and/or the storage center can detect that a storage node has failed.
  • when a disk in a storage node fails, the synchronization information sent by that storage node to the other storage nodes and/or the storage center includes the failure information of the disk, so the other storage nodes and/or the storage center can likewise detect that a storage node has failed.
  • An obtaining module 203 is configured to obtain a storage list of a storage node that has failed when the detection module 202 detects that a storage node has failed.
  • obtaining the storage list of the storage node that has failed includes obtaining information such as the name, size, and location of data stored in the storage node that has failed.
  • the identification module 204 is configured to identify an IO load category of a user application in a previous statistical period.
  • a recovery period may include multiple statistical periods, and a statistical period may be a preset time period. For example, a statistical period is set to 1 second.
  • the IO load category includes: a high load category, a normal load category, and a low load category.
  • the identification module 204 identifying the IO load category of the user application in the previous statistical period may include:
  • the average data block size of the IO in the last statistical period may be calculated by using an arithmetic average algorithm, a geometric mean algorithm, or a root mean square algorithm, where N is the number of data blocks of IO and S_i is the data block size of the i-th IO:
  • arithmetic mean: $\bar{S} = \frac{1}{N}\sum_{i=1}^{N} S_i$
  • geometric mean: $\bar{S} = \left(\prod_{i=1}^{N} S_i\right)^{1/N}$
  • root mean square: $\bar{S} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} S_i^{2}}$
  • for example, assume that the data block sizes of ten IOs in the previous statistical period are 2M, 1M, 3M, 0.5M, 10M, 4M, 0.1M, 1.2M, 5M, and 8M; the arithmetic mean is then 34.8M / 10 = 3.48M.
  • the transmission delay refers to the time a node needs to push a data block onto the transmission medium when transmitting data, that is, the time from when the sending station starts sending a data frame until the data frame has been completely sent, or, at the receiving side, the time from when the receiving station starts receiving a data frame until it has finished receiving it.
  • the transmission delay of the data block may be obtained from a load measurement tool or a performance monitoring tool installed in each storage node.
  • the average data block delay of the IO in the last statistical period may also be calculated by using an arithmetic average algorithm, a geometric mean algorithm, or a root mean square algorithm. For example, assuming that the transmission delays of ten IOs in the previous statistical period are 1s, 0.8s, 1.5s, 0.4s, 5s, 2s, 0.02s, 0.6s, 3s, and 4.5s, the average IO data block delay in the previous statistical period calculated using the arithmetic mean algorithm is: (1 + 0.8 + 1.5 + 0.4 + 5 + 2 + 0.02 + 0.6 + 3 + 4.5) / 10 = 1.882s.
  • the same algorithm is used for both averages: if the average data block size of the IO in the previous statistical period is calculated using the arithmetic average algorithm, the average data block delay of the IO in the previous statistical period is also calculated using the arithmetic average algorithm; if the average data block size is calculated using the geometric mean algorithm, the average data block delay is also calculated using the geometric mean algorithm; and if the average data block size is calculated using the root mean square algorithm, the average data block delay is also calculated using the root mean square algorithm.
  • the reference value of the IO data block size and the reference value of the corresponding data block delay may be preset by an administrator of the storage system according to experience. For example, if experience shows that the delay is smallest when a 4K data block is transmitted and can reach 50ms in the ideal state, the reference value of the IO data block size can be set to 4K and the corresponding data block delay reference value can be set to 50ms.
  • assume that the average data block size of the IO in the previous statistical period is X, the average data block delay is Y, the reference value of the data block size is M, and the reference value of the corresponding data block delay is N; the calculation formula of the IO load intensity in the previous statistical period is:
  • the load classification model includes, but is not limited to, a Support Vector Machine (SVM) model.
  • the average data block size of the IO in the last statistical period, the average data block delay of the IO in the last statistical period, and the IO load intensity in the last statistical period are used as the input of the load classification model, which computes and outputs the IO load category in the previous statistical period.
  • the training module 205 is configured to train the load classification model.
  • the process of the training module 205 training the load classification model includes:
  • training samples in the training sets of different load categories are distributed to different folders. For example, training samples of high load category are distributed to the first folder, training samples of normal load category are distributed to the second folder, and training samples of low load category are distributed to the third folder.
  • training samples of the first preset ratio for example, 70%
  • second preset ratios for example, 30%
  • if the accuracy rate is greater than or equal to a preset accuracy rate, training ends and the trained load classification model is used as a classifier to identify the IO load category in the current statistical period; if the accuracy rate is less than the preset accuracy rate, the number of positive samples and the number of negative samples are increased and the load classification model is retrained until the accuracy rate is greater than or equal to the preset accuracy rate.
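The train, verify and retrain loop described above can be sketched as follows. This is a hedged illustration only: a simple nearest-centroid classifier stands in for the SVM named in the text so that the example needs nothing beyond the standard library, and the function names, the 0.9 accuracy target, the `more_samples` callback and the `max_rounds` limit are all assumptions.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# (features, label); features are assumed to be
# (average block size, average block delay, load intensity).
Sample = Tuple[Tuple[float, ...], str]
Predictor = Callable[[Tuple[float, ...]], str]


def split_samples(samples: Sequence[Sample], train_ratio: float = 0.7,
                  seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle and split into a training set (e.g. 70%) and a verification set (e.g. 30%)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def fit_nearest_centroid(train: Sequence[Sample]) -> Predictor:
    """Stand-in classifier (NOT an SVM): predicts the label of the closest class centroid."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

    def predict(features: Tuple[float, ...]) -> str:
        return min(centroids, key=lambda label: sum(
            (a - b) ** 2 for a, b in zip(features, centroids[label])))

    return predict


def accuracy(predict: Predictor, test: Sequence[Sample]) -> float:
    return sum(predict(features) == label for features, label in test) / len(test)


def train_until_accurate(samples: Sequence[Sample],
                         more_samples: Callable[[], Sequence[Sample]],
                         preset_accuracy: float = 0.9,
                         max_rounds: int = 5) -> Tuple[Predictor, float]:
    """Retrain with additional samples until the preset accuracy is reached."""
    current = list(samples)
    for _ in range(max_rounds):
        train, test = split_samples(current)
        model = fit_nearest_centroid(train)
        acc = accuracy(model, test)
        if acc >= preset_accuracy:
            return model, acc
        current += list(more_samples())  # enlarge the sample set and retry
    raise RuntimeError("preset accuracy not reached within max_rounds")
```

The `max_rounds` guard is a design choice of this sketch; the text itself only states that retraining continues until the preset accuracy is reached.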
  • the calculation module 206 is configured to calculate a flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period.
  • flow control here refers to traffic control. There are two methods for implementing flow control: one is to implement flow control based on source address, destination address, source port, destination port, and protocol type through the QoS module of routers and switches; the other is to use dedicated flow control equipment to implement application-based flow control.
  • Each statistical period in the recovery period can correspond to a flow control threshold.
  • the flow control threshold corresponding to each statistical cycle is dynamically adjusted.
  • the flow control threshold corresponding to the current statistical cycle can be calculated based on the IO load category in the previous statistical cycle.
  • the flow control threshold corresponding to the next statistical cycle can be calculated according to the IO load category in the current statistical cycle.
  • the flow control threshold corresponding to the first statistical period in the recovery period of this application is a preset flow control threshold, which can be preset by the administrator of the storage system based on experience. That is, the preset flow control threshold is used as the flow control threshold of the first statistical period in the recovery period; the flow control threshold corresponding to the second statistical period is calculated according to the IO load category in the first statistical period; the flow control threshold corresponding to the third statistical period is calculated according to the IO load category in the second statistical period; and so on.
  • the calculating module 206 calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period may include:
  • the flow control threshold is reduced according to the first preset amplitude, so that the recovery operation on the data of the storage node is performed with a low flow control threshold in the current statistical period. This reduces the speed of data recovery and ensures efficient access for user applications.
  • the first preset amplitude may be 1/2 of a flow control threshold corresponding to a previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1/2 of the flow control threshold corresponding to the previous statistical period, and the flow control threshold corresponding to the next statistical period is 1/2 of the flow control threshold corresponding to the current statistical period.
  • the flow control threshold is increased according to the second preset amplitude to perform a recovery operation on the data of the storage node with a high flow control threshold in the current statistical period.
  • the speed of data recovery is improved.
  • the second preset amplitude may be 1.5 times a flow control threshold corresponding to a previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1.5 times the flow control threshold corresponding to the previous statistical period, and the flow control threshold corresponding to the next statistical period is 1.5 times the flow control threshold corresponding to the current statistical period.
  • the flow control threshold corresponding to the previous statistical cycle is used as the flow control threshold corresponding to the current statistical cycle.
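The threshold rule described above (halve for a high load category, multiply by 1.5 for a low load category, keep unchanged otherwise) can be sketched in a few lines. The category strings, function names and the unit of the threshold are assumptions made for illustration; the factors 1/2 and 1.5 are the example amplitudes given in the text.

```python
from typing import List, Sequence


def next_flow_control_threshold(previous_threshold: float, load_category: str,
                                decrease_factor: float = 0.5,
                                increase_factor: float = 1.5) -> float:
    """Threshold for the current period, from the previous period's IO load category."""
    if load_category == "high":
        return previous_threshold * decrease_factor   # slow recovery, protect user IO
    if load_category == "low":
        return previous_threshold * increase_factor   # speed up recovery
    if load_category == "normal":
        return previous_threshold                     # keep the previous threshold
    raise ValueError(f"unknown load category: {load_category!r}")


def thresholds_for_periods(preset_threshold: float,
                           categories: Sequence[str]) -> List[float]:
    """Threshold per statistical period; the first period uses the preset value.

    categories[i] is the IO load category observed in period i, which
    determines the threshold of period i + 1.
    """
    thresholds = [preset_threshold]
    for category in categories:
        thresholds.append(next_flow_control_threshold(thresholds[-1], category))
    return thresholds
```

With a preset threshold of 100 and observed categories high, high, normal, low, the thresholds for five consecutive periods are 100, 50, 25, 25 and 37.5.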
  • the recovery module 207 is configured to perform a recovery operation on the data in the current statistical period of the storage node that has failed according to the storage list and the flow control threshold corresponding to the current statistical period.
  • the determining module 208 is configured to determine whether a recovery operation is performed on data in all statistical periods of the faulty storage node.
  • when the determining module 208 determines that the recovery operation has not been performed on the data in all statistical periods of the failed storage node, execution returns to the aforementioned identification module 204.
  • the synchronization module 201 periodically synchronizes information of each storage node in the distributed storage system; when the detection module 202 detects that a storage node has failed, the acquisition module 203 obtains the storage list of the failed storage node; the identification module 204 identifies the IO load category of the user application in the previous statistical period; the calculation module 206 calculates the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period; the recovery module 207 performs a recovery operation on data in the current statistical period of the failed storage node according to the storage list and the flow control threshold corresponding to the current statistical period, until the recovery operation has been performed on the data in all statistical periods of the failed storage node.
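How the identification, calculation and recovery steps might cooperate across statistical periods is sketched below. It is a simplified illustration under stated assumptions: the threshold is interpreted as the maximum number of storage-list entries recovered per statistical period, and `identify_load_category` and `recover` are hypothetical callbacks standing in for the identification and recovery modules. None of these names or interpretations come from the application.

```python
from typing import Callable, List, Sequence

# Example amplitudes from the text: 1/2 for high load, 1.5 for low load.
FACTORS = {"high": 0.5, "normal": 1.0, "low": 1.5}


def run_recovery(storage_list: Sequence[str],
                 preset_threshold: float,
                 identify_load_category: Callable[[int], str],
                 recover: Callable[[List[str]], None]) -> List[float]:
    """Recover every entry of the storage list, one statistical period at a time.

    Returns the flow control threshold used in each period.
    """
    remaining = list(storage_list)
    threshold = preset_threshold          # first period uses the preset threshold
    used_thresholds: List[float] = []
    period = 0
    while remaining:                      # repeat until all data has been recovered
        if period > 0:
            category = identify_load_category(period - 1)   # previous period's category
            threshold *= FACTORS[category]
        batch_size = max(1, int(threshold))   # always make some progress
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        recover(batch)
        used_thresholds.append(threshold)
        period += 1
    return used_thresholds
```

Flooring the batch size at one entry is a choice made in this sketch so that the loop always terminates; the text does not specify a lower bound for the threshold.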
  • This application can divide a recovery period into multiple statistical periods. In each statistical period, the flow control threshold corresponding to the current statistical period is dynamically adjusted according to the IO load category of the user application in the previous statistical period, and the data in the current statistical period is recovered according to the resulting flow control threshold.
  • when the IO load intensity of user applications in the previous statistical period is high, the flow control threshold for fault recovery in the current statistical period is reduced, so as to reduce the intensity of fault recovery and protect the business IO load.
  • when the IO load intensity of user applications in the previous statistical period is low, the flow control threshold for fault recovery in the current statistical period is increased, so as to increase the fault recovery intensity and return the distributed storage system to a healthy state as soon as possible. That is, this application can improve the data recovery efficiency of a large-scale distributed storage system and reduce the risk of data loss, while avoiding a significant impact on normal I/O business performance, and has a good flow control effect.
  • the flow control threshold corresponding to the current statistical cycle is adjusted automatically and dynamically according to the IO load category of the user application in the previous statistical cycle, without manual adjustment by the administrator, which reduces the administrator's workload and avoids the problem of inaccurate adjustment caused by subjective factors; the threshold can also be adjusted dynamically as the distributed storage system and its hardware facilities change, which gives high reliability.
  • the above integrated unit implemented in the form of a software functional module may be stored in a non-volatile readable storage medium.
  • the above software function module is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a dual-screen device, or a network device) or a processor to execute part of the method described in the embodiments of this application.
  • FIG. 3 is a schematic diagram of an electronic device provided in Embodiment 5 of the present application.
  • the electronic device 3 includes a memory 31, at least one processor 32, computer-readable instructions 33 stored in the memory 31 and executable on the at least one processor 32, and at least one communication bus 34.
  • the computer-readable instructions 33 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 31 and executed by the at least one processor 32 to complete this application.
  • the one or more modules / units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 33 in the electronic device 3.
  • the electronic device 3 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server.
  • the schematic diagram 3 is only an example of the electronic device 3, and does not constitute a limitation on the electronic device 3. It may include more or less components than shown in the figure, or some components may be combined or different
  • the electronic device 3 may further include an input / output device, a network access device, a bus, and the like.
  • the at least one processor 32 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the processor 32 may be a microprocessor or the processor 32 may be any conventional processor.
  • the processor 32 is the control center of the electronic device 3, and uses various interfaces and lines to connect the various parts of the entire electronic device 3.
  • the memory 31 may be configured to store the computer-readable instructions 33 and/or modules/units, and the processor 32 implements various functions of the electronic device 3 by running or executing the computer-readable instructions and/or modules/units stored in the memory 31 and calling the data stored in the memory 31.
  • the memory 31 may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, application programs required for at least one function (such as a sound playback function, an image playback function, etc.), and so on; the storage data area may store data (such as audio data, a phone book, etc.) created according to the use of the electronic device 3.
  • the memory 31 may include a high-speed random access memory, and may also include a non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one disk storage device, a flash memory device, or other volatile solid-state storage device.
  • when the integrated module/unit of the electronic device 3 is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a non-volatile readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments may also be completed by computer-readable instructions instructing related hardware.
  • the computer-readable instructions can be stored in a non-volatile readable storage medium; when the computer-readable instructions are executed by a processor, the steps of the foregoing method embodiments can be implemented.
  • the computer-readable instructions may be in a source code form, an object code form, an executable file, or some intermediate form.
  • the non-volatile readable medium may include: any entity or device capable of carrying the computer-readable instruction code, a recording medium, a U disk, a mobile hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), electric carrier signals, telecommunication signals, and software distribution media.
  • the content contained in the non-volatile readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the relevant jurisdictions. For example, in some jurisdictions, according to legislation and patent practice, non-volatile readable media do not include electrical carrier signals and telecommunication signals.
  • each functional unit in each embodiment of the present application may be integrated in the same processing unit, or each unit may exist separately physically, or two or more units may be integrated in the same unit.
  • the integrated unit can be implemented in the form of hardware, or in the form of hardware plus software functional modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Provided is an adaptive data recovery flow control method, comprising: periodically synchronizing information of storage nodes in a distributed storage system (S11); when it is detected that a storage node fails (S12), acquiring a storage list of the failed storage node (S13); identifying an IO load category of a user application in a previous statistical period (S14); calculating, according to the IO load category in the previous statistical period, a flow control threshold value corresponding to the current statistical period (S15); according to the storage list and the flow control threshold value corresponding to the current statistical period, executing a recovery operation on data of the failed storage node in the current statistical period (S16); and determining whether the recovery operation is performed on the data of the failed storage node in all the statistical periods, and if so, ending the process (S17). Further provided are an adaptive data recovery flow control apparatus, an electronic device and a storage medium. According to the method, the obvious impact on a normal input and output service performance can be avoided, while the data recovery efficiency of the large-scale distributed storage system is improved and the data loss risk is reduced, and the method also has a good flow control effect.

Description

Adaptive data recovery flow control method, device, electronic equipment and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on June 04, 2018, with application number 201810565004.2 and the invention title "Adaptive Data Recovery Flow Control Method, Device, Electronic Equipment and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of computer technology, and in particular to an adaptive data recovery flow control method, device, electronic device, and storage medium.
Background
With the advent of the era of big data and cloud computing, the volume of data in every field is growing rapidly. This ever-growing mass of data depends on large-scale distributed storage systems for reliable storage and efficient access. However, the larger the storage system, the higher the probability of failure. To cope with failures that may occur at any time and to ensure the reliability of data storage, a distributed storage system needs data redundancy. A common data redundancy strategy is to store multiple copies of the data on different physical nodes; when some copies are damaged, the damaged copies can be repaired from the intact ones.
In addition, when the capacity of a distributed storage system is expanded, replica migration of a certain scale is required to keep the data distribution balanced, and this data migration is also regarded as a special kind of data repair.
On the one hand, data repair efficiency needs to be improved to reduce the risk of data loss; on the other hand, the storage system needs to ensure efficient access for user applications and keep data repair from affecting the quality of service of normal business. How to balance the allocation of tasks between data repair and normal data input/output services, so that data repair efficiency is improved without a significant impact on normal data input/output service performance and the business system can continuously and stably obtain high random Input/Output Operations Per Second (IOPS) and throughput, is critical to improving the performance of distributed storage systems.
Summary of the Invention
In view of the above, it is necessary to propose an adaptive data recovery flow control method, device, electronic device and storage medium that can improve the data recovery efficiency of a large-scale distributed storage system and reduce the risk of data loss while ensuring that normal input/output service performance is not affected, with a good flow control effect.
A first aspect of the present application provides an adaptive data recovery flow control method, the method including:
a) periodically synchronizing the information of each storage node in the distributed storage system;
b) detecting whether any storage node has failed;
c) when it is detected that a storage node has failed, obtaining a storage list of the failed storage node;
d) identifying the IO load category of the user application in the previous statistical period;
e) calculating a flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period;
f) performing a recovery operation on the data in the current statistical period of the failed storage node according to the storage list and the flow control threshold corresponding to the current statistical period;
repeating steps d) to f) above until the recovery operation has been performed on the data in all statistical periods of the failed storage node.
A second aspect of the present application provides an adaptive data recovery flow control device, the device including:
a synchronization module, configured to periodically synchronize the information of each storage node in the distributed storage system;
a detection module, configured to detect whether any storage node has failed;
an acquisition module, configured to obtain a storage list of the failed storage node when the detection module detects that a storage node has failed;
an identification module, configured to identify the IO load category of the user application in the previous statistical period;
a calculation module, configured to calculate a flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period;
a recovery module, configured to perform a recovery operation on the data in the current statistical period of the failed storage node according to the storage list and the flow control threshold corresponding to the current statistical period.
A third aspect of the present application provides an electronic device. The electronic device includes a processor and a memory, where the memory is configured to store at least one instruction, and the processor is configured to execute the at least one instruction to implement the following steps:
a) periodically synchronizing the information of each storage node in the distributed storage system;
b) detecting whether any storage node has failed;
c) when it is detected that a storage node has failed, obtaining a storage list of the failed storage node;
d) identifying the IO load category of the user application in the previous statistical period;
e) calculating a flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period;
f) performing a recovery operation on the data in the current statistical period of the failed storage node according to the storage list and the flow control threshold corresponding to the current statistical period;
repeating steps d) to f) above until the recovery operation has been performed on the data in all statistical periods of the failed storage node.
A fourth aspect of the present application provides a non-volatile readable storage medium. At least one instruction is stored on the non-volatile readable storage medium, and when the at least one instruction is executed by a processor, the following steps are implemented:
a) periodically synchronizing the information of each storage node in the distributed storage system;
b) detecting whether any storage node has failed;
c) when it is detected that a storage node has failed, obtaining a storage list of the failed storage node;
d) identifying the IO load category of the user application in the previous statistical period;
e) calculating a flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period;
f) performing a recovery operation on the data in the current statistical period of the failed storage node according to the storage list and the flow control threshold corresponding to the current statistical period;
repeating steps d) to f) above until the recovery operation has been performed on the data in all statistical periods of the failed storage node.
The adaptive data recovery flow control method, device, electronic device and storage medium described in the present application divide a recovery period into multiple statistical periods. In each statistical period, the flow control threshold corresponding to the current statistical period is dynamically adjusted according to the IO load category of the user application in the previous statistical period, and the data in the current statistical period is recovered according to the resulting flow control threshold. When the IO load intensity of user applications in the previous statistical period is high, the flow control threshold for fault recovery in the current statistical period is reduced, so as to reduce the intensity of fault recovery and protect the business IO load; when the IO load intensity of user applications in the previous statistical period is low, the flow control threshold for fault recovery in the current statistical period is increased, so as to increase the fault recovery intensity and return the distributed storage system to a healthy state as soon as possible. That is, this application improves the data repair efficiency of a large-scale distributed storage system and reduces the risk of data loss, while avoiding a significant impact on normal input/output service performance, and has a good flow control effect.
Brief description of the drawings
FIG. 1 is a flowchart of an adaptive data recovery flow control method provided in Embodiment 1 of the present application.
FIG. 2 is a functional block diagram of an adaptive data recovery flow control device provided in Embodiment 2 of the present application.
FIG. 3 is a schematic diagram of an electronic device provided in Embodiment 3 of the present application.
The following detailed description further explains the present application with reference to the above drawings.
Detailed description
In order to understand the above objectives, features, and advantages of the present application more clearly, the present application is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, where there is no conflict, the embodiments of the present application and the features in the embodiments can be combined with each other.
Many specific details are set forth in the following description to facilitate a full understanding of the present application. The described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by a person skilled in the technical field to which this application belongs. The terms used in the specification of the present application are only for the purpose of describing specific embodiments and are not intended to limit the present application.
The adaptive data recovery flow control method in the embodiments of the present application is applied to one or more electronic devices. The method can also be applied to a hardware environment composed of an electronic device and a server connected to the electronic device through a network. The network includes, but is not limited to, a wide area network, a metropolitan area network, or a local area network. The adaptive data recovery flow control method in the embodiments of the present application may be executed by a server, by an electronic device, or jointly by a server and an electronic device.
For an electronic device that needs to perform adaptive data recovery flow control, the adaptive data recovery flow control function provided by the method of the present application can be integrated directly on the electronic device, or a client for implementing the method of the present application can be installed. As another example, the method provided in this application can also run on a device such as a server in the form of a Software Development Kit (SDK), with an interface to the adaptive data recovery flow control function provided in the form of the SDK; an electronic device or other device can then implement adaptive flow control of data recovery through the provided interface.
Embodiment 1
FIG. 1 is a flowchart of an adaptive data recovery flow control method provided in Embodiment 1 of the present application. Depending on requirements, the execution order in this flowchart can be changed, and some steps can be omitted.
S11. Periodically synchronize the information of each storage node in the distributed storage system.
In a preferred embodiment of the present application, the distributed storage system (hereinafter referred to as the storage system) uses cluster storage to store data in a distributed manner.
Distributed storage is a data storage technology that uses, over the network, the spare disk space on each storage system in the cluster and integrates these scattered storage resources into one virtual storage device, so that data is stored in a distributed manner across the cluster.
Therefore, each storage node described in this application is a sub storage system in the cluster. For example, the storage node may be a storage server, a computer, or a storage device.
In a preferred embodiment of the present application, synchronizing the information of each storage node in the distributed storage system may include: 1) a storage center in the storage system performs the information synchronization of each storage node; or 2) in a decentralized approach, any storage node in the storage system initiates the information synchronization of each storage node.
The synchronized information of each storage node may include, but is not limited to: CPU, memory, free disk space, and the list of stored files.
In a preferred embodiment of the present application, the stored file list records information such as the name, size, and location of the data stored in each storage node.
S12. Detect whether any storage node has failed.
In a preferred embodiment of the present application, a storage node failure may be that one or more storage nodes in the storage system cannot start, lose power, or lose network connectivity, or that a disk in one or more storage nodes has failed. Accordingly, detecting whether any storage node has failed includes: detecting whether one or more storage nodes in the storage system have failed to start, lost power, or lost network connectivity, or detecting whether a disk in one or more storage nodes has failed.
When any storage node in the storage system fails to start, loses power, or loses network connectivity, the failed storage node is disconnected from the other storage nodes and/or the storage center, so the other storage nodes and/or the storage center can detect that a storage node has failed.
When a disk in any storage node in the storage system fails, the synchronization information that the failed storage node sends to the other storage nodes and/or the storage center contains the failure information of the disk, so the other storage nodes and/or the storage center can detect that a storage node has failed.
When a storage node failure is detected, step S13 is performed; when no storage node failure is detected, step S12 continues to be performed.
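The two detection paths of step S12 (a node that has gone silent, and a node that reports a disk fault in its synchronization information) can be sketched as follows. This is only an illustrative sketch: the `NodeReport` record, its field names, the `detect_failed_nodes` function, and the 3-second timeout are assumptions introduced here and are not specified in the present application.

```python
from dataclasses import dataclass, field


@dataclass
class NodeReport:
    """Latest synchronized information for one storage node (hypothetical layout)."""
    last_seen: float                 # time of the last successful sync, in seconds
    disk_failed: bool = False        # disk fault flag carried in the sync information
    files: list = field(default_factory=list)  # (name, size, location) entries


def detect_failed_nodes(reports, now, timeout=3.0):
    """Return the sorted IDs of nodes that appear to have failed.

    A node counts as failed if it has not synchronized within `timeout`
    seconds (cannot start, power loss, network loss) or if its latest
    sync information reports a disk fault. The timeout is an assumed value.
    """
    failed = []
    for node_id, report in reports.items():
        unreachable = (now - report.last_seen) > timeout
        if unreachable or report.disk_failed:
            failed.append(node_id)
    return sorted(failed)


reports = {
    "node-a": NodeReport(last_seen=100.0),
    "node-b": NodeReport(last_seen=90.0),                     # silent for 10.5 s
    "node-c": NodeReport(last_seen=100.0, disk_failed=True),  # reported a disk fault
}
failed = detect_failed_nodes(reports, now=100.5)
print(failed)
```

In this sketch the same function could be run by a storage center or by any peer node, which corresponds to the centralized and decentralized synchronization options of step S11.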
S13. Obtain the storage list of the failed storage node.
In a preferred embodiment of the present application, obtaining the storage list of the failed storage node includes obtaining information such as the name, size, and location of the data stored in the failed storage node.
S14. Identify the IO load category of the user application in the previous statistical period.
The entire process from the failure of a storage node to the completion of the recovery of its data is called a recovery period. A recovery period may include multiple statistical periods, and a statistical period may be a preset length of time; for example, one statistical period may be set to 1 second.
In a preferred embodiment of the present application, the IO load categories include: a high load category, a normal load category, and a low load category.
Specifically, identifying the IO load category of the user application in the previous statistical period may include:
(1) Obtain the data block size of each IO of the user application in the previous statistical period, and calculate the average data block size of the IOs in the previous statistical period.
The average data block size of the IOs in the previous statistical period may be calculated using the arithmetic mean algorithm, the geometric mean algorithm, or the root mean square algorithm.
The formula of the arithmetic mean algorithm is:
$$\bar{S}_{\mathrm{arith}} = \frac{1}{N}\sum_{i=1}^{N} S_i$$
The formula of the geometric mean algorithm is:
$$\bar{S}_{\mathrm{geo}} = \sqrt[N]{\prod_{i=1}^{N} S_i}$$
The formula of the root mean square algorithm is:
$$\bar{S}_{\mathrm{rms}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} S_i^{2}}$$
In each formula, N is the number of IO data blocks and S_i is the data block size of the i-th IO.
For example, suppose that the user application is detected to have performed ten IOs in the previous statistical period, with data block sizes of 2M, 1M, 3M, 0.5M, 10M, 4M, 0.1M, 1.2M, 5M, and 8M.
Using the arithmetic mean algorithm, the average data block size of the IOs in the previous statistical period is:
$$\frac{2+1+3+0.5+10+4+0.1+1.2+5+8}{10} = \frac{34.8}{10} = 3.48\text{M}$$
Using the geometric mean algorithm, the average data block size of the IOs in the previous statistical period is:
$$\sqrt[10]{2\times 1\times 3\times 0.5\times 10\times 4\times 0.1\times 1.2\times 5\times 8} = \sqrt[10]{576} \approx 1.89\text{M}$$
Using the root mean square algorithm, the average data block size of the IOs in the previous statistical period is:
$$\sqrt{\frac{2^2+1^2+3^2+0.5^2+10^2+4^2+0.1^2+1.2^2+5^2+8^2}{10}} = \sqrt{22.07} \approx 4.70\text{M}$$
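The three averaging algorithms can be checked numerically against the ten example block sizes. The following is a minimal sketch using only the standard definitions of the arithmetic mean, geometric mean, and root mean square; the function and variable names are chosen here for illustration.

```python
import math


def arithmetic_mean(values):
    """Sum of the values divided by their count."""
    return sum(values) / len(values)


def geometric_mean(values):
    """N-th root of the product, computed via logarithms (values must be positive)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def root_mean_square(values):
    """Square root of the mean of the squared values."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Data block sizes (in M) of the ten IOs in the example.
block_sizes_m = [2, 1, 3, 0.5, 10, 4, 0.1, 1.2, 5, 8]

means = {
    "arithmetic": arithmetic_mean(block_sizes_m),
    "geometric": geometric_mean(block_sizes_m),
    "root mean square": root_mean_square(block_sizes_m),
}
for name, value in means.items():
    print(f"{name}: {value:.2f}M")
```

The three results differ noticeably on the same data because the root mean square weights large blocks more heavily while the geometric mean is pulled down by small blocks, so the choice of algorithm affects the subsequent load classification.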
(2) Obtain the transmission delay of each data block in the previous statistical period, and calculate the average data block delay of the IOs in the previous statistical period.
The transmission delay (delay for short) is the time a node needs to push a data block from the node onto the transmission medium when sending data, that is, the total time a sending station needs from starting to send a data frame until the frame has been completely sent, or the total time a receiving station needs from starting to receive a data frame until the frame has been completely received.
In a preferred embodiment of the present application, the transmission delay of a data block can be obtained from a load measurement tool or a performance monitoring tool installed on each storage node.
As described above, the average data block delay of the IOs in the previous statistical period may also be calculated using the arithmetic mean algorithm, the geometric mean algorithm, or the root mean square algorithm. Suppose that the transmission delays of the ten IOs detected in the previous statistical period are 1s, 0.8s, 1.5s, 0.4s, 5s, 2s, 0.02s, 0.6s, 3s, and 4.5s; then the average data block delay of the IOs in the previous statistical period, calculated with the arithmetic mean algorithm, is:
(1s+0.8s+1.5s+0.4s+5s+2s+0.02s+0.6s+3s+4.5s)/10 = 1.882s ≈ 1.88s.
It should be understood that the same algorithm is used for both quantities: if the average data block size of the IOs in the previous statistical period is calculated with the arithmetic mean algorithm, the average data block delay of the IOs in the previous statistical period is also calculated with the arithmetic mean algorithm; if the average data block size is calculated with the geometric mean algorithm, the average data block delay is also calculated with the geometric mean algorithm; and if the average data block size is calculated with the root mean square algorithm, the average data block delay is also calculated with the root mean square algorithm.
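One way to guarantee that the block size and the delay are always averaged with the same algorithm is to select the algorithm once and apply it to both series. The sketch below illustrates this under assumed names (`MEAN_FUNCTIONS`, `period_averages`); it uses the ten block sizes and the ten transmission delays listed in the examples.

```python
import math

# The three averaging algorithms named in the text, keyed by an assumed label.
MEAN_FUNCTIONS = {
    "arithmetic": lambda v: sum(v) / len(v),
    "geometric": lambda v: math.exp(sum(math.log(x) for x in v) / len(v)),
    "rms": lambda v: math.sqrt(sum(x * x for x in v) / len(v)),
}


def period_averages(block_sizes, delays, method="arithmetic"):
    """Average block size and delay of one statistical period.

    A single `method` is applied to both series, so the two averages
    can never be computed with different algorithms.
    """
    if len(block_sizes) != len(delays):
        raise ValueError("each IO needs exactly one block size and one delay")
    mean = MEAN_FUNCTIONS[method]
    return mean(block_sizes), mean(delays)


sizes_m = [2, 1, 3, 0.5, 10, 4, 0.1, 1.2, 5, 8]
delays_s = [1, 0.8, 1.5, 0.4, 5, 2, 0.02, 0.6, 3, 4.5]

avg_size, avg_delay = period_averages(sizes_m, delays_s, "arithmetic")
print(f"average block size: {avg_size:.2f}M, average delay: {avg_delay:.2f}s")
```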
(3) Obtain the preset reference value of the IO data block size and the corresponding reference value of the data block delay.
In a preferred embodiment of the present application, the reference value of the IO data block size and the corresponding reference value of the data block delay may be preset by an administrator of the storage system based on experience. For example, if experience shows that a 4K data block has the smallest transmission delay, reaching 50ms under ideal conditions, then the reference value of the IO data block size may be set to 4K and the corresponding reference value of the data block delay may be set to 50ms.
(4) Calculate the IO load intensity in the previous statistical period from the average data block size and the average data block delay of the IOs in the previous statistical period, the reference value of the data block size, and the corresponding reference value of the data block delay.
For example, assuming that the average data block size of the IOs in the previous statistical period is X, the average data block delay is Y, the reference value of the data block size is M, and the corresponding reference value of the data block delay is N, the IO load intensity in the previous statistical period is calculated by the following formula:
Figure PCTCN2018108128-appb-000008
(5) According to the IO load intensity in the previous statistical period, determine the IO load category in the previous statistical period using a pre-trained load classification model.
Preferably, the load classification model includes, but is not limited to, a support vector machine (SVM) model. The average data block size of the IOs in the previous statistical period, the average data block delay of the IOs in the previous statistical period, and the IO load intensity in the previous statistical period are used as the inputs of the load classification model, which outputs the IO load category in the previous statistical period.
In a preferred embodiment of the present application, the training process of the load classification model includes:
1) Obtain IO load data of positive samples and IO load data of negative samples, and label the IO load data of the positive samples with load categories, so that the IO load data of the positive samples carries IO load category labels.
For example, select 500 pieces of IO load data for each of the high load, normal load, and low load categories, and label each piece with its category: "1" may be used as the label for high load IO data, "2" as the label for normal load IO data, and "3" as the label for low load IO data.
2) Randomly divide the IO load data of the positive samples and the IO load data of the negative samples into a training set of a first preset proportion and a validation set of a second preset proportion, train the load classification model with the training set, and verify the accuracy of the trained load classification model with the validation set.
First, the training samples of the different load categories are distributed into different folders. For example, the training samples of the high load category are placed in a first folder, those of the normal load category in a second folder, and those of the low load category in a third folder. Then a first preset proportion (for example, 70%) of the samples is extracted from each folder to form the overall training set used to train the load classification model, and the remaining second preset proportion (for example, 30%) is taken from each folder to form the overall test set used to verify the accuracy of the trained load classification model.
3) If the accuracy is greater than or equal to a preset accuracy, training ends, and the trained load classification model is used as the classifier to identify the IO load category in the current statistical period; if the accuracy is less than the preset accuracy, the number of positive samples and the number of negative samples are increased and the load classification model is retrained until the accuracy is greater than or equal to the preset accuracy.
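The per-category 70%/30% split and the accuracy check in the training steps above can be sketched as follows. This sketch does not train an SVM: the sample data is a placeholder, the "classifier" is a trivial stand-in, and the names `stratified_split`, `accuracy`, and the preset accuracy of 0.9 are assumptions introduced for illustration only.

```python
import random


def stratified_split(samples_by_label, train_ratio=0.7, seed=0):
    """Split each category separately so both sets keep the category balance."""
    rng = random.Random(seed)
    train, validation = [], []
    for label, samples in samples_by_label.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_ratio)
        train += [(sample, label) for sample in shuffled[:cut]]
        validation += [(sample, label) for sample in shuffled[cut:]]
    return train, validation


def accuracy(predict, labelled_samples):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for sample, label in labelled_samples if predict(sample) == label)
    return correct / len(labelled_samples)


# 500 placeholder samples per category; labels 1, 2, 3 stand for the
# high, normal, and low load categories as in the example above.
samples_by_label = {label: [(label, i) for i in range(500)] for label in (1, 2, 3)}
train, validation = stratified_split(samples_by_label)

# Stand-in for a trained model: reads the label stored in the placeholder sample.
stand_in_predict = lambda sample: sample[0]

PRESET_ACCURACY = 0.9  # assumed value
validation_accuracy = accuracy(stand_in_predict, validation)
training_done = validation_accuracy >= PRESET_ACCURACY
print(len(train), len(validation), validation_accuracy, training_done)
```

In a real training loop, a model fitted on `train` would replace `stand_in_predict`, and a result of `training_done == False` would trigger collecting more samples and retraining, as step 3) describes.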
S15. Calculate the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period.
Flow control refers to traffic control. It can be implemented in two ways: one is traffic control based on source address, destination address, source port, destination port, and protocol type, implemented through the QoS modules of routers and switches; the other is application-layer flow control implemented through dedicated flow control equipment.
Each statistical period in the recovery period may correspond to one flow control threshold. The flow control threshold corresponding to each statistical period is adjusted dynamically: the flow control threshold corresponding to the current statistical period can be calculated according to the IO load category in the previous statistical period, and the flow control threshold corresponding to the next statistical period can be calculated according to the IO load category in the current statistical period.
It should be noted that the flow control threshold corresponding to the first statistical period in the recovery period of the present application is a preset flow control threshold, which may be set in advance by the administrator of the storage system based on experience. That is, a preset flow control threshold is used as the flow control threshold of the first statistical period in the recovery period; the flow control threshold corresponding to the second statistical period is calculated according to the IO load category in the first statistical period; the flow control threshold corresponding to the third statistical period is calculated according to the IO load category in the second statistical period; and so on.
Specifically, calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period may include:
1) When the IO load category in the previous statistical period is the high load category, lower the flow control threshold corresponding to the previous statistical period by a first preset amplitude to obtain the flow control threshold corresponding to the current statistical period.
When the IO load in the previous statistical period is high, the flow control threshold is lowered by the first preset amplitude, so that the recovery operation on the data of the storage node is performed under a low flow control threshold in the current statistical period; slowing data recovery in this way keeps user application access efficient.
In a preferred embodiment of the present application, the first preset amplitude may be 1/2 of the flow control threshold corresponding to the previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1/2 of the flow control threshold corresponding to the previous statistical period, and the flow control threshold corresponding to the next statistical period is 1/2 of the flow control threshold corresponding to the current statistical period.
2) When the IO load category in the previous statistical period is the low load category, raise the flow control threshold corresponding to the previous statistical period by a second preset amplitude to obtain the flow control threshold corresponding to the current statistical period.
When the IO load in the previous statistical period is low, the flow control threshold is raised by the second preset amplitude, so that the recovery operation on the data of the storage node is performed under a high flow control threshold in the current statistical period; this speeds up data recovery while still guaranteeing the access quality of the user application.
In a preferred embodiment of the present application, the second preset amplitude may be 1.5 times the flow control threshold corresponding to the previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1.5 times the flow control threshold corresponding to the previous statistical period, and the flow control threshold corresponding to the next statistical period is 1.5 times the flow control threshold corresponding to the current statistical period.
3) When the IO load category in the previous statistical period is the normal load category, the flow control threshold corresponding to the previous statistical period is used as the flow control threshold corresponding to the current statistical period.
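The three cases of step S15 amount to a small update rule applied once per statistical period. The following sketch uses the factors 1/2 and 1.5 from the preferred embodiment; the function names, the category labels, and the initial threshold of 100.0 (in arbitrary units) are assumptions introduced for illustration.

```python
HIGH, NORMAL, LOW = "high", "normal", "low"


def next_threshold(previous_threshold, previous_category,
                   decrease_factor=0.5, increase_factor=1.5):
    """Flow control threshold for the current period, from the previous period."""
    if previous_category == HIGH:
        return previous_threshold * decrease_factor   # slow recovery down
    if previous_category == LOW:
        return previous_threshold * increase_factor   # speed recovery up
    if previous_category == NORMAL:
        return previous_threshold                     # keep the threshold
    raise ValueError(f"unknown IO load category: {previous_category!r}")


def threshold_schedule(initial_threshold, categories):
    """Thresholds for successive periods, starting from the preset value.

    categories[k] is the IO load category observed in period k and
    determines the threshold of period k + 1.
    """
    thresholds = [initial_threshold]
    for category in categories:
        thresholds.append(next_threshold(thresholds[-1], category))
    return thresholds


schedule = threshold_schedule(100.0, [HIGH, HIGH, NORMAL, LOW])
print(schedule)
```

Because the rule is multiplicative, a run of high load periods backs the recovery traffic off geometrically, while each low load period recovers only part of the reduction; with the factors 1/2 and 1.5, recovery yields to user IO faster than it reclaims bandwidth.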
S16. According to the storage list and the flow control threshold corresponding to the current statistical period, perform the recovery operation on the data of the failed storage node in the current statistical period.
S17. Determine whether the recovery operation has been performed on the data of the failed storage node for all statistical periods.
When it is determined that the recovery operation has been performed on the data of the failed storage node for all statistical periods, the process ends; when it is determined that the recovery operation has not yet been performed on the data of the failed storage node for all statistical periods, the process returns to step S14.
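The loop formed by steps S14 to S17 can be sketched as a small simulation. Here the flow control threshold is interpreted as the maximum amount of data recovered per statistical period, which is an assumption; the function `run_recovery`, the callback parameters, and the sample storage list are likewise hypothetical and only illustrate the control flow.

```python
from collections import deque


def run_recovery(storage_list, initial_threshold, classify_period, adjust):
    """Recover every item of the storage list, one statistical period at a time.

    storage_list: (name, size) entries taken from the failed node (S13).
    classify_period(k): IO load category observed in period k (stand-in for S14).
    adjust(threshold, category): threshold update rule (S15).
    Returns one (period, threshold, fully_recovered_names) tuple per period.
    """
    pending = deque(storage_list)
    threshold = initial_threshold
    history = []
    period = 0
    while pending:                                   # S17: anything left to recover?
        if period > 0:
            threshold = adjust(threshold, classify_period(period - 1))
        if threshold <= 0:
            raise ValueError("flow control threshold must stay positive")
        budget = threshold                           # S16: recover within the budget
        recovered = []
        while pending and budget > 0:
            name, size = pending.popleft()
            take = min(size, budget)
            budget -= take
            if take < size:
                pending.appendleft((name, size - take))  # finish in a later period
            else:
                recovered.append(name)
        history.append((period, threshold, recovered))
        period += 1
    return history


def halve_or_grow(threshold, category):
    """Update rule with the factors of the preferred embodiment."""
    return threshold * {"high": 0.5, "low": 1.5, "normal": 1.0}[category]


observed = ["high", "low", "normal"]                 # assumed per-period categories
history = run_recovery(
    storage_list=[("a", 60), ("b", 60), ("c", 30)],
    initial_threshold=100,
    classify_period=lambda k: observed[k],
    adjust=halve_or_grow,
)
print(history)
```

In this run the first period uses the preset threshold, and the high load observed in that period halves the threshold for the second period, which still suffices to finish the remaining data.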
In summary, the adaptive data recovery flow control method described in the present application periodically synchronizes the information of each storage node in the distributed storage system; when a storage node failure is detected, obtains the storage list of the failed storage node; identifies the IO load category of the user application in the previous statistical period; calculates the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period; and, according to the storage list and the flow control threshold corresponding to the current statistical period, performs the recovery operation on the data of the failed storage node in the current statistical period, until the recovery operation has been performed on the data of the failed storage node for all statistical periods. The present application divides a recovery period into multiple statistical periods and, in each statistical period, dynamically adjusts the flow control threshold for the current statistical period according to the IO load category of the user application in the previous statistical period, so that the data in each statistical period is recovered under its own flow control threshold. When the IO load intensity of the user application in the previous statistical period is high, the flow control threshold for fault recovery in the current statistical period is lowered, which reduces the recovery intensity and protects the service IO load; when the IO load intensity of the user application in the previous statistical period is low, the flow control threshold for fault recovery in the current statistical period is raised, which increases the recovery intensity and returns the distributed storage system to a healthy state as soon as possible. That is, the present application improves the data repair efficiency of a large-scale distributed storage system and reduces the risk of data loss while avoiding a significant impact on the performance of normal input/output services, and thus achieves a good flow control effect.
Second, the flow control threshold for the current statistical period is adjusted automatically and dynamically according to the IO load category of the user application in the previous statistical period, without manual tuning by the administrator. This reduces the administrator's workload, avoids inaccurate adjustment caused by the administrator's subjective judgment, and allows the threshold to adapt dynamically as the distributed storage system and its hardware change, giving high reliability.
The above are only specific implementations of the present application, but the scope of protection of the present application is not limited thereto. Those of ordinary skill in the art may make improvements without departing from the inventive concept of the present application, and all such improvements fall within the scope of protection of the present application.
The functional modules and the hardware structure of the electronic device that implements the above adaptive data recovery flow control method are described below with reference to FIGS. 2 and 3.
Embodiment 2
FIG. 2 is a functional module diagram of a preferred embodiment of the adaptive data recovery flow control apparatus of the present application.
In some embodiments, the adaptive data recovery flow control apparatus 20 (hereinafter, "data recovery flow control apparatus 20") runs in an electronic device. The data recovery flow control apparatus 20 may include multiple functional modules composed of program code segments. The program code of each program segment in the data recovery flow control apparatus 20 may be stored in a memory and executed by at least one processor to perform the adaptive data recovery flow control method (see FIG. 1 and its description for details).
In this embodiment, the data recovery flow control apparatus 20 of the electronic device may be divided into multiple functional modules according to the functions it performs. The functional modules may include: a synchronization module 201, a detection module 202, an acquisition module 203, an identification module 204, a training module 205, a calculation module 206, a recovery module 207, and a judgment module 208. A module as referred to in the present application is a series of computer-readable instruction segments that can be executed by at least one processor, performs a fixed function, and is stored in a memory. In some embodiments, the functions of the modules are described in detail in the subsequent embodiments.
The synchronization module 201 is configured to periodically synchronize the information of each storage node in the distributed storage system.
In a preferred embodiment of the present application, the distributed storage system (hereinafter, the storage system) stores data in a distributed manner using cluster storage.
Distributed storage is a data storage technology that, over a network, uses the free disk space on each storage system in the cluster and pools these scattered storage resources into one virtual storage device, so that data is stored dispersed across the cluster.
Accordingly, each storage node described in the present application is a sub storage system in the cluster. For example, a storage node may be a storage server, a computer, or a storage device.
In a preferred embodiment of the present application, the synchronization module 201 synchronizing the information of each storage node in the distributed storage system may include: 1) a storage center in the storage system performing the information synchronization of each storage node; or 2) using a decentralized approach in which any storage node in the storage system initiates the information synchronization of each storage node.
The synchronized information of each storage node may include, but is not limited to: CPU, memory, free disk space, and the stored-file list.
In a preferred embodiment of the present application, the stored-file list records information such as the name, size, and location of the data stored in each storage node.
The detection module 202 is configured to detect whether any storage node has failed.
In a preferred embodiment of the present application, a storage node failure may be that one or more storage nodes in the storage system cannot start, lose power, or lose network connectivity, or that a disk in one or more storage nodes has failed. Accordingly, the detection module 202 detecting whether any storage node has failed includes: detecting whether one or more storage nodes in the storage system have failed to start, lost power, or lost network connectivity, or detecting whether a disk in one or more storage nodes has failed.
When any storage node in the storage system fails to start, loses power, or loses network connectivity, the failed storage node is disconnected from the other storage nodes and/or the storage center, so the other storage nodes and/or the storage center can detect that a storage node has failed.
When a disk in any storage node in the storage system fails, the synchronization information that the failed storage node sends to the other storage nodes and/or the storage center contains the failure information of the disk, so the other storage nodes and/or the storage center can detect that a storage node has failed.
The acquisition module 203 is configured to obtain the storage list of the failed storage node when the detection module 202 detects that a storage node has failed.
In a preferred embodiment of the present application, obtaining the storage list of the failed storage node includes obtaining information such as the name, size, and location of the data stored in the failed storage node.
The identification module 204 is configured to identify the IO load category of the user application in the previous statistical period.
The entire process from the failure of a storage node to the completion of the recovery of its data is called a recovery period. A recovery period may include multiple statistical periods, and a statistical period may be a preset length of time; for example, one statistical period may be set to 1 second.
In a preferred embodiment of the present application, the IO load categories include: a high load category, a normal load category, and a low load category.
Specifically, the identification module 204 identifying the IO load category of the user application in the previous statistical period may include:
(1) Obtain the data block size of each IO of the user application in the previous statistical period, and calculate the average data block size of the IOs in the previous statistical period.
The average data block size of the IOs in the previous statistical period may be calculated using the arithmetic mean algorithm, the geometric mean algorithm, or the root mean square algorithm.
The formula of the arithmetic mean algorithm is:
$$\bar{S} = \frac{1}{N}\sum_{i=1}^{N} S_i$$
The formula of the geometric mean algorithm is:
$$\bar{S} = \sqrt[N]{\prod_{i=1}^{N} S_i}$$
The formula of the root mean square algorithm is:
$$\bar{S} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} S_i^{2}}$$
In each formula, N is the number of IO data blocks and S_i is the data block size of the i-th IO.
For example, suppose it is detected that the user application performed ten IOs in the previous statistical period, with data block sizes of 2M, 1M, 3M, 0.5M, 10M, 4M, 0.1M, 1.2M, 5M, and 8M.
Using the arithmetic mean algorithm, the average data block size of the IO in the previous statistical period is:
$$\frac{2+1+3+0.5+10+4+0.1+1.2+5+8}{10} = 3.48\,\text{M}$$
Using the geometric mean algorithm, the average data block size of the IO in the previous statistical period is:
$$\sqrt[10]{2\times1\times3\times0.5\times10\times4\times0.1\times1.2\times5\times8} = \sqrt[10]{576} \approx 1.89\,\text{M}$$
Using the root mean square algorithm, the average data block size of the IO in the previous statistical period is:
$$\sqrt{\frac{2^2+1^2+3^2+0.5^2+10^2+4^2+0.1^2+1.2^2+5^2+8^2}{10}} = \sqrt{22.07} \approx 4.70\,\text{M}$$
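The three averaging methods named above can be sketched in Python as follows. This is only an illustration using the standard library; the function names and the variable `block_sizes_m` are assumptions introduced here and do not appear in the application.

```python
import math

def arithmetic_mean(values):
    # Sum of the values divided by their count.
    return sum(values) / len(values)

def geometric_mean(values):
    # N-th root of the product, computed in log space.
    # Requires every value to be strictly positive.
    return math.exp(sum(math.log(v) for v in values) / len(values))

def root_mean_square(values):
    # Square root of the mean of the squared values.
    return math.sqrt(sum(v * v for v in values) / len(values))

# Data block sizes (in M) of the ten IOs from the example above.
block_sizes_m = [2, 1, 3, 0.5, 10, 4, 0.1, 1.2, 5, 8]

print(round(arithmetic_mean(block_sizes_m), 3))   # 3.48
print(round(geometric_mean(block_sizes_m), 3))    # 1.888
print(round(root_mean_square(block_sizes_m), 3))  # 4.698
```

The three methods give noticeably different averages for the same period because the geometric mean damps the influence of large blocks while the root mean square amplifies it.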
(2) Obtain the transmission delay of each data block in the previous statistical period, and calculate the average data block delay of the IO in the previous statistical period.
The transmission delay (delay for short) is the time a node needs to move a data block from the node onto the transmission medium when sending data, that is, the total time a sending station needs from starting to send a data frame until the frame has been completely sent, or the total time a receiving station needs from starting to receive a data frame until the frame has been completely received.
In a preferred embodiment of the present application, the transmission delay of a data block may be obtained from a load measurement tool or a performance monitoring tool installed in each storage node.
As described above, the average data block delay of the IO in the previous statistical period may also be calculated using the arithmetic mean algorithm, the geometric mean algorithm, or the root mean square algorithm. Suppose the transmission delays of the ten IOs detected in the previous statistical period are 1s, 0.8s, 1.5s, 0.4s, 5s, 2s, 0.02s, 0.6s, 3s, and 4.5s. Calculated with the arithmetic mean algorithm, the average data block delay of the IO in the previous statistical period is:
(1s+0.8s+1.5s+0.4s+5s+2s+0.02s+0.6s+3s+4.5s)/10=1.882s≈1.88s.
It should be understood that the same algorithm is used for both averages: if the average data block size of the IO in the previous statistical period is calculated using the arithmetic mean algorithm, the average data block delay of the IO in the previous statistical period is also calculated using the arithmetic mean algorithm; if the average data block size is calculated using the geometric mean algorithm, the average data block delay is also calculated using the geometric mean algorithm; and if the average data block size is calculated using the root mean square algorithm, the average data block delay is also calculated using the root mean square algorithm.
(3) Obtain the preset reference value of the IO data block size and the corresponding reference value of the data block delay.
In a preferred embodiment of the present application, the reference value of the IO data block size and the corresponding reference value of the data block delay may be set in advance by the administrator of the storage system based on experience. For example, if experience shows that a 4K data block has the smallest transmission delay, which can reach 50ms under ideal conditions, the reference value of the IO data block size may be set to 4K and the corresponding reference value of the data block delay may be set to 50ms.
(4) Calculate the IO load intensity in the previous statistical period according to the average data block size of the IO in the previous statistical period, the average data block delay, the reference value of the data block size, and the corresponding reference value of the data block delay.
For example, suppose that in the previous statistical period the average data block size of the IO is X, the average data block delay is Y, the reference value of the data block size is M, and the corresponding reference value of the data block delay is N. The IO load intensity in the previous statistical period is then calculated by the following formula:
Figure PCTCN2018108128-appb-000016
(5) According to the IO load intensity in the previous statistical period, determine the IO load category in the previous statistical period using a pre-trained load classification model.
Preferably, the load classification model includes, but is not limited to, a support vector machine (SVM) model. The average data block size of the IO, the average data block delay of the IO, and the IO load intensity in the previous statistical period are used as the input of the load classification model, which computes and outputs the IO load category in the previous statistical period.
The training module 205 is configured to train the load classification model.
The process by which the training module 205 trains the load classification model includes:
1) Obtain IO load data of positive samples and IO load data of negative samples, and label the IO load data of the positive samples with load categories, so that the IO load data of the positive samples carries IO load category labels.
For example, 500 pieces of IO load data are selected for each of the high load, normal load, and low load categories, and each piece is labeled with its category: "1" may be used as the label for high-load IO data, "2" as the label for normal-load IO data, and "3" as the label for low-load IO data.
2) Randomly divide the IO load data of the positive samples and the IO load data of the negative samples into a training set of a first preset proportion and a validation set of a second preset proportion, train the load classification model using the training set, and verify the accuracy of the trained load classification model using the validation set.
First, the samples of the different load categories are distributed into different folders. For example, the samples of the high load category are placed in a first folder, the samples of the normal load category in a second folder, and the samples of the low load category in a third folder. Then a first preset proportion (for example, 70%) of the samples is extracted from each folder to form the overall training set used to train the load classification model, and the remaining second preset proportion (for example, 30%) of the samples in each folder forms the overall validation set used to verify the accuracy of the trained load classification model.
3) If the accuracy is greater than or equal to a preset accuracy, end the training and use the trained load classification model as the classifier for identifying the IO load category in the current statistical period; if the accuracy is less than the preset accuracy, increase the number of positive samples and the number of negative samples and retrain the load classification model until the accuracy is greater than or equal to the preset accuracy.
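The per-category 70%/30% split and the retrain-until-accurate loop described in steps 2) and 3) might be sketched as below. This is a hedged illustration only: the function names, the `seed` parameter, and the `max_rounds` safety limit are assumptions added here, and the model itself is passed in as hypothetical callables (`train_fn`, `accuracy_fn`, `add_samples_fn`) rather than being a real SVM.

```python
import random

def stratified_split(samples_by_label, train_ratio=0.7, seed=0):
    # Split each category separately so that both sets keep the
    # same category proportions (the "folders" in the text).
    rng = random.Random(seed)
    train, validation = [], []
    for label, samples in samples_by_label.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = int(round(len(shuffled) * train_ratio))
        train += [(sample, label) for sample in shuffled[:cut]]
        validation += [(sample, label) for sample in shuffled[cut:]]
    return train, validation

def train_until_accurate(data, train_fn, accuracy_fn, add_samples_fn,
                         target_accuracy, max_rounds=10):
    # max_rounds is an assumption of this sketch; the text simply
    # repeats until the preset accuracy is reached.
    for _ in range(max_rounds):
        train, validation = stratified_split(data)
        model = train_fn(train)
        if accuracy_fn(model, validation) >= target_accuracy:
            return model
        data = add_samples_fn(data)  # enlarge the sample set and retry
    raise RuntimeError("preset accuracy not reached")

# 500 placeholder samples for each label: 1 = high, 2 = normal, 3 = low.
data = {label: list(range(500)) for label in (1, 2, 3)}
train_set, validation_set = stratified_split(data)
print(len(train_set), len(validation_set))  # 1050 450
```

Splitting inside each category, rather than over the pooled data, keeps a rare category from being missing in the validation set.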
The calculation module 206 is configured to calculate the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period.
Flow control here refers to traffic control. It can be implemented in two ways: one uses the QoS module of a router or switch to implement flow control based on source address, destination address, source port, destination port, and protocol type; the other uses dedicated flow control equipment to implement flow control at the application layer.
Each statistical period in the recovery period may correspond to one flow control threshold. The flow control threshold corresponding to each statistical period is adjusted dynamically: the threshold for the current statistical period can be calculated from the IO load category in the previous statistical period, and the threshold for the next statistical period can be calculated from the IO load category in the current statistical period.
It should be noted that the flow control threshold corresponding to the first statistical period in the recovery period is a preset flow control threshold, which may be set in advance by the administrator of the storage system based on experience. That is, a preset flow control threshold is used for the first statistical period in the recovery period; the threshold for the second statistical period is calculated from the IO load category in the first statistical period; the threshold for the third statistical period is calculated from the IO load category in the second statistical period; and so on.
Specifically, the calculation module 206 calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period may include:
1) When the IO load category in the previous statistical period is the high load category, reduce the flow control threshold corresponding to the previous statistical period by a first preset amplitude to obtain the flow control threshold corresponding to the current statistical period.
When the IO load in the previous statistical period is high, the flow control threshold is reduced by the first preset amplitude, so that the data of the storage node is recovered under a low flow control threshold in the current statistical period. Slowing down data recovery in this way preserves efficient access for the user application.
In a preferred embodiment of the present application, the first preset amplitude may be 1/2 of the flow control threshold corresponding to the previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1/2 of that of the previous statistical period, and the flow control threshold corresponding to the next statistical period is 1/2 of that of the current statistical period.
2) When the IO load category in the previous statistical period is the low load category, increase the flow control threshold corresponding to the previous statistical period by a second preset amplitude to obtain the flow control threshold corresponding to the current statistical period.
When the IO load in the previous statistical period is low, the flow control threshold is increased by the second preset amplitude, so that the data of the storage node is recovered under a high flow control threshold in the current statistical period. This speeds up data recovery while still guaranteeing the access quality of the user application.
In a preferred embodiment of the present application, the second preset amplitude may be chosen so that the threshold is raised to 1.5 times the flow control threshold corresponding to the previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1.5 times that of the previous statistical period, and the flow control threshold corresponding to the next statistical period is 1.5 times that of the current statistical period.
3) When the IO load category in the previous statistical period is the normal load category, use the flow control threshold corresponding to the previous statistical period as the flow control threshold corresponding to the current statistical period.
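The three rules above amount to a small update function. The following Python sketch uses the preferred-embodiment factors (1/2 and 1.5); the function and parameter names are assumptions made for illustration, and the category labels 1, 2, and 3 are borrowed from the labeling example in the training description.

```python
HIGH, NORMAL, LOW = 1, 2, 3  # labels reused from the training example

def next_threshold(previous_threshold, previous_category,
                   decrease_factor=0.5, increase_factor=1.5):
    # Returns the flow control threshold for the current statistical
    # period, given the previous period's threshold and load category.
    if previous_category == HIGH:
        return previous_threshold * decrease_factor  # slow recovery down
    if previous_category == LOW:
        return previous_threshold * increase_factor  # speed recovery up
    if previous_category == NORMAL:
        return previous_threshold                    # keep as is
    raise ValueError(f"unknown load category: {previous_category}")

print(next_threshold(100, HIGH))    # 50.0
print(next_threshold(100, LOW))     # 150.0
print(next_threshold(100, NORMAL))  # 100
```

Because the decrease (halving) is steeper than the increase (1.5 times), the threshold backs off quickly under user load and recovers more cautiously afterwards.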
The recovery module 207 is configured to perform a recovery operation on the data of the failed storage node in the current statistical period according to the storage list and the flow control threshold corresponding to the current statistical period.
The determining module 208 is configured to determine whether the recovery operation has been performed on the data in all statistical periods of the failed storage node.
When the determining module 208 determines that the recovery operation has not yet been performed on the data in all statistical periods of the failed storage node, execution returns to the identification module 204.
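The cooperation of modules 204 to 208 over one recovery period can be sketched as a loop. The sketch below is illustrative only: all names are hypothetical, the threshold is interpreted as "items recoverable per statistical period" (an assumption, since the text does not fix its unit), and classification and recovery are supplied as callables.

```python
def run_recovery(storage_list, initial_threshold,
                 classify_previous_period, compute_threshold, recover_period):
    # Returns the threshold used in each statistical period.
    # recover_period must make progress, otherwise the loop never ends.
    threshold = initial_threshold   # preset value for the first period
    remaining = list(storage_list)
    history = []
    first_period = True
    while remaining:                # role of determining module 208
        if not first_period:
            category = classify_previous_period()               # module 204
            threshold = compute_threshold(threshold, category)  # module 206
        first_period = False
        remaining = recover_period(remaining, threshold)        # module 207
        history.append(threshold)
    return history

# Toy run: ten items to recover and a scripted sequence of load categories.
categories = iter(["high", "low", "normal"])
history = run_recovery(
    storage_list=range(10),
    initial_threshold=4,
    classify_previous_period=lambda: next(categories),
    compute_threshold=lambda t, c: t * 0.5 if c == "high"
                      else t * 1.5 if c == "low" else t,
    recover_period=lambda items, t: items[int(t):],
)
print(history)  # [4, 2.0, 3.0, 3.0]
```

In the toy run the first period uses the preset threshold, the high-load period halves it, the low-load period raises it by half, and the normal-load period leaves it unchanged.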
In summary, in the adaptive data recovery flow control apparatus described in this application, the synchronization module 201 periodically synchronizes the information of each storage node in the distributed storage system; the obtaining module 203 obtains the storage list of the failed storage node when the detection module 202 detects that a storage node has failed; the identification module 204 identifies the IO load category of the user application in the previous statistical period; the calculation module 206 calculates the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period; and the recovery module 207 performs a recovery operation on the data of the failed storage node in the current statistical period according to the storage list and the flow control threshold corresponding to the current statistical period, until the recovery operation has been performed on the data in all statistical periods of the failed storage node. This application divides a recovery period into multiple statistical periods; in each statistical period, it dynamically adjusts the flow control threshold of the current statistical period according to the IO load category of the user application in the previous statistical period, and recovers the data of the current statistical period under that threshold. When the IO load intensity of the user application in the previous statistical period is high, the flow control threshold for fault recovery in the current statistical period is reduced, which lowers the recovery intensity and protects the business IO load. When the IO load intensity of the user application in the previous statistical period is low, the flow control threshold for fault recovery in the current statistical period is raised, which increases the recovery intensity and returns the distributed storage system to a healthy state as soon as possible. That is, this application improves the data repair efficiency of a large-scale distributed storage system and reduces the risk of data loss while avoiding a significant impact on the performance of normal input/output services, and thus achieves a good flow control effect.
Secondly, the flow control threshold of the current statistical period is adjusted automatically and dynamically according to the IO load category of the user application in the previous statistical period, without manual tuning by the administrator. This reduces the administrator's workload, avoids inaccurate adjustments caused by the administrator's subjective judgment, and allows the threshold to follow changes in the distributed storage system and its hardware, which makes the approach highly reliable.
The above integrated unit implemented in the form of software functional modules may be stored in a non-volatile readable storage medium. The software functional modules are stored in a storage medium and include several instructions that cause a computer device (which may be a personal computer, a dual-screen device, a network device, or the like) or a processor to execute part of the method described in each embodiment of the present application.
Embodiment 3
FIG. 3 is a schematic diagram of the electronic device provided in Embodiment 3 of the present application.
The electronic device 3 includes a memory 31, at least one processor 32, computer-readable instructions 33 stored in the memory 31 and executable on the at least one processor 32, and at least one communication bus 34.
When the at least one processor 32 executes the computer-readable instructions 33, the steps in the above embodiments of the adaptive data recovery flow control method are implemented.
Exemplarily, the computer-readable instructions 33 may be divided into one or more modules/units, which are stored in the memory 31 and executed by the at least one processor 32 to carry out the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments describe the execution process of the computer-readable instructions 33 in the electronic device 3.
The electronic device 3 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. Those skilled in the art will understand that FIG. 3 is only an example of the electronic device 3 and does not limit it; the electronic device 3 may include more or fewer components than shown, combine certain components, or use different components. For example, the electronic device 3 may further include input/output devices, network access devices, buses, and the like.
The at least one processor 32 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 32 may be a microprocessor or any conventional processor. The processor 32 is the control center of the electronic device 3 and connects the various parts of the entire electronic device 3 through various interfaces and lines.
The memory 31 may be used to store the computer-readable instructions 33 and/or modules/units. The processor 32 implements the various functions of the electronic device 3 by running or executing the computer-readable instructions and/or modules/units stored in the memory 31 and by calling the data stored in the memory 31. The memory 31 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required for at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the electronic device 3 (such as audio data or a phone book). In addition, the memory 31 may include a high-speed random access memory and may also include a non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
If the modules/units integrated in the electronic device 3 are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a non-volatile readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments may also be completed by computer-readable instructions that instruct the relevant hardware. The computer-readable instructions may be stored in a non-volatile readable storage medium and, when executed by a processor, implement the steps of each of the above method embodiments. The computer-readable instructions may be in source code form, object code form, an executable file, or some intermediate form. The non-volatile readable medium may include any entity or device capable of carrying the computer-readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content covered by the non-volatile readable medium may be expanded or narrowed as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the non-volatile readable medium does not include electrical carrier signals and telecommunication signals.
In the several embodiments provided in this application, it should be understood that the disclosed electronic device and method may be implemented in other ways. For example, the electronic device embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in an actual implementation.
In addition, the functional units in the embodiments of the present application may be integrated in the same processing unit, each unit may exist physically on its own, or two or more units may be integrated in the same unit. The integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional modules.
It is obvious to those skilled in the art that the present application is not limited to the details of the above exemplary embodiments and can be implemented in other specific forms without departing from its spirit or essential features. The embodiments should therefore be regarded in every respect as exemplary and non-limiting. The scope of the present application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced in the present application. No reference sign in the claims should be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other units, and the singular does not exclude the plural. A plurality of units or devices stated in the system claims may also be implemented by one unit or device through software or hardware. Words such as "first" and "second" are used as names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present application may be modified or replaced with equivalents without departing from the spirit and scope of the technical solutions of the present application.

Claims (20)

  1. An adaptive data recovery flow control method, characterized in that the method comprises:
    a) periodically synchronizing information of each storage node in a distributed storage system;
    b) detecting whether any storage node has failed;
    c) when it is detected that a storage node has failed, obtaining a storage list of the failed storage node;
    d) identifying an IO load category of a user application in a previous statistical period;
    e) calculating a flow control threshold corresponding to a current statistical period according to the IO load category in the previous statistical period;
    f) performing a recovery operation on data of the failed storage node in the current statistical period according to the storage list and the flow control threshold corresponding to the current statistical period;
    repeating steps d) to f) until the recovery operation has been performed on the data in all statistical periods of the failed storage node.
  2. 如权利要求1所述的方法,其特征在于,所述根据所述上一个统计周期内的IO负载类别计算当前统计周期对应的流控阈值包括:The method according to claim 1, wherein calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period comprises:
    采用预先设置的流控阈值作为第一个统计周期对应的流控阈值。The preset flow control threshold is used as the flow control threshold corresponding to the first statistical period.
  3. The method according to claim 1, wherein identifying the IO load category of the user application in the previous statistical period comprises:
    obtaining the data block size of each IO of the user application in the previous statistical period, and calculating the average data block size of the IOs in the previous statistical period;
    obtaining the transmission delay of each data block in the previous statistical period, and calculating the average data block delay of the IOs in the previous statistical period;
    obtaining a preset reference value of the IO data block size and a corresponding reference value of the data block delay;
    calculating the IO load intensity in the previous statistical period according to the average data block size of the IOs in the previous statistical period, the average data block delay, the reference value of the data block size, and the corresponding reference value of the data block delay;
    determining the IO load category in the previous statistical period from the IO load intensity in the previous statistical period by using a pre-trained load classification model.
  4. The method according to claim 1, wherein the IO load category comprises a high load category, a normal load category, and a low load category, and calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period comprises:
    when the IO load category in the previous statistical period is the high load category, lowering the flow control threshold corresponding to the previous statistical period by a first preset amplitude to obtain the flow control threshold corresponding to the current statistical period;
    when the IO load category in the previous statistical period is the low load category, raising the flow control threshold corresponding to the previous statistical period by a second preset amplitude to obtain the flow control threshold corresponding to the next statistical period;
    when the IO load category in the previous statistical period is the normal load category, using the flow control threshold corresponding to the previous statistical period as the flow control threshold corresponding to the current statistical period.
  5. The method according to claim 3, wherein the formula for calculating the IO load intensity in the previous statistical period according to the average data block size of the IOs in the previous statistical period, the average data block delay, the reference value of the data block size, and the corresponding reference value of the data block delay is:
    Figure PCTCN2018108128-appb-100001
    where X is the average data block size of the IOs in the previous statistical period, Y is the average data block delay, M is the reference value of the data block size, and N is the corresponding reference value of the data block delay.
  6. The method according to claim 1, wherein detecting whether any storage node has failed comprises:
    detecting whether any one or more storage nodes in the distributed storage system have failed to start, lost power, or been disconnected from the network; or
    detecting whether a disk in any one or more storage nodes in the distributed storage system has failed.
  7. The method according to any one of claims 1 to 6, wherein synchronizing the information of each storage node in the distributed storage system comprises:
    performing the information synchronization of each storage node by a storage center in the distributed storage system; or
    using a decentralized approach in which any storage node in the distributed storage system initiates the information synchronization of each storage node.
  8. An adaptive data recovery flow control apparatus, characterized in that the apparatus comprises:
    a synchronization module, configured to periodically synchronize information of each storage node in a distributed storage system;
    a detection module, configured to detect whether any storage node has failed;
    an obtaining module, configured to obtain a storage list of the failed storage node when the detection module detects that a storage node has failed;
    an identification module, configured to identify an IO load category of a user application in a previous statistical period;
    a calculation module, configured to calculate a flow control threshold corresponding to a current statistical period according to the IO load category in the previous statistical period;
    a recovery module, configured to perform, according to the storage list and the flow control threshold corresponding to the current statistical period, a recovery operation on data of the failed storage node in the current statistical period.
  9. An electronic device, characterized in that the electronic device comprises a processor and a memory, the memory being configured to store at least one instruction, and the processor being configured to execute the at least one instruction to implement the following steps:
    a) periodically synchronizing information of each storage node in a distributed storage system;
    b) detecting whether any storage node has failed;
    c) when a failed storage node is detected, obtaining a storage list of the failed storage node;
    d) identifying an IO load category of a user application in a previous statistical period;
    e) calculating a flow control threshold corresponding to a current statistical period according to the IO load category in the previous statistical period;
    f) performing, according to the storage list and the flow control threshold corresponding to the current statistical period, a recovery operation on data of the failed storage node in the current statistical period;
    repeating steps d) to f) until the recovery operation has been performed on the data of the failed storage node in all statistical periods.
  10. The electronic device according to claim 9, wherein calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period comprises:
    using a preset flow control threshold as the flow control threshold corresponding to the first statistical period.
  11. The electronic device according to claim 9, wherein identifying the IO load category of the user application in the previous statistical period comprises:
    obtaining the data block size of each IO of the user application in the previous statistical period, and calculating the average data block size of the IOs in the previous statistical period;
    obtaining the transmission delay of each data block in the previous statistical period, and calculating the average data block delay of the IOs in the previous statistical period;
    obtaining a preset reference value of the IO data block size and a corresponding reference value of the data block delay;
    calculating the IO load intensity in the previous statistical period according to the average data block size of the IOs in the previous statistical period, the average data block delay, the reference value of the data block size, and the corresponding reference value of the data block delay;
    determining the IO load category in the previous statistical period from the IO load intensity in the previous statistical period by using a pre-trained load classification model.
  12. The electronic device according to claim 9, wherein the IO load category comprises a high load category, a normal load category, and a low load category, and calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period comprises:
    when the IO load category in the previous statistical period is the high load category, lowering the flow control threshold corresponding to the previous statistical period by a first preset amplitude to obtain the flow control threshold corresponding to the current statistical period;
    when the IO load category in the previous statistical period is the low load category, raising the flow control threshold corresponding to the previous statistical period by a second preset amplitude to obtain the flow control threshold corresponding to the next statistical period;
    when the IO load category in the previous statistical period is the normal load category, using the flow control threshold corresponding to the previous statistical period as the flow control threshold corresponding to the current statistical period.
  13. The electronic device according to claim 11, wherein the formula for calculating the IO load intensity in the previous statistical period according to the average data block size of the IOs in the previous statistical period, the average data block delay, the reference value of the data block size, and the corresponding reference value of the data block delay is:
    Figure PCTCN2018108128-appb-100002
    where X is the average data block size of the IOs in the previous statistical period, Y is the average data block delay, M is the reference value of the data block size, and N is the corresponding reference value of the data block delay.
  14. The electronic device according to claim 9, wherein detecting whether any storage node has failed comprises:
    detecting whether any one or more storage nodes in the distributed storage system have failed to start, lost power, or been disconnected from the network; or
    detecting whether a disk in any one or more storage nodes in the distributed storage system has failed.
  15. A non-volatile readable storage medium storing at least one instruction, characterized in that the at least one instruction, when executed by a processor, implements the following steps:
    a) periodically synchronizing information of each storage node in a distributed storage system;
    b) detecting whether any storage node has failed;
    c) when a failed storage node is detected, obtaining a storage list of the failed storage node;
    d) identifying an IO load category of a user application in a previous statistical period;
    e) calculating a flow control threshold corresponding to a current statistical period according to the IO load category in the previous statistical period;
    f) performing, according to the storage list and the flow control threshold corresponding to the current statistical period, a recovery operation on data of the failed storage node in the current statistical period;
    repeating steps d) to f) until the recovery operation has been performed on the data of the failed storage node in all statistical periods.
  16. The storage medium according to claim 15, wherein calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period comprises:
    using a preset flow control threshold as the flow control threshold corresponding to the first statistical period.
  17. The storage medium according to claim 15, wherein identifying the IO load category of the user application in the previous statistical period comprises:
    obtaining the data block size of each IO of the user application in the previous statistical period, and calculating the average data block size of the IOs in the previous statistical period;
    obtaining the transmission delay of each data block in the previous statistical period, and calculating the average data block delay of the IOs in the previous statistical period;
    obtaining a preset reference value of the IO data block size and a corresponding reference value of the data block delay;
    calculating the IO load intensity in the previous statistical period according to the average data block size of the IOs in the previous statistical period, the average data block delay, the reference value of the data block size, and the corresponding reference value of the data block delay;
    determining the IO load category in the previous statistical period from the IO load intensity in the previous statistical period by using a pre-trained load classification model.
  18. The storage medium according to claim 15, wherein the IO load category comprises a high load category, a normal load category, and a low load category, and calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period comprises:
    when the IO load category in the previous statistical period is the high load category, lowering the flow control threshold corresponding to the previous statistical period by a first preset amplitude to obtain the flow control threshold corresponding to the current statistical period;
    when the IO load category in the previous statistical period is the low load category, raising the flow control threshold corresponding to the previous statistical period by a second preset amplitude to obtain the flow control threshold corresponding to the next statistical period;
    when the IO load category in the previous statistical period is the normal load category, using the flow control threshold corresponding to the previous statistical period as the flow control threshold corresponding to the current statistical period.
  19. The storage medium according to claim 17, wherein the formula for calculating the IO load intensity in the previous statistical period according to the average data block size of the IOs in the previous statistical period, the average data block delay, the reference value of the data block size, and the corresponding reference value of the data block delay is:
    Figure PCTCN2018108128-appb-100003
    where X is the average data block size of the IOs in the previous statistical period, Y is the average data block delay, M is the reference value of the data block size, and N is the corresponding reference value of the data block delay.
  20. The storage medium according to claim 15, wherein detecting whether any storage node has failed comprises:
    detecting whether any one or more storage nodes in the distributed storage system have failed to start, lost power, or been disconnected from the network; or
    detecting whether a disk in any one or more storage nodes in the distributed storage system has failed.
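
The per-period control loop recited in claims 1, 2 and 4 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The names `LoadCategory`, `next_threshold`, `recover`, `step_down`, `step_up` and `floor` are hypothetical, and the "preset amplitude" is assumed here to be an absolute amount, although the claims would equally allow a percentage. The load classification of claims 3 and 5 is passed in as a callable, because the load intensity formula is available only as a figure and is not reproduced.

```python
from enum import Enum
from typing import Callable, List, Sequence


class LoadCategory(Enum):
    HIGH = "high"
    NORMAL = "normal"
    LOW = "low"


def next_threshold(prev_threshold: float, category: LoadCategory,
                   step_down: float, step_up: float,
                   floor: float = 0.0) -> float:
    """Claim 4: lower on high load, raise on low load, keep on normal load.

    `floor` is an assumption of this sketch; it only stops the threshold
    from going negative and is not recited in the claims.
    """
    if category is LoadCategory.HIGH:
        return max(floor, prev_threshold - step_down)
    if category is LoadCategory.LOW:
        return prev_threshold + step_up
    return prev_threshold


def recover(periods: Sequence[Sequence[object]],
            classify_previous_period: Callable[[], LoadCategory],
            restore: Callable[[Sequence[object], float], None],
            initial_threshold: float,
            step_down: float, step_up: float) -> List[float]:
    """Claims 1 and 2: run steps d) to f) once per statistical period.

    Returns the threshold that was applied in each period.
    """
    applied: List[float] = []
    threshold = initial_threshold  # claim 2: preset value for the first period
    for index, batch in enumerate(periods):
        if index > 0:
            category = classify_previous_period()               # step d)
            threshold = next_threshold(threshold, category,
                                       step_down, step_up)      # step e)
        restore(batch, threshold)                               # step f)
        applied.append(threshold)
    return applied


# Example: four periods; the load observed after each of the first three
# periods is high, low and normal, in that order.
observed = iter([LoadCategory.HIGH, LoadCategory.LOW, LoadCategory.NORMAL])
used = recover(
    periods=[["blk-0"], ["blk-1"], ["blk-2"], ["blk-3"]],
    classify_previous_period=lambda: next(observed),
    restore=lambda batch, limit: None,  # stand-in for the real recovery I/O
    initial_threshold=100.0, step_down=20.0, step_up=10.0,
)
print(used)  # [100.0, 80.0, 90.0, 90.0]
```

Injecting the classifier and the restore routine as callables keeps the sketch independent of any particular storage system; a real system would supply its own measurement of user IO and its own data copy logic.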
PCT/CN2018/108128 2018-06-04 2018-09-27 Adaptive data recovery flow control method and apparatus, electronic device and storage medium WO2019232993A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810565004.2 2018-06-04
CN201810565004.2A CN108804039B (en) 2018-06-04 2018-06-04 Adaptive data recovery flow control method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2019232993A1 true WO2019232993A1 (en) 2019-12-12

Family

ID=64087212

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/108128 WO2019232993A1 (en) 2018-06-04 2018-09-27 Adaptive data recovery flow control method and apparatus, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN108804039B (en)
WO (1) WO2019232993A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10963332B2 (en) * 2018-12-17 2021-03-30 Western Digital Technologies, Inc. Data storage systems and methods for autonomously adapting data storage system performance, capacity and/or operational requirements
CN110120973A (en) * 2019-04-28 2019-08-13 华为技术有限公司 A kind of request control method, relevant device and computer storage medium
CN110516117A (en) * 2019-07-22 2019-11-29 平安科技(深圳)有限公司 Scheme classification type variable storage method, apparatus, equipment and the storage medium calculated
CN110750213A (en) * 2019-09-09 2020-02-04 华为技术有限公司 Hard disk management method and device
CN110673977B (en) * 2019-09-27 2022-06-07 浪潮电子信息产业股份有限公司 Data recovery optimization method, device, equipment and medium
CN111258816B (en) * 2020-01-17 2023-08-08 西安奥卡云数据科技有限公司 RPO adjustment method, device and computer readable storage medium
CN113377861B (en) * 2020-02-25 2023-04-07 中移(苏州)软件技术有限公司 Reconstruction method, device, equipment and storage medium of distributed storage system
CN114064362B (en) * 2021-11-16 2022-08-05 北京志凌海纳科技有限公司 Data recovery method, system and computer readable storage medium for distributed storage
CN116627362B (en) * 2023-07-26 2023-09-22 大汉电子商务有限公司 Financial data processing method based on distributed storage

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130111172A1 (en) * 2011-10-31 2013-05-02 International Business Machines Corporation Data Migration Between Storage Devices
CN105930498A (en) * 2016-05-06 2016-09-07 中国银联股份有限公司 Distributed database management method and system
CN106201354A (en) * 2016-07-12 2016-12-07 乐视控股(北京)有限公司 Date storage method and system
CN107544862A (en) * 2016-06-29 2018-01-05 中兴通讯股份有限公司 A kind of data storage reconstructing method and device, memory node based on correcting and eleting codes

Also Published As

Publication number Publication date
CN108804039B (en) 2021-01-29
CN108804039A (en) 2018-11-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18921992

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18921992

Country of ref document: EP

Kind code of ref document: A1