WO2019170133A1 - Data storage method and apparatus - Google Patents

Data storage method and apparatus

Info

Publication number
WO2019170133A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
slice
storage
data center
center
Prior art date
Application number
PCT/CN2019/077440
Other languages
English (en)
French (fr)
Inventor
汪渭春
夏伟强
王伟
Original Assignee
杭州海康威视系统技术有限公司
Priority date
Filing date
Publication date
Application filed by 杭州海康威视系统技术有限公司
Publication of WO2019170133A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1048 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using arrangements adapted for a specific error detection or correction feature
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062 Securing storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools

Definitions

  • the present application relates to the field of data storage technologies, and in particular, to a data storage method and apparatus.
  • a storage system can include multiple data centers.
  • A data center is a facility whose devices use a computer network to transmit, accelerate, display, compute, and store data.
  • the storage system also uses copy technology to back up data. That is to say, when the data is stored in the storage system, the data and the multiple copies corresponding to the data are respectively stored in different data centers, so that the security and reliability of the data storage can be ensured.
  • A copy contains the same amount of data as the data itself, so a copy occupies the same storage space in a data center as the original data. Copies therefore improve the reliability of data storage, but they also occupy a large amount of storage space, which leads to high storage space consumption by the data centers in the storage system.
  • the present application provides a data storage method and apparatus to solve the problem of large storage space consumption of a data center in a storage system.
  • the specific technical solutions are as follows:
  • the embodiment of the present application provides a data storage method, which is applied to a target data center in a storage system, where the storage system includes at least two data centers; the method includes:
  • the data storage manner is: a storage manner capable of reconstructing the data according to the slice data of the data;
  • Each slice data is separately sent to a corresponding first data center.
  • Determining, from the storage system and according to the preset first allocation policy, the first data center for storing each slice data includes:
  • allocating a first data center to each slice data, where the first data centers allocated to different slice data are different.
  • Alternatively, determining, from the storage system and according to the preset first allocation policy, the first data center for storing each slice data includes:
  • allocating the determined slice data to the first data centers according to the remaining storage space of each first data center, so that each slice data is allocated a first data center.
  • the data storage manner is: an erasure code storage manner.
  • the method further includes:
  • the first data is reconstructed using the extracted slice data according to the data storage manner.
  • the method further includes:
  • the first data stored locally is sent to the first client.
  • the slice data of the second data is stored locally; the method further includes:
  • the embodiment of the present application provides a data storage device, which is applied to a target data center in a storage system, where the storage system includes at least two data centers; the device includes:
  • An acquiring module configured to acquire first data, and store the first data locally
  • a first determining module configured to determine, according to a preset data storage manner, slice data of the first data, where the data storage manner is: a storage manner capable of reconstructing the data according to the slice data of the data;
  • a second determining module configured to determine, from the storage system, a first data center for storing each slice data according to a preset first allocation policy
  • the first sending module is configured to separately send each slice data to a corresponding first data center.
  • the second determining module includes:
  • a first determining submodule configured to determine the quantity of the first data centers from the storage system according to a preset first allocation policy
  • the first allocation submodule is configured to allocate a first data center for each slice data, wherein the first data center allocated by each slice data is different.
  • the second determining module includes:
  • a sorting sub-module configured to sort the data centers included in the storage system according to a preset first allocation policy
  • a second determining submodule configured to determine, according to the ordering, a first data center, where a sum of remaining storage spaces of the determined first data center is greater than a sum of data amounts of the determined slice data
  • the second allocation sub-module is configured to allocate the determined slice data to each first data center according to the remaining storage space of each first data center, so that each slice data is allocated a first data center.
  • the data storage manner is: an erasure code storage manner.
  • the device further includes:
  • a first extraction module configured to extract slice data of the first data from each first data center when a storage device that stores the first data locally fails
  • a first reconstruction module configured to reconstruct the first data by using the extracted slice data according to the data storage manner.
  • the device further includes:
  • a first receiving module configured to receive a request sent by the first client to extract the first data
  • a detecting module configured to detect whether the storage device storing the first data in the target data center is faulty, and if yes, triggering the first extraction module
  • the second sending module is configured to send the first data stored locally to the first client when the detection result of the detecting module is negative.
  • the slice data of the second data is stored locally; the device further includes:
  • a second receiving module configured to receive a request for extracting second data sent by the second client
  • a third determining module configured to determine a data center storing slice data of the second data, as a second data center
  • a second extraction module configured to extract slice data of the second data from each of the second data centers
  • a second reconstruction module configured to reconstruct the second data by using the extracted slice data of the second data according to the data storage manner
  • a third sending module configured to send the reconstructed second data to the second client.
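The read path described by the modules above (return the local copy when its storage device is healthy, otherwise extract slice data from the other data centers and reconstruct) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: `read_data`, `slice_locations`, and `reconstruct` are hypothetical names, and `reconstruct` stands in for whatever reconstruction the chosen data storage manner provides.

```python
from typing import Callable, Dict, List, Optional


def read_data(
    local_copy: Optional[bytes],
    slice_locations: Dict[str, Callable[[], Optional[bytes]]],
    reconstruct: Callable[[List[bytes]], bytes],
) -> bytes:
    """Return the local copy if it is readable, else rebuild it from remote slices.

    local_copy is None when the local storage device has failed (assumption).
    slice_locations maps a data center name to a fetch function that returns
    the slice stored there, or None if that slice cannot be read.
    """
    if local_copy is not None:
        # Detection result is negative: serve the locally stored data.
        return local_copy
    # Local storage device is faulty: extract slice data from each data center.
    fetched = [fetch() for fetch in slice_locations.values()]
    available = [piece for piece in fetched if piece is not None]
    return reconstruct(available)
```

The same function covers the second-data case, where no full local copy exists: passing `local_copy=None` always takes the reconstruction branch.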
  • an embodiment of the present application provides an electronic device, including a processor and a memory;
  • a memory for storing a computer program
  • the processor is configured to implement the steps of any of the data storage methods described above when executing the program stored in the memory.
  • an embodiment of the present application provides a computer readable storage medium that stores a computer program; when the computer program is executed by a processor, the steps of any of the foregoing data storage methods are implemented.
  • In the embodiments of the present application, the first data is obtained and stored locally; the slice data of the first data is determined according to a preset data storage manner; a first data center for storing each slice data is determined from the storage system according to a preset first allocation policy; and each slice data is sent to its corresponding first data center.
  • In other words, in addition to storing the data locally, the target data center slices the data to obtain slice data and stores each slice data in a selected first data center. This ensures the security and reliability of data storage, and because slice data rather than full copies is stored after slicing, it saves storage space and reduces the storage space consumption of the data centers in the storage system.
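The four steps summarized above (store locally, slice, allocate, send) can be sketched as one function. This is a hedged outline under stated assumptions: `store_first_data` and its callback parameters `slice_fn`, `allocate_fn`, and `send_fn` are hypothetical names introduced for illustration, and the local store is modeled as a plain dictionary.

```python
from typing import Callable, Dict, List


def store_first_data(
    data: bytes,
    local_store: Dict[str, bytes],
    key: str,
    slice_fn: Callable[[bytes], List[bytes]],
    allocate_fn: Callable[[List[bytes]], List[str]],
    send_fn: Callable[[str, bytes], None],
) -> List[str]:
    """Run the write path on the target data center and return the chosen data centers."""
    # Step 1: keep the complete first data locally.
    local_store[key] = data
    # Step 2: derive slice data according to the preset data storage manner.
    slices = slice_fn(data)
    # Step 3: choose a first data center for every slice (first allocation policy).
    targets = allocate_fn(slices)
    if len(targets) != len(slices):
        raise ValueError("allocation must name one data center per slice")
    # Step 4: send each slice to its corresponding first data center.
    for target, piece in zip(targets, slices):
        send_fn(target, piece)
    return targets
```

Passing the slicing and allocation logic as callbacks keeps the flow independent of any particular data storage manner or allocation policy.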
  • FIG. 1 is a flowchart of a data storage method according to an embodiment of the present application
  • FIG. 2 is another flowchart of a data storage method according to an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a data storage device according to an embodiment of the present disclosure.
  • FIG. 4 is another schematic structural diagram of a data storage device according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • The data storage method and apparatus are applied to a target data center in a storage system, and the storage system includes at least two data centers; the method includes:
  • the data storage manner is a storage manner in which data can be reconstructed from the slice data of that data;
  • Each slice data is separately sent to a corresponding first data center.
  • In addition to storing the data locally, the target data center slices the data to obtain slice data and stores each slice data in a selected first data center. This ensures the security and reliability of data storage, and storing slice data after slicing saves storage space and reduces the storage space consumption of the data centers in the storage system.
  • a data storage method provided by the embodiment of the present application is first introduced.
  • the method is applied to a target data center in a storage system, where the storage system includes at least two data centers.
  • A data center may include one or more storage devices, and the data center may store data in a storage device that it includes.
  • the storage device may be a storage server or the like, and each storage server may include multiple hard disks.
  • the storage device can also be a hard disk, which can be mounted on the machine, and each machine can mount multiple hard disks.
  • Which storage device a data center stores data on may be determined by the data center itself. The policy by which a data center selects the storage device may be customized, and the policies of different data centers may differ. In addition, a data center may record the storage location of each data.
  • The target data center may be determined, according to a preset policy, from the at least two data centers included in the storage system.
  • The preset policy may be determined according to at least one of the following: the data centers that have previously been allocated as the target data center, the survival state of each data center in the storage system, the remaining storage space of each data center in the storage system, and the load pressure of each data center in the storage system.
  • This preset policy is similar to the first allocation policy described below.
  • The first allocation policy is described in detail below, so the details are not repeated here.
  • a data storage method provided by an embodiment of the present application is introduced in conjunction with FIG. 1. As shown in FIG. 1, the data storage method includes the following steps.
  • the first data is data to be stored.
  • The storage system may use direct storage.
  • In that case, the target data center obtains the first data as follows: the client sends the first data to be stored directly to the target data center.
  • the target data center may store the first data locally, that is, in the storage device included in the target data center.
  • For example, suppose the client is a monitoring device and the video data it collects is the first data. The monitoring device sends the video data directly to the target data center, and the target data center may then store the video data locally, that is, in the target data center itself.
  • The storage system may also use relayed storage (dumping).
  • In that case, the storage system may further include an access node, which is connected to clients outside the storage system and is configured to receive data sent by the clients.
  • The target data center then obtains the first data as follows: the access node receives the first data sent by the client and forwards it to the target data center, so that the target data center receives the first data sent by the client.
  • the storage device for storing the first data may be selected by the target data center itself.
  • The data storage manner is a storage manner in which data can be reconstructed from the slice data of that data. That is, the data can be sliced as the data storage manner requires to obtain slice data.
  • Performing reconstruction processing on the slice data yields the data as it was before slicing.
  • Even when some slice data is damaged, the undamaged slice data can be reconstructed to obtain the data as it was before slicing, as long as the number of damaged slice data is less than or equal to the preset number of the data storage manner.
  • For example, data A is sliced as the data storage manner requires, yielding slice data 1, slice data 2, slice data 3, and slice data 4. Reconstruction processing on slice data 1, slice data 2, slice data 3, and slice data 4 yields data A. If the data storage manner tolerates at most 2 damaged slice data, the undamaged slice data can still be reconstructed: if slice data 4 is damaged, slice data 1, slice data 2, and slice data 3 can be reconstructed to obtain data A.
  • The first data is sliced into a preset number of slice data, where the preset number may be customized in advance.
  • For example, if the preset number is 5, slicing the first data yields 5 slice data.
  • The preset number may also be determined by the preset data storage manner itself. For example, if the preset data storage manner sets the number to 3, slicing data as that data storage manner requires yields 3 slice data.
  • the preset data storage mode may be an erasure code storage mode, which is described in detail in the following embodiments, and will not be described in detail herein.
  • S103 Determine, according to a preset first allocation policy, a first data center for storing each slice data from the storage system.
  • the first allocation policy may be: determining, according to at least one of the following information, a first data center for storing each slice data: a data center that has been previously allocated as the first data center, and each data center in the storage system The survival state, the remaining storage space of each data center in the storage system, and the load pressure of each data center in the storage system.
  • the first data center may be determined according to other information, which is not limited herein.
  • When the first data centers are determined from the storage system according to the preset first allocation policy, both the number of first data centers and the first data center in which each slice data is to be stored may be determined.
  • For example, four first data centers may be determined: data center A, data center B, data center C, and data center D. Slice data 1 is allocated to data center A for storage, slice data 2 to data center B, slice data 3 to data center C, and slice data 4 to data center D.
  • the manner in which the first data center is determined from the storage system can be divided into two cases.
  • In the first case, the number of determined first data centers equals the number of slice data, each first data center stores one slice data, and the slice data stored in each first data center is different. The first case is described in detail in the third embodiment below and is not detailed here.
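The first case amounts to a one-to-one mapping between slice data and first data centers. A minimal sketch, assuming the candidate data centers have already been ordered by the first allocation policy (the function name `assign_distinct` and its arguments are hypothetical):

```python
from typing import Dict, List


def assign_distinct(slice_ids: List[str], candidates: List[str]) -> Dict[str, str]:
    """Give every slice its own data center; candidates are already in preference order."""
    if len(candidates) < len(slice_ids):
        raise ValueError("not enough data centers for one slice per data center")
    # zip stops at the shorter list, so surplus candidates are simply unused.
    return dict(zip(slice_ids, candidates))
```

For instance, `assign_distinct(["slice 1", "slice 2"], ["A", "B", "C"])` maps slice 1 to A and slice 2 to B and leaves C unused.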
  • In the second case, each determined first data center may be allocated one or more slice data for storage.
  • The data centers included in the storage system are sorted according to the preset first allocation policy. For example, when the first allocation policy determines the first data center according to remaining storage space, the data centers may be sorted by remaining storage space, with the data center that has the most remaining storage space first and the one that has the least last.
  • the first data center is determined from the storage system based on the number of obtained slice data and the amount of data of each slice data, that is, the determined first data center is sufficient to store slice data of the first data.
  • the first data center corresponding to each slice data is determined according to the remaining storage space of each of the first data centers.
  • Slice data can thus be distributed in proportion to the capacity of each data center, so that the load across the data centers in the storage system is more balanced.
  • A first data center with relatively large remaining storage space and relatively low load pressure may be allocated more slice data for storage, while a first data center with relatively small remaining storage space and relatively high load pressure may be allocated fewer slice data, or even only one.
  • For example, the first data has four slice data: slice data A, slice data B, slice data C, and slice data D, each 2M in size.
  • According to the preset first allocation policy, three first data centers are determined from the storage system: data center A, data center B, and data center C. Of the three, data center A has the most remaining storage space and the lowest load pressure and can store two slice data, while data center B and data center C can each store one slice data. Slice data A and slice data B are therefore allocated to data center A, slice data C to data center B, and slice data D to data center C.
  • In the second case, several of the determined slice data may be stored in the same first data center. When several slice data are stored in the same first data center, they may be stored separately on storage devices of that first data center.
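The second case (sort by remaining storage space, take data centers until their combined space exceeds the total slice size, then distribute slices by remaining space) can be sketched as below. The greedy placement rule and all names (`allocate_by_free_space`, `slice_sizes`, `free_space`) are assumptions made for illustration; the source does not prescribe a specific placement algorithm, and load pressure is ignored here.

```python
from typing import Dict, List


def allocate_by_free_space(
    slice_sizes: Dict[str, int], free_space: Dict[str, int]
) -> Dict[str, str]:
    """Map each slice to a data center, preferring data centers with more free space."""
    total = sum(slice_sizes.values())
    # Sort data centers from most to least remaining storage space.
    ordered = sorted(free_space, key=lambda dc: free_space[dc], reverse=True)
    # Take data centers in order until their combined space exceeds the slice total.
    chosen: List[str] = []
    capacity = 0
    for dc in ordered:
        chosen.append(dc)
        capacity += free_space[dc]
        if capacity > total:
            break
    else:
        raise ValueError("storage system lacks space for all slices")
    remaining = {dc: free_space[dc] for dc in chosen}
    plan: Dict[str, str] = {}
    # Place larger slices first, each on the chosen data center with most space left.
    for sid in sorted(slice_sizes, key=lambda s: slice_sizes[s], reverse=True):
        dc = max(chosen, key=lambda d: remaining[d])
        if remaining[dc] < slice_sizes[sid]:
            raise ValueError(f"no chosen data center can hold slice {sid}")
        plan[sid] = dc
        remaining[dc] -= slice_sizes[sid]
    return plan
```

With four slices of size 2 and remaining space of 5, 3, 3, and 1 units, this sketch selects the three largest data centers and places two slices on the largest and one on each of the other two, which mirrors the example above. Being greedy, it can reject an input for which a different packing would fit.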
  • the determined first data center may include the target data center, that is, the target data center may also be used to store the allocated slice data if the first allocation policy is satisfied.
  • the target data center can store the first data locally, and can store the slice data of the first data locally.
  • The slice data may be sent over a network between the data centers in the storage system. The network may be any network, such as a communication network or a Serial Attached SCSI (SAS) network, where SCSI stands for Small Computer System Interface.
  • Each slice data is separately sent to a corresponding first data center, where the correspondence means that the slice data is allocated to the first data center to be stored.
  • For example, if slice data 1 is allocated to first data center A and slice data 2 is allocated to first data center B, the target data center sends slice data 1 to first data center A and slice data 2 to first data center B.
  • each of the first data centers After receiving the slice data, each of the first data centers stores the received slice data, and the storage location of the slice data may be determined by each of the first data centers.
  • the first data center A includes three storage devices, namely: storage device A, storage device B, and storage device C.
  • The preset data storage manner may be an Erasure Coding (EC) storage manner.
  • When the first data is sliced according to the erasure code storage manner, a preset number N of data slices is obtained, and a preset number M of check slices is then computed from the N data slices. The number of slice data obtained is the sum N + M of the data slices and check slices. N and M may be customized in advance.
  • the RS (Reed-Solomon) code storage mode is a commonly used erasure code storage mode in the storage system.
  • For example, the number N of data slices may be set to 5 and the number M of check slices to 3.
  • Slicing the first data then yields five data slices and three check slices, for a total of eight slice data.
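The storage cost of such a configuration can be worked out directly. The sketch below assumes that each of the N data slices holds 1/N of the original data, that each check slice has the same size as a data slice, and that padding and metadata are ignored; the function name `slice_overhead` is hypothetical.

```python
from fractions import Fraction


def slice_overhead(n_data: int, m_check: int) -> Fraction:
    """Total size of all slice data relative to the size of the original data."""
    if n_data <= 0 or m_check < 0:
        raise ValueError("need at least one data slice and a non-negative check count")
    # N data slices together equal the original; M check slices add M/N on top.
    return Fraction(n_data + m_check, n_data)
```

Under these assumptions, N = 5 and M = 3 give `slice_overhead(5, 3) == Fraction(8, 5)`, so the eight slice data together occupy 1.6 times the original size, whereas three full copies would occupy 3 times the original size.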
  • Reconstruction processing on the obtained slice data yields the data as it was before slicing.
  • When some slice data is damaged, the undamaged slice data can still be reconstructed to obtain the data as it was before slicing, as long as the number of damaged slice data does not exceed M, the preset number of check slices. Slicing data according to the erasure code storage manner therefore provides fault tolerance and improves the reliability of data storage.
  • For example, data A is sliced according to the erasure code storage manner, and the obtained slice data includes 5 data slices and 3 check slices. The 5 data slices are data slice A, data slice B, data slice C, data slice D, and data slice E; the 3 check slices are check slice A, check slice B, and check slice C.
  • Data slice B, data slice C, data slice D, data slice E, check slice B, and check slice C (that is, the slice data without data slice A and check slice A) can be reconstructed to obtain data A.
  • the data storage method is not limited to the above-described erasure code storage mode, and other storage methods that can reconstruct the data according to the slice data of one data are feasible, and are not limited herein.
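To make the slice-and-reconstruct idea concrete, the sketch below uses a single XOR check slice (N data slices plus M = 1 check slice). This is a deliberately simpler technique than the Reed-Solomon code mentioned above: it tolerates only one missing slice. All function names are hypothetical.

```python
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_slices(data: bytes, n: int) -> List[bytes]:
    """Split data into n equal data slices (zero padded) plus one XOR check slice."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\0")
    slices = [padded[i * size:(i + 1) * size] for i in range(n)]
    check = bytes(size)
    for piece in slices:
        check = xor_bytes(check, piece)
    return slices + [check]


def rebuild(slices: List[Optional[bytes]], length: int) -> bytes:
    """Recover the original data when at most one slice (marked None) is missing."""
    missing = [i for i, piece in enumerate(slices) if piece is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one missing slice")
    if missing:
        present = [piece for piece in slices if piece is not None]
        recovered = bytes(len(present[0]))
        # XOR of all surviving slices equals the missing slice.
        for piece in present:
            recovered = xor_bytes(recovered, piece)
        slices = list(slices)
        slices[missing[0]] = recovered
    # The last slice is the check slice; the rest concatenate to the padded data.
    return b"".join(slices[:-1])[:length]
```

The original length is passed to `rebuild` so that the zero padding added during slicing can be stripped; a real system would record it as metadata.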
  • The first allocation policy may be: determining a first data center for storing each slice data according to at least one of the following: the data centers that have previously been allocated as the first data center, the survival state of each data center in the storage system, the remaining storage space of each data center in the storage system, and the load pressure of each data center in the storage system.
  • the basis for determining the first data center is: the data center that has been allocated as the first data center before.
  • the target data center can record the identity of each of the determined first data centers.
  • The first implementation is a centralized policy: when the target data center determines the first data center, it may select the same data centers as the first data centers determined last time.
  • For example, the storage system includes data center 1, data center 2, data center 3, data center 4, and data center 5. If the first data centers recorded by the target data center as determined last time are data center 1 and data center 2, data center 1 and data center 2 are again selected as the first data centers.
  • If a first data center determined last time has failed, or its storage space is full or insufficient to store the slice data, it can no longer serve as a first data center. The target data center can then re-determine the first data center using the other bases for determining the first data center.
  • the second implementation manner is a decentralized strategy, that is, when the target data center determines the first data center, the data center different from the previously determined first data center may be selected. It can be divided into two cases: in the first case, a data center different from the first data center determined last time can be selected; in the second case, data different from the first data center determined many times before can be selected. center.
  • the target data center when determining the first data center, may select a data center different from the first data center determined last time. That is to say, the determined first data center is only required to be different from the first data center determined last time.
  • For example, the storage system includes data center 1, data center 2, data center 3, data center 4, and data center 5, and the last determined first data centers recorded by the target data center are data center 1 and data center 2. When the target data center determines the first data center, it can then select from data center 3, data center 4, and data center 5.
  • In the second case, when the target data center determines the first data center, it may select a data center different from the first data centers determined over the previous several times, where the number of previous times considered can be customized.
  • the target data center may adopt a rotation strategy when determining the first data center.
  • An identifier sorting table may be preset for the data centers included in the storage system. The sorting table records the order of the identifiers corresponding to the data centers in the storage system, and the sorting rule can be customized.
  • First data centers may then be allocated according to the identifier sorting table: the identifier of the first data center determined each time is the identifier that follows, in the sorting table, the identifier of the first data center determined last time.
  • the first data center may be re-determined.
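  • The rotation strategy over an identifier sorting table can be sketched as follows. This is only an illustration under stated assumptions: the function name `next_first_data_centers`, the identifiers `dc1` to `dc5`, and the wrap-around at the end of the table are hypothetical choices, not details given in the source.

```python
def next_first_data_centers(sorted_ids, last_id, count):
    """Pick `count` identifiers that follow `last_id` in the sorting table.

    Assumption: the table is treated as circular, so the selection wraps
    around to the first identifier after the last one.
    """
    if count > len(sorted_ids):
        raise ValueError("not enough data centers in the sorting table")
    # Start right after the identifier chosen last time (or at the top
    # of the table if nothing has been chosen yet).
    start = sorted_ids.index(last_id) + 1 if last_id in sorted_ids else 0
    return [sorted_ids[(start + i) % len(sorted_ids)] for i in range(count)]


table = ["dc1", "dc2", "dc3", "dc4", "dc5"]
picked = next_first_data_centers(table, "dc2", 2)  # the two entries after dc2
```

  • A real system would presumably also skip entries that are faulty or full, which is where the other bases for determination come in.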
  • the basis for determining the first data center is: the survival state of each data center in the storage system.
  • the survival status of the data center is: the fault state or the intact state of the data center.
  • Each data center in the storage system can obtain the survival status of other data centers.
  • The survival status of the other data centers in the storage system can also be known in real time. In this way, when the target data center determines the first data center, it can promptly exclude faulty data centers and determine the first data center only from the intact data centers.
  • the basis for determining the first data center is: the remaining storage space of each data center in the storage system.
  • The data centers can be sorted by remaining storage space, and a data center with more remaining storage space can be preferentially determined as the first data center. After each determination of the first data center, the target data center can update the ordering of the remaining storage space of the data centers, and the new ordering can be used as the basis for the next determination. In this way, the remaining storage space of the data centers in the storage system can be balanced.
  • the basis for determining the first data center is: the load pressure of each data center in the storage system.
  • the load pressure of each data center included in the storage system can be sorted, and for a data center with a small load pressure, it can be preferentially determined as the first data center.
  • the target data center can update the load pressure order of each data center after each determination of the first data center, and the obtained new load pressure order can be used as the basis for determining the first data center next time. In this way, the load pressure of each data center in the storage system can be balanced.
  • The determined first data center must have sufficient remaining storage space to store the corresponding slice data.
  • Each of the four bases above may be used on its own by the target data center to determine the first data center, or any two, three, or all four of them may be combined. No limitation is imposed here.
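  • One way of combining survival status, remaining storage space, and load pressure is sketched below. The `DataCenter` record, its field names, the 0.0 to 1.0 load scale, and the tie-breaking order (space first, then load) are all assumptions made for illustration; the source does not prescribe how the bases are weighted.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    alive: bool        # survival status: True means intact
    free_bytes: int    # remaining storage space
    load: float        # load pressure, assumed scale 0.0 (idle) to 1.0


def choose_first_data_centers(centers, count, slice_size):
    """Return `count` intact data centers that can hold one slice each."""
    # Exclude faulty data centers and those without room for a slice.
    candidates = [dc for dc in centers if dc.alive and dc.free_bytes >= slice_size]
    # Prefer more remaining space; break ties by lower load pressure.
    candidates.sort(key=lambda dc: (-dc.free_bytes, dc.load))
    if len(candidates) < count:
        raise RuntimeError("not enough usable data centers")
    return candidates[:count]


centers = [
    DataCenter("dc1", True, 500, 0.2),
    DataCenter("dc2", False, 900, 0.1),   # faulty, so never selected
    DataCenter("dc3", True, 800, 0.7),
    DataCenter("dc4", True, 800, 0.3),
    DataCenter("dc5", True, 10, 0.0),     # too little space for a slice
]
chosen = choose_first_data_centers(centers, 2, slice_size=100)
```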
  • determining, according to the preset first allocation policy, the first data center for storing each slice data from the storage system (S103) may include the following steps.
  • a first data center is assigned to each slice data, wherein the first data center allocated by each slice data is different.
  • The slice data is allocated one-to-one to the first data centers. Therefore, the number of data centers determined from the storage system as first data centers equals the number of pieces of slice data into which the first data is divided.
  • the first data is sliced, and the number of slice data that the first data is divided into can be obtained, and then the number of the first data center can be determined.
  • the target data center performs slice processing on the first data and divides it into four pieces of slice data.
  • four data centers need to be determined from the storage system as the first data center.
  • determining the same number of first data centers as the slice data may be according to a preset first allocation policy.
  • A first data center can be allocated to each piece of slice data, and the first data centers allocated to different pieces of slice data are different.
  • For example, the target data center slices the first data into four pieces of slice data (slice data 1, slice data 2, slice data 3, and slice data 4) and determines four data centers from the storage system as first data centers: first data center A, first data center B, first data center C, and first data center D. Slice data 1 can be allocated to first data center A, slice data 2 to first data center B, slice data 3 to first data center C, and slice data 4 to first data center D.
  • the one-to-one allocation strategy may be based on the amount of data of each slice data and the remaining storage space of each of the first data centers.
  • The slice data is sorted by data volume to form a first sequence, and the first data centers are sorted by remaining storage space to form a second sequence; the first sequence and the second sequence then correspond one-to-one in order.
  • That is, the first piece of slice data in the first sequence corresponds to the first data center in the second sequence, the second piece of slice data corresponds to the second data center in the second sequence, and so on.
  • For example, sorting the slice data in descending order of data volume gives the first sequence: slice data 1, slice data 2, slice data 3. Sorting the first data centers in descending order of remaining storage space gives the second sequence: first data center C, first data center B, first data center A. According to this correspondence, slice data 1 is allocated to first data center C, slice data 2 to first data center B, and slice data 3 to first data center A.
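  • The pairing of the two sorted sequences can be sketched in a few lines. The function name `pair_slices_with_centers` and the concrete sizes are hypothetical; only the ordering logic follows the example above.

```python
def pair_slices_with_centers(slice_sizes, free_space):
    """Map each slice to one first data center, largest slice to most space.

    slice_sizes: {slice name: data volume}
    free_space:  {data center name: remaining storage space}
    """
    if len(slice_sizes) != len(free_space):
        raise ValueError("one-to-one allocation needs equal counts")
    # First sequence: slices in descending order of data volume.
    first_sequence = sorted(slice_sizes, key=slice_sizes.get, reverse=True)
    # Second sequence: data centers in descending order of remaining space.
    second_sequence = sorted(free_space, key=free_space.get, reverse=True)
    return dict(zip(first_sequence, second_sequence))


plan = pair_slices_with_centers(
    {"slice1": 30, "slice2": 20, "slice3": 10},
    {"A": 100, "B": 200, "C": 300},
)
```

  • Sorting the data centers by ascending load pressure instead of descending remaining space would give the load-based variant of the same pairing.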
  • the one-to-one allocation strategy may also be based on the amount of data of each slice data and the load pressure of each first data center.
  • The slice data is sorted by data volume to form a third sequence, and the first data centers are sorted by load pressure to form a fourth sequence; the third sequence and the fourth sequence then correspond one-to-one in order.
  • That is, the first piece of slice data in the third sequence corresponds to the first data center in the fourth sequence, the second piece of slice data corresponds to the second data center in the fourth sequence, and so on.
  • This strategy is similar to the above-mentioned one-to-one allocation strategy, and will not be described here.
  • the manner of allocating a first data center for each slice data is not limited to the above two types, and may also include other allocation modes, which are not limited herein.
  • In the technical solution provided by this embodiment, the first data is acquired and stored locally; the slice data of the first data is determined according to a preset data storage manner; the first data centers for storing each piece of slice data are determined from the storage system according to the preset first allocation policy; and each piece of slice data is sent to its corresponding first data center, so that each first data center stores the slice data it receives.
  • With this technical solution, in addition to storing the data locally, the data may be sliced to obtain slice data, and the slice data is stored separately in the selected first data centers. This not only ensures the security and reliability of data storage but, because the slice data is stored separately after slicing, also saves storage space and reduces the storage space consumed by the data centers in the storage system.
  • the embodiment of the present application further provides a data storage method. As shown in FIG. 2 , the method may further include the following steps after step S104 of the foregoing embodiment.
  • the storage device may be a device for storing data, such as a hard disk, a magnetic disk, or the like.
  • each data center can include multiple storage devices, and each data center can decide which storage device to store the data in.
  • Local storage in this application refers to the storage of the target data center.
  • The first data may be stored in a single storage device of the target data center, or spread across multiple storage devices. When it is stored in one storage device, that storage device is checked for faults; when it is stored in multiple storage devices, each of them needs to be checked.
  • If the stored data cannot be read from a storage device, the storage device may be considered faulty.
  • When the storage device is not faulty, the target data center can read data from it: on receiving an instruction to extract the first data, the target data center reads the first data directly from the storage device that stores it and sends the first data to the location or device indicated by the instruction.
  • the first data center corresponds to the slice data, and the first data center stores the slice data, and the extracted slice data belongs to the same data, that is, the first data in this embodiment.
  • the target data center extracts slice data from each of the first data centers and stores the extracted slice data locally.
  • The target data center may extract the slice data through a communication network or through a SAS network.
  • When the slice data is extracted through a communication network, the network connects the data centers in the storage system, and the extraction proceeds as follows: each first data center that stores slice data sends that slice data to the target data center, and the target data center stores the received slice data locally.
  • When the slice data is extracted through a SAS network, the SAS network interconnects the data centers in the storage system. Owing to the characteristics of the SAS network, each data center can directly read data from, or write data to, other data centers. The extraction therefore proceeds as follows: the target data center reads each piece of slice data from the first data center that stores it and stores the read slice data locally.
  • the target data center may reconstruct the sliced data to obtain the first data.
  • the data storage mode may be an erasure code storage mode, that is, the extracted slice data may be reconstructed according to the erasure code storage manner to obtain the first data.
  • the slice data of the first data includes: slice data 1, slice data 2, and slice data 3.
  • the target data center respectively extracts slice data 1, slice data 2, and slice data 3, according to the erasure code storage mode,
  • the slice data 1, the slice data 2, and the slice data 3 are subjected to reconstruction processing, and the first data can be obtained.
  • After step S104, the following steps may further be included:
  • Step 2: detecting whether the storage device that stores the first data locally is faulty and, if not, performing Step 3 below;
  • The first client may be a device external to the storage system. The first client may request to extract data from any data center in the storage system; the requested data may be stored in the data center that receives the extraction request, or in data centers other than the one that receives the request.
  • Take the target data center as the data center that receives the extraction request. The target data center first determines whether the data that the first client needs to extract is stored locally. If it is stored locally, Step 2 below can be performed; if it is not stored locally, there are two cases.
  • In the first case, the data to be extracted is stored in another data center, and the target data center can extract the data from that data center;
  • In the second case, no other data center stores the data itself, only its slice data; the target data center can extract the slice data from the data centers that store it and reconstruct the slice data into the data to be extracted.
  • the first data that the first client needs to extract is stored in the target data center.
  • Step 2 The target data center determines a storage location where the first data is stored locally, that is, determines which storage device the first data is stored in the target data center.
  • The storage device storing the first data is then checked for faults, that is, whether the stored first data can be read from the storage device.
  • If the storage device is faulty, the target data center cannot directly send the stored first data to the first client.
  • the first client may request to extract the first data from any other first data center.
  • In the storage system, the data centers are associated with each other. The first client can therefore obtain, from any data center, the identifiers of the data centers that store the slice data of the first data, that is, the identifiers of the first data centers. The first client can then send a request to extract the first data to any first data center. The first data center that receives the request extracts the slice data of the first data locally and from the other first data centers, reconstructs the first data from the extracted slice data according to the data storage manner, and sends the reconstructed first data to the first client.
  • Alternatively, the target data center may extract the slice data of the first data from the first data centers. After the slice data has been extracted to the target data center, the target data center may reconstruct the first data from it according to the data storage manner and send the reconstructed first data to the first client.
  • Step 3 If the storage device is not faulty, the target data center may directly obtain the first data from the local area, and send the obtained first data to the first client.
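  • The read path described in these steps (serve the local copy when the storage device is healthy, otherwise fall back to the slice data) can be sketched as below. The four callables are hypothetical stand-ins for the fault check, the local read, the slice extraction, and the reconstruction; plain concatenation is used as a placeholder for the real reconstruction.

```python
def serve_first_data(device_is_faulty, read_local, fetch_slices, reconstruct):
    """Return the first data for a client request.

    All four arguments are callables supplied by the caller; their names
    are illustrative assumptions, not interfaces defined by the source.
    """
    if not device_is_faulty():
        # Step 3: the local storage device works, so read the data directly.
        return read_local()
    # Otherwise extract the slice data from the first data centers and
    # rebuild the first data from it.
    return reconstruct(fetch_slices())


def unreadable():
    # Simulates a local read against a faulty storage device.
    raise OSError("storage device fault")


healthy = serve_first_data(lambda: False, lambda: b"video", lambda: [], b"".join)
recovered = serve_first_data(lambda: True, unreadable, lambda: [b"vi", b"deo"], b"".join)
```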
  • the slice data of the second data may also be stored locally in the target data center, that is, the target data center stores only the slice data of the second data locally, but does not store the second data.
  • Step 1: On a request to extract the second data, because the second data itself is not stored locally in the target data center, it cannot be extracted directly from the target data center.
  • the second data can be obtained by extracting the slice data of the second data.
  • the target data center may determine other data centers storing the slice data of the second data from the storage system, and use the determined data center as the second data center.
  • For example, the slice data of the second data includes slice data 1, slice data 2, slice data 3, and slice data 4, where slice data 1 is stored in the target data center, slice data 2 in data center A, slice data 3 in data center B, and slice data 4 in data center C. The target data center may then determine that data center A, data center B, and data center C store slice data of the second data and use them as the second data centers.
  • Step 2: After determining the second data centers, the target data center may extract the corresponding slice data of the second data from each second data center, reconstruct the second data from the extracted slice data according to the data storage manner, and send the reconstructed second data to the second client.
  • For example, the slice data of the second data includes slice data 1, slice data 2, and slice data 3, where slice data 1 is stored in the target data center, slice data 2 in data center A, and slice data 3 in data center B. The target data center may extract slice data 2 from data center A and slice data 3 from data center B, reconstruct the second data from the locally stored slice data 1 and the extracted slice data 2 and slice data 3 according to the erasure code storage manner, and send the reconstructed second data to the second client.
  • With the technical solution provided by this embodiment, in addition to storing the data locally, the data may be sliced to obtain slice data, and the slice data is stored separately in the selected first data centers. This not only ensures the security and reliability of data storage but, because the slice data is stored separately after slicing, also saves storage space and reduces the storage space consumed by the data centers in the storage system.
  • The embodiment of the present application further provides a data storage device, applied to a target data center in a storage system that includes at least two data centers. The device includes:
  • the obtaining module 310 is configured to acquire the first data and store the first data locally;
  • the first determining module 320 is configured to determine the slice data of the first data according to a preset data storage manner, where the data storage manner is a storage manner in which data can be reconstructed from its slice data;
  • the second determining module 330 is configured to determine, according to the preset first allocation policy, a first data center for storing each slice data from the storage system;
  • the first sending module 340 is configured to separately send each slice data to a corresponding first data center.
  • With the technical solution provided by this embodiment, in addition to storing the data locally, the data may be sliced to obtain slice data, and the slice data is stored separately in the selected first data centers. This not only ensures the security and reliability of data storage but, because the slice data is stored separately after slicing, also saves storage space and reduces the storage space consumed by the data centers in the storage system.
  • the second determining module 330 includes:
  • a first determining submodule configured to determine, from the storage system and according to the preset first allocation policy, the same number of first data centers as there are pieces of slice data;
  • the first allocation submodule is configured to allocate a first data center for each slice data, wherein the first data center allocated by each slice data is different.
  • the second determining module 330 includes:
  • a sorting sub-module configured to sort the data centers included in the storage system according to the preset first allocation policy
  • a second determining submodule configured to determine, according to the ordering, a first data center, where a sum of remaining storage spaces of the determined first data center is greater than a sum of data amounts of the determined slice data
  • the second allocation sub-module is configured to allocate the determined slice data to each first data center according to the remaining storage space of each first data center, so that each slice data is allocated a first data center.
  • the data storage manner may be: an erasure code storage manner.
  • With the technical solution provided by this embodiment, in addition to storing the data locally, the data may be sliced to obtain slice data, and the slice data is stored separately in the selected first data centers. This not only ensures the security and reliability of data storage but, because the slice data is stored separately after slicing, also saves storage space and reduces the storage space consumed by the data centers in the storage system.
  • the embodiment of the present application further provides a data storage device.
  • the device may also include:
  • a first extraction module 410 configured to extract slice data of the first data from each first data center when the storage device that stores the first data locally fails;
  • the first reconstruction module 420 is configured to reconstruct the first data by using the extracted slice data according to a data storage manner.
  • the device may further include:
  • a first receiving module configured to receive a request for extracting the first data sent by the first client
  • a detecting module configured to detect whether the storage device storing the first data in the target data center is faulty, and if so, triggering the first extraction module 410;
  • the second sending module is configured to send the locally stored first data to the first client when the detection result of the detecting module is negative.
  • the slice data of the second data is stored locally; the device may further include:
  • a second receiving module configured to receive a request for extracting second data sent by the second client
  • a third determining module configured to determine a data center storing the slice data of the second data as the second data center
  • a second extraction module configured to extract slice data of the second data from each of the second data centers
  • a second reconstruction module configured to reconstruct the second data by using the sliced data of the extracted second data according to a data storage manner
  • a third sending module configured to send the reconstructed second data to the second client.
  • With the technical solution provided by this embodiment, in addition to storing the data locally, the data may be sliced to obtain slice data, and the slice data is stored separately in the selected first data centers. This not only ensures the security and reliability of data storage but, because the slice data is stored separately after slicing, also saves storage space and reduces the storage space consumed by the data centers in the storage system.
  • the embodiment of the present application further provides an electronic device, as shown in FIG. 5, including a processor 510, and a memory 520;
  • a memory 520 configured to store a computer program
  • the processor 510 is configured to perform the following steps when executing the program stored on the memory 520:
  • where the data storage manner is a storage manner in which data can be reconstructed from its slice data;
  • Each slice data is separately sent to a corresponding first data center.
  • With the technical solution provided by this embodiment, in addition to storing the data locally, the data may be sliced to obtain slice data, and the slice data is stored separately in the selected first data centers. This not only ensures the security and reliability of data storage but, because the slice data is stored separately after slicing, also saves storage space and reduces the storage space consumed by the data centers in the storage system.
  • The electronic device provided by this embodiment of the present application may further perform the data storage method of any one of the foregoing embodiments; for details, see the embodiments corresponding to FIG. 1 and FIG. 2.
  • the communication bus mentioned in the above electronic device may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus.
  • the communication bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in the figure, but it does not mean that there is only one bus or one type of bus.
  • the communication interface is used for communication between the above electronic device and other devices.
  • the memory may include a random access memory (RAM), and may also include a non-volatile memory (NVM), such as at least one disk storage.
  • the memory may also be at least one storage device located away from the aforementioned processor.
  • The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; or it may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA).


Abstract

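A data storage method and device, applied to a target data center in a storage system that includes at least two data centers. The method includes: acquiring first data and storing the first data locally (S101); determining slice data of the first data according to a preset data storage manner (S102); determining, from the storage system and according to a preset first allocation policy, the first data centers for storing each piece of slice data (S103); and sending each piece of slice data to its corresponding first data center (S104). In addition to storing the data locally, the method slices the data to obtain slice data and stores the slice data in the selected first data centers, which not only ensures the security and reliability of data storage but also saves storage space and reduces the storage space consumed by the data centers in the storage system.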

Description

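Data storage method and device
This application claims priority to Chinese patent application No. 201810193321.6, entitled "Data storage method and device" and filed with the Chinese Patent Office on March 9, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of data storage, and in particular to a data storage method and device.
Background
As computer applications grow in scale, the demand for data storage increases day by day. A storage system may include multiple data centers; a data center is a facility, built on a computer network, for transferring, accelerating, displaying, computing, and storing data.
At present, in addition to storing data, storage systems use replica technology to back up the data. That is, when data is stored in such a storage system, the data and multiple replicas of it are stored in different data centers, which ensures the security and reliability of data storage.
However, a replica has the same data volume as the data itself, so a replica occupies as much storage space in a data center as the original data. Therefore, although replicas improve data reliability, they also occupy a large amount of storage space, so the data centers in the storage system consume considerable storage space.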
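Summary
This application provides a data storage method and device to solve the problem that data centers in a storage system consume a large amount of storage space. The specific technical solutions are as follows:
In a first aspect, an embodiment of this application provides a data storage method, applied to a target data center in a storage system, where the storage system includes at least two data centers. The method includes:
acquiring first data and storing the first data locally;
determining slice data of the first data according to a preset data storage manner, where the data storage manner is a storage manner in which data can be reconstructed from its slice data;
determining, from the storage system and according to a preset first allocation policy, the first data centers for storing each piece of slice data;
sending each piece of slice data to its corresponding first data center.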
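Optionally, determining, from the storage system and according to the preset first allocation policy, the first data centers for storing each piece of slice data includes:
obtaining the number of pieces of slice data;
determining, from the storage system and according to the preset first allocation policy, that number of first data centers;
allocating one first data center to each piece of slice data, where the first data centers allocated to different pieces of slice data are different.
Optionally, determining, from the storage system and according to the preset first allocation policy, the first data centers for storing each piece of slice data includes:
sorting the data centers included in the storage system according to the preset first allocation policy;
determining the first data centers according to the sorting, where the sum of the remaining storage space of the determined first data centers is greater than the sum of the data volumes of the determined slice data;
allocating the determined slice data to the first data centers according to the remaining storage space of each first data center, so that each piece of slice data is allocated one first data center.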
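Optionally, the data storage manner is an erasure code storage manner.
Optionally, after sending each piece of slice data to its corresponding first data center, the method further includes:
when the storage device that stores the first data locally is faulty, extracting the slice data of the first data from each first data center;
reconstructing the first data from the extracted slice data according to the data storage manner.
Optionally, after sending each piece of slice data to its corresponding first data center, the method further includes:
receiving a request, sent by a first client, to extract the first data;
detecting whether the storage device that stores the first data locally is faulty;
if not, sending the locally stored first data to the first client.
Optionally, slice data of second data is stored locally, and the method further includes:
receiving a request, sent by a second client, to extract the second data;
determining the data centers that store slice data of the second data as second data centers;
extracting the slice data of the second data from each second data center;
reconstructing the second data from the extracted slice data of the second data according to the data storage manner;
sending the reconstructed second data to the second client.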
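In a second aspect, an embodiment of this application provides a data storage device, applied to a target data center in a storage system, where the storage system includes at least two data centers. The device includes:
an obtaining module, configured to acquire first data and store the first data locally;
a first determining module, configured to determine slice data of the first data according to a preset data storage manner, where the data storage manner is a storage manner in which data can be reconstructed from its slice data;
a second determining module, configured to determine, from the storage system and according to a preset first allocation policy, the first data centers for storing each piece of slice data;
a first sending module, configured to send each piece of slice data to its corresponding first data center.
Optionally, the second determining module includes:
an obtaining submodule, configured to obtain the number of pieces of slice data;
a first determining submodule, configured to determine, from the storage system and according to the preset first allocation policy, that number of first data centers;
a first allocation submodule, configured to allocate one first data center to each piece of slice data, where the first data centers allocated to different pieces of slice data are different.
Optionally, the second determining module includes:
a sorting submodule, configured to sort the data centers included in the storage system according to the preset first allocation policy;
a second determining submodule, configured to determine the first data centers according to the sorting, where the sum of the remaining storage space of the determined first data centers is greater than the sum of the data volumes of the determined slice data;
a second allocation submodule, configured to allocate the determined slice data to the first data centers according to the remaining storage space of each first data center, so that each piece of slice data is allocated one first data center.
Optionally, the data storage manner is an erasure code storage manner.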
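Optionally, the device further includes:
a first extraction module, configured to extract the slice data of the first data from each first data center when the storage device that stores the first data locally is faulty;
a first reconstruction module, configured to reconstruct the first data from the extracted slice data according to the data storage manner.
Optionally, the device further includes:
a first receiving module, configured to receive a request, sent by a first client, to extract the first data;
a detecting module, configured to detect whether the storage device that stores the first data in the target data center is faulty and, if so, to trigger the first extraction module;
a second sending module, configured to send the locally stored first data to the first client when the detection result of the detecting module is negative.
Optionally, slice data of second data is stored locally, and the device further includes:
a second receiving module, configured to receive a request, sent by a second client, to extract the second data;
a third determining module, configured to determine the data centers that store slice data of the second data as second data centers;
a second extraction module, configured to extract the slice data of the second data from each second data center;
a second reconstruction module, configured to reconstruct the second data from the extracted slice data of the second data according to the data storage manner;
a third sending module, configured to send the reconstructed second data to the second client.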
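In a third aspect, an embodiment of this application provides an electronic device, including a processor and a memory;
the memory is configured to store a computer program;
the processor is configured to implement the steps of any of the data storage methods described above when executing the program stored in the memory.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of any of the data storage methods described above.
In the technical solution provided by the embodiments of this application, first data is acquired and stored locally; slice data of the first data is determined according to a preset data storage manner; the first data centers for storing each piece of slice data are determined from the storage system according to a preset first allocation policy; and each piece of slice data is sent to its corresponding first data center. With this technical solution, in addition to storing the data locally, the data can be sliced to obtain slice data, and the slice data is stored separately in the selected first data centers. This not only ensures the security and reliability of data storage but, because the slice data is stored separately after slicing, also saves storage space and reduces the storage space consumed by the data centers in the storage system.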
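Brief Description of the Drawings
To describe the technical solutions of the embodiments of this application and of the prior art more clearly, the drawings needed for the embodiments and the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a data storage method provided by an embodiment of this application;
FIG. 2 is another flowchart of a data storage method provided by an embodiment of this application;
FIG. 3 is a schematic structural diagram of a data storage device provided by an embodiment of this application;
FIG. 4 is another schematic structural diagram of a data storage device provided by an embodiment of this application;
FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of this application.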
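Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
To save storage space in the data centers of a storage system and thereby reduce storage space consumption, the embodiments of this application provide a data storage method and device, applied to a target data center in a storage system that includes at least two data centers. The method includes:
acquiring first data and storing the first data locally;
determining slice data of the first data according to a preset data storage manner, where the data storage manner is a storage manner in which data can be reconstructed from its slice data;
determining, from the storage system and according to a preset first allocation policy, the first data centers for storing each piece of slice data;
sending each piece of slice data to its corresponding first data center.
With the technical solution provided by the embodiments of this application, in addition to storing the data locally, the data can be sliced to obtain slice data, and the slice data is stored separately in the selected first data centers. This not only ensures the security and reliability of data storage but, because the slice data is stored separately after slicing, also saves storage space and reduces the storage space consumed by the data centers in the storage system.
A data storage method provided by an embodiment of this application is introduced first. The method is applied to a target data center in a storage system, and the storage system includes at least two data centers.
A data center may include one storage device or at least two storage devices, and may store data in the storage devices it contains. On the one hand, a storage device may be a machine such as a storage server, and each storage server may include multiple hard disks; on the other hand, a storage device may be a hard disk, hard disks may be mounted on a machine, and each machine may mount multiple hard disks.
When a data center includes at least two storage devices, the data center itself may decide which storage device stores the data. The policy by which a data center chooses the storage device may be customized, and the customized policies of different data centers may differ. In addition, a data center may record the storage location of each piece of data.
One of the at least two data centers in the storage system may be determined as the target data center according to a preset policy. The preset policy may be based on at least one of the following: the data centers previously allocated as the target data center, the survival status of each data center in the storage system, the remaining storage space of each data center in the storage system, and the load pressure of each data center in the storage system.
This allocation policy is similar to the first allocation policy below, which is described in detail later and is not repeated here.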
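A data storage method provided by an embodiment of this application is described with reference to FIG. 1. As shown in FIG. 1, the data storage method includes the following steps.
S101: acquire first data and store the first data locally.
The first data is the data to be stored. The storage system may use direct storage. In direct storage, the target data center may acquire the first data as follows: the client sends the first data to be stored directly to the target data center, and after receiving it the target data center may store the first data locally, that is, in a storage device included in the target data center.
For example, the client is a monitoring device and the video data it captures is the first data. The monitoring device sends the video data directly to the target data center, and after receiving the video data the target data center may store it in one of its storage devices, that is, locally in the target data center.
The storage system may also use relayed storage. In relayed storage, the storage system may further include an access node, which is connected to clients outside the storage system and receives the data they send.
The target data center may then acquire the first data as follows: the access node receives the first data sent by the client and forwards it to the target data center, so that the target data center receives the first data sent by the client.
When the target data center stores the first data locally, the target data center itself may select the storage device used to store the first data.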
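S102: determine slice data of the first data according to a preset data storage manner.
The data storage manner is a storage manner in which data can be reconstructed from its slice data. That is, the data can be sliced as required by the data storage manner to obtain slice data.
In addition, the multiple pieces of slice data obtained according to the data storage manner can be reconstructed to recover the data as it was before slicing. When some of the pieces of slice data are damaged, the undamaged slice data can still be reconstructed to recover the data as it was before slicing, as long as the number of damaged pieces is less than or equal to the number preset in the data storage manner.
For example, data A is sliced as required by the data storage manner into slice data 1, slice data 2, slice data 3, and slice data 4; reconstructing slice data 1, slice data 2, slice data 3, and slice data 4 then yields data A. If the data storage manner allows reconstruction from the undamaged slice data whenever no more than 2 pieces are damaged, then when slice data 4 is damaged, slice data 1, slice data 2, and slice data 3 can be reconstructed to obtain data A.
When the first data is sliced, it may be divided into a preset number of pieces of slice data, where the preset number may be customized in advance.
For example, if the preset number is set to 5, slicing the first data as required by the data storage manner yields 5 pieces of slice data.
The preset number may also be determined by the preset data storage manner itself. For example, if the preset data storage manner itself sets the preset number to 3, slicing data as required by that data storage manner yields 3 pieces of slice data.
The preset data storage manner may be an erasure code storage manner, which is described in detail in the implementations below and is not detailed here.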
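S103: determine, from the storage system and according to a preset first allocation policy, the first data centers for storing each piece of slice data.
The first allocation policy may determine the first data centers for storing each piece of slice data according to at least one of the following: the data centers previously allocated as first data centers, the survival status of each data center in the storage system, the remaining storage space of each data center in the storage system, and the load pressure of each data center in the storage system. Besides these four kinds of information, the first data centers may also be determined according to other information, which is not limited here.
The first allocation policy is described in detail in the implementations below and is not detailed here.
When the first data centers are determined from the storage system according to the preset first allocation policy, both the number of first data centers and the first data center in which each piece of slice data is to be stored can be determined.
For example, the first data has four pieces of slice data: slice data 1, slice data 2, slice data 3, and slice data 4. According to the preset first allocation policy, four first data centers can be determined: data center A, data center B, data center C, and data center D. Slice data 1 is allocated to data center A for storage, slice data 2 to data center B, slice data 3 to data center C, and slice data 4 to data center D.
The first data centers can be determined from the storage system in two ways.
First case: the number of determined first data centers equals the number of pieces of slice data, each data center stores one piece of slice data, and the pieces stored by different data centers are different. This first case is described in detail in the third implementation below and is not detailed here.
Second case: each determined first data center may be allocated one piece of slice data or at least two pieces of slice data.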
在该第二种情况下,根据预设的第一分配策略对存储系统所包括的数据中心进行排序。例如,第一分配策略为依据剩余存储空间来确定第一数据中心时,可以按照各数据中心的剩余存储空间从大到小的顺序,对存储系统所包括的数据中心进行排序:剩余存储空间最大的数据中心排在第一个,剩余空间最小的数据中心排在最后一个。
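The first data centers are determined from the storage system according to the number of slices obtained and the data size of each slice; that is, the first data centers so determined must together have enough capacity to store the slices of the first data.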
在确定出第一数据中心之后,根据各第一数据中心的剩余存储空间情况,来确定每一切片数据所对应存储的第一数据中心。
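In this way, the slices can be distributed according to the operating capacity of each data center, so that the data centers in the storage system run in a relatively balanced manner. A first data center with relatively large remaining storage space and relatively low load pressure may be allocated more slices, whereas a first data center with relatively small remaining storage space and relatively high load pressure may be allocated fewer slices, or even only one.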
例如,第一数据的切片数据有4个,分别为:切片数据A、切片数据B、切片数据C以及切片数据D,并且,切片数据A有2M,切片数据B有2M,切片数据C有2M,切片数据D有2M。根据预设的第一分配策略,从存储系统中确定出的第一数据中心有3个,分别为:数据中心A、数据中心B和数据中心C,并且,在所确定出的3个第一数据中心中,数据中心A的剩余存储空间最大且负载压力最小,可以用来同时存储两个切片数据,而数据中心B和数据中心C可以分别存储一个切片数据,因此,将切片数据A和切片数据B分配给数据中心A进行存储,将切片数据C分配给数据中心B进行存储,将切片数据D分配给数据中心C进行存储。
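The allocation in this example can be sketched as a greedy loop that always hands the next slice to the first data center with the most remaining space. This is only an illustration: the `DataCenter` class, the `allocate_by_free_space` function and the free-space figures (9, 6 and 6 MB) are assumptions, not values from the application, and load pressure is ignored for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class DataCenter:
    # Hypothetical record of a first data center.
    name: str
    free_mb: int
    assigned: list = field(default_factory=list)


def allocate_by_free_space(slices, centers):
    """Assign every slice to the center that currently has the most free space.

    slices maps a slice name to its size in MB. A center may receive
    more than one slice (the "second case" described above).
    """
    plan = {}
    for slice_name, size_mb in slices.items():
        best = max(centers, key=lambda c: c.free_mb)  # first maximum wins ties
        if best.free_mb < size_mb:
            raise RuntimeError("no first data center can hold " + slice_name)
        best.free_mb -= size_mb
        best.assigned.append(slice_name)
        plan[slice_name] = best.name
    return plan


centers = [DataCenter("data center A", 9),
           DataCenter("data center B", 6),
           DataCenter("data center C", 6)]
slices = {"slice A": 2, "slice B": 2, "slice C": 2, "slice D": 2}
plan = allocate_by_free_space(slices, centers)
```

With these assumed figures, data center A receives slices A and B, data center B receives slice C, and data center C receives slice D, matching the example above.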
在第二种情况中,可以将所确定的切片数据存储在同一第一数据中心,当多个切片数据存储在同一第一数据中心时,可以是分别存储在该第一数据中心所包括的不同的存储设备中。
对于上述两种情况,所确定的第一数据中心可以包括目标数据中心,也就是说,在满足第一分配策略的情况下,目标数据中心也可以用来存储所分配的切片数据。此时,目标数据中心既可以在本地存储第一数据,又可以在本地存储第一数据的切片数据。
S104,将各个切片数据分别发送至各自对应的第一数据中心。
其中,发送切片数据的方式可以通过存储系统中各数据中心之间的网络进行发送,网络可以是通信网络、串行连接SCSI接口网络(Serial Attached SCSI,简称SAS)等网络中的任一种,在此不做限定。其中,SCSI(Small Computer System Interface)即为小型计算机系统接口,是一种通用接口。
将各个切片数据分别发送至各自对应的第一数据中心,此处的对应是指:切片数据被分配给所要存储的第一数据中心。
例如,切片数据1被分配给第一数据中心A,切片数据2被分配给第一数据中心B,则目标数据中心将切片数据1发送至第一数据中心A,将切片数据2发送至第一数据中心B。
各个第一数据中心接收到切片数据之后,将所接收到的切片数据进行存储,对于切片数据的存储位置,可以由各第一数据中心各自所确定。
例如,第一数据中心A包括3个存储设备,分别为:存储设备A、存储设备B和存储设备C,在第一数据中心A接收到切片数据1时,第一数据中心A可以确定将切片数据1存储至存储设备B。
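In one implementation, for step S102 above, the preset data storage mode may be an erasure coding (EC) storage mode. Slicing the first data under the erasure coding storage mode yields a preset number N of data slices, and from these N data slices a corresponding preset number M of check slices can further be derived. The number of slices obtained is the sum of the data slices and the check slices, N+M. N and M may be customized in advance.
For example, the RS (Reed-Solomon) code storage mode is an erasure coding storage mode commonly used in storage systems. Under RS coding, the number of data slices N may be set to 5 and the number of check slices M to 3; slicing the first data then yields 5 data slices and 3 check slices, 8 slices in total.
In addition, the slices obtained under the erasure coding storage mode can be reconstructed to recover the data as it was before slicing. Of the N data slices and M check slices obtained by slicing, M is the preset maximum number of damaged slices that can be tolerated.
Among the N+M slices obtained, as long as the number of damaged slices is less than or equal to M, the undamaged slices can be reconstructed into the data as it was before slicing. Slicing data under the erasure coding storage mode therefore provides fault tolerance and improves the reliability of data storage.
For example, slicing data A under the erasure coding storage mode yields 5 data slices and 3 check slices: data slices A, B, C, D and E, and check slices A, B and C. When data slice A and check slice A are corrupted, data A can be obtained by reconstructing data slices B, C, D and E together with check slices B and C.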
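To make the slice-and-reconstruct idea concrete, the following sketch uses the simplest possible erasure code: N data slices plus a single XOR check slice (M = 1). It is deliberately not Reed-Solomon, which the text names and which tolerates up to M lost slices; the function names `encode` and `reconstruct` are illustrative assumptions.

```python
def encode(data, n):
    """Split data into n equal-size data slices plus one XOR check slice."""
    size = -(-len(data) // n)                 # ceiling division
    padded = data.ljust(size * n, b"\0")      # zero-pad so slices are equal
    slices = [padded[i * size:(i + 1) * size] for i in range(n)]
    check = bytes(size)
    for s in slices:
        check = bytes(a ^ b for a, b in zip(check, s))
    return slices + [check]


def reconstruct(slices, n, length):
    """Rebuild the original data; at most one entry of slices may be None."""
    missing = [i for i, s in enumerate(slices) if s is None]
    if len(missing) > 1:
        raise ValueError("a single check slice tolerates only one lost slice")
    slices = list(slices)
    if missing:
        size = len(next(s for s in slices if s is not None))
        rebuilt = bytes(size)
        for s in slices:
            if s is not None:
                rebuilt = bytes(a ^ b for a, b in zip(rebuilt, s))
        slices[missing[0]] = rebuilt          # XOR of the survivors
    return b"".join(slices[:n])[:length]      # drop check slice and padding


data = b"surveillance video frame"
stored = encode(data, 5)      # 5 data slices + 1 check slice
damaged = list(stored)
damaged[1] = None             # simulate one lost data slice
recovered = reconstruct(damaged, 5, len(data))
```

The original length has to be kept alongside the slices so that padding can be removed after reconstruction; a production RS implementation would carry similar metadata.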
数据存储方式可以不仅限于上述纠删码存储方式,对于能够根据一个数据的切片数据重构该数据的其他的存储方式都是可行的,在此不做限定。
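In one implementation, the first allocation policy may determine the first data centers for storing the slices according to at least one of the following: the data centers previously allocated as first data centers, the liveness state of each data center in the storage system, the remaining storage space of each data center in the storage system, and the load pressure of each data center in the storage system.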
下面对上述四种所根据的信息分别进行介绍。
一、确定第一数据中心的依据为:在之前已分配过作为第一数据中心的数据中心。
目标数据中心可以记录每一次所确定的第一数据中心的标识。
第一种实现方式,为集中策略,即,目标数据中心在确定第一数据中心时,可以选取与上一次所确定的第一数据中心相同的数据中心。
例如,存储系统中包括:数据中心1、数据中心2、数据中心3、数据中心4、数据中心5,目标数据中心所记录的上一次所确定的第一数据中心为数据中心1和数据中心2,那么,目标数据中心在本次确定第一数据中心时,仍然选取数据中心1和数据中心2作为第一数据中心。
当上一次所确定出的第一数据中心出现故障、存储空间已满、剩余存储空间不足以存储切片数据等情况时,则不能再将上一次所确定出的第一数据中心再作为第一数据中心,可以通过目标数据中心确定第一数据中心的其他依据重新确定第一数据中心。
第二种实现方式,为分散策略,即,目标数据中心在确定第一数据中心时,可以选取与之前所确定的第一数据中心不同的数据中心。可以分为两种情况:第一种情况,可以选取与上一次所确定的第一数据中心不同的数据中心;第二种情况,可以选取与之前多次所确定的第一数据中心不同的数据中心。
下面就上述两种情况分别进行说明介绍。
第一种情况,目标数据中心在确定第一数据中心时,可以选取与上一次所确定的第一数据中心不同的数据中心。也就是说,所确定的第一数据中心只要与上一次所确定的第一数据中心不同即可。
例如,存储系统中包括:数据中心1、数据中心2、数据中心3、数据中心4、数据中心5,目标数据中心所记录的上一次所确定的数据中心为数据中心1和数据中心2,那么,目标数据中心在本次确定第一数据中心时,可以从数据中心3、数据中心4以及数据中心5中选取数据中心作为第一数据中心。
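In the second case, when determining the first data centers, the target data center may select data centers different from those determined as first data centers over several previous rounds. The number of previous rounds taken into account may be customized.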
在上述实现方式的基础上,为了使得切片数据在存储系统所包含的数据中心中是分散存储的,目标数据中心在确定第一数据中心时可以采用轮转策略。
可以预先设定针对存储系统所包含各数据中心的标识排序表,该标识排序表中记录存储系统所包含的数据中心所对应标识的排序,排序的规则可以自定义设定。目标数据中心在每一次确定第一数据中心时,均可以按照该标识排序表进行分配,每一次所确定的第一数据中心的标识为:上一次所确定的第一数据中心所对应标识在该标识排序表中的下一标识。
当所确定的第一数据中心的剩余存储空间不足以存储切片数据时,可以重新确定第一数据中心。
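The rotation strategy can be sketched as a walk through the identifier ordering table that starts just after the last chosen identifier and skips any data center without enough free space. The function name, its parameters and the sample figures are hypothetical; the application does not prescribe how re-selection is carried out.

```python
def next_first_data_centers(order, last_chosen, count, free_space, needed):
    """Pick `count` data centers by rotating through the ordering table.

    order       -- identifier ordering table (list of identifiers)
    last_chosen -- identifier chosen last in the previous round, or None
    free_space  -- mapping from identifier to remaining space
    needed      -- space required per slice
    """
    start = (order.index(last_chosen) + 1) % len(order) if last_chosen in order else 0
    picked = []
    for step in range(len(order)):
        dc = order[(start + step) % len(order)]
        if free_space[dc] >= needed:      # skip centers that cannot hold a slice
            picked.append(dc)
        if len(picked) == count:
            return picked
    raise RuntimeError("not enough data centers with sufficient free space")


order = ["DC1", "DC2", "DC3", "DC4", "DC5"]
free_space = {"DC1": 10, "DC2": 10, "DC3": 1, "DC4": 10, "DC5": 10}
round_one = next_first_data_centers(order, "DC2", 2, free_space, 2)
round_two = next_first_data_centers(order, round_one[-1], 2, free_space, 2)
```

Here DC3 is skipped in the first round because its free space is below the assumed slice size, and the second round wraps around to the start of the table.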
二、确定第一数据中心的依据为:存储系统中各数据中心的存活状态。
其中,数据中心的存活状态为:数据中心的故障状态或者完好状态。
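Each data center in the storage system can obtain the liveness state of the other data centers, so the target data center also knows, in real time, the liveness state of the other data centers in the storage system. When determining the first data centers, the target data center can therefore promptly exclude failed data centers and choose the first data centers only from healthy ones.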
三、确定第一数据中心的依据为:存储系统中各数据中心的剩余存储空间。
可以针对各数据中心的剩余存储空间进行排序,对于剩余存储空间较大的数据中心,可以优先将其确定为第一数据中心。这样,目标数据中心可以在每进行一次第一数据中心的确定之后,对各数据中心的剩余存储空间的排序进行更新,所得到的新的剩余存储空间排序可以作为下一次确定第一数据中心的依据。这样,可以均衡存储系统中的各数据中心的剩余存储空间。
四、确定第一数据中心的依据为:存储系统中各数据中心的负载压力。
可以针对存储系统中所包含的各数据中心的负载压力进行排序,对于负载压力小的数据中心,可以优先将其确定为第一数据中心。这样,目标数据中心可以在每进行一次第一数据中心的确定之后,对各数据中心的负载压力排序进行更新,所得到的新的负载压力排序可以作为下一次确定第一数据中心的依据。这样,可以均衡存储系统中的各数据中心的负载压力。
Of course, each first data center so determined must have enough remaining storage space to store the slices allocated to it.
对于上述四种信息,可以分别单独地作为目标数据中心确定第一数据中心的依据,还可以将任意两种、三种或者四种组合作为目标数据中心确定第一数据中心的依据。在此不做限定。
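One way the four kinds of information might be combined is sketched below: failed or full data centers are filtered out first, then the survivors are ranked by load pressure, with remaining space as the tie-breaker. The ranking order, the `DataCenterInfo` fields and the 0.0 to 1.0 load scale are assumptions made for illustration; the application leaves the combination open.

```python
from dataclasses import dataclass


@dataclass
class DataCenterInfo:
    # Hypothetical status record kept by the target data center.
    name: str
    alive: bool      # liveness state: True = healthy, False = failed
    free_mb: int     # remaining storage space
    load: float      # load pressure, 0.0 (idle) to 1.0 (saturated)


def choose_first_data_centers(centers, count, slice_mb):
    """Return the names of `count` data centers suited to store one slice each."""
    candidates = [c for c in centers if c.alive and c.free_mb >= slice_mb]
    # Lowest load first; among equal loads, largest free space first.
    candidates.sort(key=lambda c: (c.load, -c.free_mb))
    if len(candidates) < count:
        raise RuntimeError("too few healthy data centers with enough space")
    return [c.name for c in candidates[:count]]


centers = [
    DataCenterInfo("DC1", True, 100, 0.7),
    DataCenterInfo("DC2", False, 500, 0.1),   # failed, so excluded
    DataCenterInfo("DC3", True, 50, 0.2),
    DataCenterInfo("DC4", True, 1, 0.0),      # too little space, so excluded
    DataCenterInfo("DC5", True, 80, 0.2),
]
chosen = choose_first_data_centers(centers, 2, 2)
```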
一种实施方式中,根据预设的第一分配策略,从存储系统中确定用于存储各个切片数据的第一数据中心(S103),可以包括如下步骤。
获得切片数据的数量;
根据预设的第一分配策略,从存储系统中确定所获得的数量个第一数据中心;
为每一切片数据分配一个第一数据中心,其中,每一切片数据所分配的第一数据中心不相同。
该实施方式中,即为上述实施例中的第一种情况:切片数据与第一数据中心进行一对一的分配。因此,第一数据分成多少切片数据,则相应地,从存储系统中确定多少个数据中心作为第一数据中心。
对于目标数据中心来说,对第一数据进行切片处理,可以获取到第一数据所切分成的切片数据的数量,进而可以确定第一数据中心的数量。
例如,目标数据中心对第一数据进行切片处理,分成4个切片数据,则为了一对一的存储4个切片数据,需要从存储系统中确定4个数据中心,作为第一数据中心。
其中,确定与切片数据相同数量的第一数据中心,可以是根据预设的第一分配策略。可以参见上述实施方式,在此不再赘述。
因为该实施方式中采用一对一的存储方式,因此,对于目标数据中心来说,可以为每一切片数据分配一个第一数据中心,并且,每一切片数据所分配的第一数据中心不相同。
例如,目标数据中心对第一数据进行切片处理,分成4个切片数据分别为:切片数据1、切片数据2、切片数据3和切片数据4,从存储系统中确定4个数据中心,作为第一数据中心,所确定的4个第一数据中心分别为:第一数据中心A、第一数据中心B、第一数据中心C和第一数据中心D,可以将切片数据1分配给第一数据中心A,可以将切片数据2分配给第一数据中心B,可以将切片数据3分配给第一数据中心C,可以将切片数据4分配给第一数据中心D。
The one-to-one allocation may be based on the data size of each slice and the remaining storage space of each first data center.
一种实现方式,按照数据量的大小,对各切片数据进行排序,作为第一序列,按照剩余存储空间的大小,对各第一数据中心进行排序,作为第二序列;将第一序列和第二序列按照顺序进行一一对应。这样,第一序列中排第一的切片数据与第二序列中排第一的第一数据中心相对应,第一序列中排第二的切片数据与第二序列中排第二的第一数据中心相对应,以此类推。
例如,对切片数据按照数据量从大到小的顺序排序,得到的第一序列为:切片数据1、切片数据2、切片数据3;对第一数据中心按照剩余存储空间从大到小的顺序排序,得到的第二序列为:第一数据中心C、第一数据中心B、第一数据中心A,因此,按照对应关系进行分配,则将切片数据1分配给第一数据中心C,将切片数据2分配给第一数据中心B,将切片数据3分配给第一数据中心A。
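The pairing of the first sequence with the second sequence can be sketched in a few lines. The function name `assign_one_to_one` and the concrete sizes are assumptions chosen so that the result reproduces the example above.

```python
def assign_one_to_one(slice_sizes, center_free):
    """Pair the largest slice with the roomiest data center, and so on.

    slice_sizes -- mapping from slice name to data size
    center_free -- mapping from first data center name to remaining space
    """
    if len(slice_sizes) != len(center_free):
        raise ValueError("one-to-one allocation needs equal counts")
    first_sequence = sorted(slice_sizes, key=slice_sizes.get, reverse=True)
    second_sequence = sorted(center_free, key=center_free.get, reverse=True)
    plan = dict(zip(first_sequence, second_sequence))
    for slice_name, center in plan.items():
        if slice_sizes[slice_name] > center_free[center]:
            raise RuntimeError(center + " cannot hold " + slice_name)
    return plan


plan = assign_one_to_one(
    {"slice 1": 3, "slice 2": 2, "slice 3": 1},
    {"first data center A": 10, "first data center B": 20, "first data center C": 30},
)
```

The same function would serve the load-based variant if the second mapping held a score in which a higher value means a lower load.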
一对一分配的策略还可以是依据各切片数据的数据量大小以及各第一数据中心的负载压力进行分配。一种实现方式,按照数据量的大小,对各切片数据进行排序,作为第三序列,按照负载压力的大小,对各第一数据中心进行排序,作为第四序列;将第三序列和第四序列按照顺序进行一一对应。这样,第三序列中排第一的切片数据与第四序列中排第一的第一数据中心相对应,第三序列中排第二的切片数据与第四序列中排第二的第一数据中心相对应,以此类推。该策略与上述一对一分配的策略相似,在此不再赘述。
为每一切片数据分配一个第一数据中心的分配方式不仅限于上述两种,还可以包括其他的分配方式,在此不做限定。
本申请实施例提供的技术方案中,通过获取第一数据,并在本地存储所述第一数据;按照预设的数据存储方式,确定第一数据的切片数据;根据预设的第一分配策略,从存储系统中确定用于存储各个切片数据的第一数据中心;将各个切片数据分别发送至各自对应的第一数据中心,以使得各个第一数据中心对所接收到的切片数据进行存储。通过本申请实施例提供的技术方案,除了在本地存储数据外,还可以将该数据进行切片得到切片数据,并将切片数据分别存储至所选定的第一数据中心中,这样,不仅保证了数据存储的安全性和可靠性,而且将数据切片后将切片数据分别存储,节省了存储空间,减小了存储系统中数据中心的存储空间的消耗。
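On the basis of FIG. 1 and the embodiment corresponding to FIG. 1, an embodiment of this application further provides a data storage method. As shown in FIG. 2, after step S104 of the above embodiment, the method may further include the following steps.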
S201,当本地存储第一数据的存储设备故障的情况下,从各个第一数据中心提取第一数据的切片数据。
其中,存储设备可以是用于存储数据的设备,例如,硬盘、磁盘等。对于存储系统中的数据中心,每一数据中心均可以包括多个存储设备,并且,每一数据中心可以决定将数据存储于哪个存储设备中。本申请中的本地存储是指目标数据中心的存储。
对于第一数据的存储,可以存储在目标数据中心的一个存储设备中,还可以分开存储在多个存储设备中。当存储在一个存储设备时,检测该一个存储设备是否故障;当存储在多个存储设备时,需要对该多个存储设备进行检测。
对于存储设备的故障,当不能从存储设备上读取数据时,则可以认为该存储设备故障。当目标数据中心还可以从存储设备上读取数据时,目标数据中心接收到提取第一数据的指令,直接从存储第一数据的存储设备上读取第一数据,并将所读取的第一数据发送至指令所指向的位置或设备。
另外,第一数据中心与切片数据是对应的,第一数据中心存储切片数据,所提取的切片数据属于同一数据,即本实施例中的第一数据。
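The target data center extracts the slices from the respective first data centers and stores the extracted slices locally. The slices may be extracted over an ordinary communication network or over an SAS network.
When the slices are extracted over a communication network, that network is the one interconnecting the data centers in the storage system. The extraction proceeds as follows: each first data center that stores a slice sends its slice to the target data center, and the target data center stores the received slices locally.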
通过SAS网络提取切片数据时,SAS网络是存储系统中各数据中心之间相互连接的网络,因为SAS网络的特性,对于每一数据中心来说,均可以从其他数据中心直接读取或者写入数据。因此,目标数据中心提取切片数据的过程是:目标数据中心从各存储有切片数据的第一数据中心读取各切片数据,并将所读取的切片数据存储至本地。
S202,按照数据存储方式,利用所提取的切片数据重构第一数据。
目标数据中心在提取到第一数据的各切片数据之后,可以通过将所提取到的各切片数据进行重构,即得到第一数据。
其中,数据存储方式可以是纠删码存储方式,即可以按照纠删码存储方式对所提取的切片数据进行重构,得到第一数据。可以参见上述实施方式,在此不再赘述。
例如,第一数据的切片数据包括:切片数据1、切片数据2和切片数据3,当目标数据中心分别提取到切片数据1、切片数据2和切片数据3之后,按照纠删码存储方式,对切片数据1、切片数据2和切片数据3进行重构处理,进而可以得到第一数据。
一种实施方式中,在上述图1对应的实施例的基础上,步骤S104之后,还可以包括如下步骤:
1、接收第一客户端发送的提取第一数据的请求;
2、检测本地存储第一数据的存储设备是否故障,如果否,执行下面步骤3;
3、将本地存储的第一数据发送至第一客户端。
下面对上述实施方式中的各步骤分别进行介绍。
对于步骤1,第一客户端可以是存储系统外部的设备,第一客户端可以向存储系统中的任一数据中心请求提取数据,对于请求提取的数据,可以是存储在接收提取请求的数据中心,还可以存储在除接收提取请求的数据中心以外的其他数据中心。
目标数据中心作为接收提取请求的数据中心,对于目标数据中心来说,首先确定第一客户端所需提取的数据是否存储在本地,若存储在本地,则可以执行下面的步骤2,若没有存储在本地,还可分为两种情况。
第一种情况,所需提取的数据在其他数据中心中存储有,则目标数据中心可以从其他数据中心提取该数据;
第二种情况,所需提取的数据在其他数据中心中也没有存储,而仅有该数据的切片数据,则目标数据中心可以从存储有切片数据的数据中心中提取该数据的切片数据,并将切片数据重构为所需提取的数据。
本实施方式中,第一客户端所需提取的第一数据是存储在目标数据中心中的。
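In step 2, the target data center determines where the first data is stored locally, that is, which storage device of the target data center holds the first data. It then checks whether that storage device has failed, in other words, whether the stored first data can be read from it.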
若目标数据中心中存储第一数据的存储设备故障,目标数据中心不能直接将所存储的第一数据发送给第一客户端。
此时,一种实现方式,第一客户端可以向其他任一个第一数据中心请求提取第一数据。
具体地,在存储系统中,各数据中心是相互关联的。因此,对于第一客户端来说,可以从任一个数据中心获取到存储有第一数据的切片数据的数据中心对应的标识,即第一数据中心的标识。这样,第一客户端可以向任一个第一数据中心发送提取第一数据的请求。接收到请求的第一数据中心从本地以及其他的第一数据中心提取第一数据的切片数据,并按照数据存储方式将所提取到的切片数据进行重构处理得到第一数据,再将重构得到的第一数据发送至第一客户端。
另一种实现方式,在目标数据中心还可以从其他数据中心提取数据并进行重构处理的情况下,目标数据中心可以从第一数据中心提取第一数据的切片数据。在目标数据中心将第一数据的切片数据提取至目标数据中心本地后,目标数据中心可以按照数据存储方式,将所提取的第一数据的切片数据进行重构处理,得到第一数据,再将重构得到的第一数据发送至第一客户端。
步骤3,若存储设备没有故障,则目标数据中心可以从本地直接获取到第一数据,并将所获取到的第一数据发送至第一客户端。
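Steps 1 to 3, together with the fallback used when the local storage device has failed, can be sketched as follows. The three callables are hypothetical stand-ins for the local read, the slice extraction from the first data centers, and the decoding under the data storage mode; in the usage lines plain concatenation stands in for real erasure-code decoding.

```python
class DeviceFailure(Exception):
    """Raised when the local storage device cannot be read."""


def serve_first_data(read_local, fetch_slices, reconstruct):
    """Return the first data and the path by which it was obtained.

    read_local   -- reads the locally stored copy; raises DeviceFailure on a fault
    fetch_slices -- extracts the slices from the first data centers
    reconstruct  -- rebuilds the data from the extracted slices
    """
    try:
        return read_local(), "local"              # step 3: device is healthy
    except DeviceFailure:
        slices = fetch_slices()                   # S201: extract the slices
        return reconstruct(slices), "reconstructed"   # S202: rebuild the data


def broken_disk():
    raise DeviceFailure("storage device offline")


healthy = serve_first_data(lambda: b"video", lambda: [], b"".join)
recovered = serve_first_data(broken_disk, lambda: [b"vi", b"deo"], b"".join)
```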
一种实施方式中,目标数据中心中本地还可以存储第二数据的切片数据,也就是说,目标数据中心在本地仅存储了第二数据的切片数据,而没有存储第二数据。
本申请实施例提供的一种数据存储方法,还可以包括如下步骤:
一、接收第二客户端发送的提取第二数据的请求;确定存储有第二数据的切片数据的数据中心,作为第二数据中心;
二、从各个第二数据中心中提取第二数据的切片数据,按照数据存储方式,利用所提取的第二数据的切片数据重构第二数据,将重构得到的第二数据发送至第二客户端。
下面对上述实施方式中的上述步骤进行介绍。
步骤一,对于提取第二数据的请求,在目标数据中心本地没有存储第二数据,不能从目标数据中心本地直接提取第二数据。可以通过提取第二数据的切片数据进而获取第二数据。
目标数据中心可以从存储系统中确定出其他存储有第二数据的切片数据的数据中心,并将所确定出的数据中心作为第二数据中心。
例如,第二数据的切片数据包括:切片数据1、切片数据2、切片数据3和切片数据4,在存储系统中,切片数据1存储在目标数据中心,切片数据2存储在数据中心A,切片数据3存储在数据中心B,切片数据4存储在数据中心C。那么,当目标数据中心接收到第二客户端的提取第二数据的请求时,目标数据中心可以确定出存储有第二数据的切片数据的数据中心A、数据中心B和数据中心C,并将数据中心A、数据中心B和数据中心C作为第二数据中心。
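In step 2, after determining the second data centers, the target data center may extract the corresponding slices of the second data from each second data center, reconstruct the second data from the extracted slices according to the data storage mode, and send the reconstructed second data to the second client.
For the data storage mode, refer to the implementations above; the remaining details are similar to steps S201 and S202 above and are not repeated here.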
例如,第二数据的切片数据包括:切片数据1、切片数据2和切片数据3,切片数据1存储在目标数据中心,切片数据2存储在数据中心A,切片数据3存储在数据中心B,那么,当目标数据中心接收到第二客户端的提取第二数据的请求时,目标数据中心可以从数据中心A提取切片数据2,还可以从数据中心B提取切片数据3,并按照纠删码存储方式,可以对本地存储的切片数据1、所提取的切片数据2和切片数据3进行重构处理得到第二数据,将重构得到的第二数据发送至第二客户端。
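The lookup of the second data centers and the gathering of slices can be sketched with a simple location table. The table layout, the function names and the in-memory `stores` dictionary are assumptions; the application does not say how slice locations are recorded. The sample values follow the earlier four-slice example, and the final concatenation only stands in for decoding under the data storage mode.

```python
def second_data_centers(slice_locations, local):
    """List the remote data centers that hold slices of the second data."""
    found = []
    for center in slice_locations.values():
        if center != local and center not in found:
            found.append(center)
    return found


def gather_slices(slice_locations, stores):
    """Collect every slice, local or remote, in slice order."""
    return [stores[center][name] for name, center in slice_locations.items()]


# Hypothetical record of where each slice of the second data is stored.
slice_locations = {
    "slice 1": "target data center",
    "slice 2": "data center A",
    "slice 3": "data center B",
    "slice 4": "data center C",
}
stores = {
    "target data center": {"slice 1": b"se"},
    "data center A": {"slice 2": b"co"},
    "data center B": {"slice 3": b"nd"},
    "data center C": {"slice 4": b"!!"},
}
remote = second_data_centers(slice_locations, "target data center")
second_data = b"".join(gather_slices(slice_locations, stores))
```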
通过本申请实施例提供的技术方案,除了在本地存储数据外,还可以将该数据进行切片得到切片数据,并将切片数据分别存储至所选定的第一数据中心中,这样,不仅保证了数据存储的安全性和可靠性,而且将数据切片后将切片数据分别存储,节省了存储空间,减小了存储系统中数据中心的存储空间的消耗。
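Corresponding to FIG. 1 and the embodiment corresponding to FIG. 1, an embodiment of this application further provides a data storage apparatus, applied to a target data center in a storage system that includes at least two data centers. As shown in FIG. 3, the apparatus includes: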
获取模块310,用于获取第一数据,并在本地存储第一数据;
第一确定模块320,用于按照预设的数据存储方式,确定第一数据的切片数据,其中,数据存储方式为:能够根据数据的切片数据重构该数据的存储方式;
第二确定模块330,用于根据预设的第一分配策略,从存储系统中确定用于存储各个切片数据的第一数据中心;
第一发送模块340,用于将各个切片数据分别发送至各自对应的第一数据中心。
通过本申请实施例提供的技术方案,除了在本地存储数据外,还可以将该数据进行切片得到切片数据,并将切片数据分别存储至所选定的第一数据中心中,这样,不仅保证了数据存储的安全性和可靠性,而且将数据切片后将切片数据分别存储,节省了存储空间,减小了存储系统中数据中心的存储空间的消耗。
可选地,一种实施方式,第二确定模块330包括:
获得子模块,用于获得切片数据的数量;
第一确定子模块,用于根据预设的第一分配策略,从存储系统中确定数量个第一数据中心;
第一分配子模块,用于为每一切片数据分配一个第一数据中心,其中,每一切片数据所分配的第一数据中心不相同。
可选地,一种实施方式,第二确定模块330包括:
排序子模块,用于根据预设的第一分配策略,对存储系统所包括的数据中心进行排序;
第二确定子模块,用于按照所述排序,确定第一数据中心,其中,所确定的第一数据中心的剩余存储空间之和大于所确定的切片数据的数据量之和;
第二分配子模块,用于根据各第一数据中心的剩余存储空间,为各第一数据中心分配所确定的切片数据,以使得每一切片数据分配一个第一数据中心。
可选地,一种实施方式,数据存储方式可以为:纠删码存储方式。
相应于上述图2及图2对应的实施例,本申请实施例还提供一种数据存储装置,如图4所示,在上述图3及图3对应的实施例的基础上,该装置还可以包括:
第一提取模块410,用于当本地存储第一数据的存储设备故障的情况下,从各个第一数据中心提取第一数据的切片数据;
第一重构模块420,用于按照数据存储方式,利用所提取的切片数据重构第一数据。
可选地,一种实施方式,该装置还可以包括:
第一接收模块,用于接收第一客户端发送的提取第一数据的请求;
检测模块,用于检测目标数据中心中存储第一数据的存储设备是否故障,如果是,触发第一提取模块410;
第二发送模块,用于当检测模块的检测结果为否时,将本地存储的第一数据发送至第一客户端。
可选地,一种实施方式,本地存储第二数据的切片数据;该装置还可以包括:
第二接收模块,用于接收第二客户端发送的提取第二数据的请求;
第三确定模块,用于确定存储有第二数据的切片数据的数据中心,作为第二数据中心;
第二提取模块,用于从各个第二数据中心中提取第二数据的切片数据;
第二重构模块,用于按照数据存储方式,利用所提取的第二数据的切片数据重构第二数据;
第三发送模块,用于将重构得到的第二数据发送至第二客户端。
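An embodiment of this application further provides an electronic device which, as shown in FIG. 5, includes a processor 510 and a memory 520;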
存储器520,用于存放计算机程序;
处理器510,用于执行存储器520上所存放的程序时,实现如下步骤:
获取第一数据,并在本地存储第一数据;
按照预设的数据存储方式,确定第一数据的切片数据,其中,数据存储方式为:能够根据数据的切片数据重构该数据的存储方式;
根据预设的第一分配策略,从存储系统中确定用于存储各个切片数据的第一数据中心;
将各个切片数据分别发送至各自对应的第一数据中心。
通过本申请实施例提供的技术方案,除了在本地存储数据外,还可以将该数据进行切片得到切片数据,并将切片数据分别存储至所选定的第一数据中心中,这样,不仅保证了数据存储的安全性和可靠性,而且将数据切片后将切片数据分别存储,节省了存储空间,减小了存储系统中数据中心的存储空间的消耗。
本申请实施例提供的一种电子设备还可以执行上述实施例中任一所述的一种数据存储方法。具体见图1和图2所对应的实施例,这里不再赘述。
在本申请提供的又一实施例中,还提供了一种计算机可读存储介质,该计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行上述图1和图2所对应的实施例中任一所述的一种数据存储方法。
上述电子设备提到的通信总线可以是外设部件互连标准(Peripheral Component Interconnect,PCI)总线或扩展工业标准结构(Extended Industry Standard Architecture,EISA)总线等。该通信总线可以分为地址总线、数据总线、控制总线等。为便于表示,图中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
通信接口用于上述电子设备与其他设备之间的通信。
存储器可以包括随机存取存储器(Random Access Memory,RAM),也可以包括非易失性存储器(Non-Volatile Memory,NVM),例如至少一个磁盘存储器。可选的,存储器还可以是至少一个位于远离前述处理器的存储装置。
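The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.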
需要说明的是,在本文中,诸如第一和第二等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、物品或者设备中还存在另外的相同要素。
本说明书中的各个实施例均采用相关的方式描述,各个实施例之间相同相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。尤其,对于装置实施例而言,由于其基本相似于方法实施例,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。
以上所述仅为本发明的较佳实施例而已,并不用以限制本发明,凡在本发明的精神和原则之内,所做的任何修改、等同替换、改进等,均应包含在本发明保护的范围之内。

Claims (16)

  1. 一种数据存储方法,其特征在于,应用于存储系统中的目标数据中心,所述存储系统中包括至少两个数据中心;
    所述方法包括:
    获取第一数据,并在本地存储所述第一数据;
    按照预设的数据存储方式,确定所述第一数据的切片数据,其中,所述数据存储方式为:能够根据数据的切片数据重构该数据的存储方式;
    根据预设的第一分配策略,从所述存储系统中确定用于存储各个切片数据的第一数据中心;
    将各个切片数据分别发送至各自对应的第一数据中心。
  2. 根据权利要求1所述的方法,其特征在于,所述根据预设的第一分配策略,从所述存储系统中确定用于存储各个切片数据的第一数据中心,包括:
    获得切片数据的数量;
    根据预设的第一分配策略,从所述存储系统中确定所述数量个第一数据中心;
    为每一切片数据分配一个第一数据中心,其中,每一切片数据所分配的第一数据中心不相同。
  3. 根据权利要求1所述的方法,其特征在于,所述根据预设的第一分配策略,从所述存储系统中确定用于存储各个切片数据的第一数据中心,包括:
    根据预设的第一分配策略,对所述存储系统所包括的数据中心进行排序;
    按照所述排序,确定第一数据中心,其中,所确定的第一数据中心的剩余存储空间之和大于所确定的切片数据的数据量之和;
    根据各第一数据中心的剩余存储空间,为各第一数据中心分配所确定的切片数据,以使得每一切片数据分配一个第一数据中心。
  4. 根据权利要求1所述的方法,其特征在于,所述数据存储方式为:纠删码存储方式。
  5. 根据权利要求1所述的方法,其特征在于,所述将各个切片数据分别发送至各自对应的第一数据中心之后,还包括:
    当本地存储所述第一数据的存储设备故障的情况下,从各个第一数据中心提取所述第一数据的切片数据;
    按照所述数据存储方式,利用所提取的切片数据重构所述第一数据。
  6. 根据权利要求1所述的方法,其特征在于,所述将各个切片数据分别发送至各自对应的第一数据中心之后,还包括:
    接收第一客户端发送的提取所述第一数据的请求;
    检测本地存储所述第一数据的存储设备是否故障;
    如果否,将本地存储的所述第一数据发送至所述第一客户端。
  7. 根据权利要求1所述的方法,其特征在于,本地存储第二数据的切片数据;
    所述方法还包括:
    接收第二客户端发送的提取第二数据的请求;
    确定存储有所述第二数据的切片数据的数据中心,作为第二数据中心;
    从各个第二数据中心中提取所述第二数据的切片数据;
    按照所述数据存储方式,利用所提取的所述第二数据的切片数据重构所述第二数据;
    将重构得到的第二数据发送至所述第二客户端。
  8. 一种数据存储装置,其特征在于,应用于存储系统中的目标数据中心,所述存储系统中包括至少两个数据中心;
    所述装置包括:
    获取模块,用于获取第一数据,并在本地存储所述第一数据;
    第一确定模块,用于按照预设的数据存储方式,确定所述第一数据的切片数据,其中,所述数据存储方式为:能够根据数据的切片数据重构该数据的存储方式;
    第二确定模块,用于根据预设的第一分配策略,从所述存储系统中确定用于存储各个切片数据的第一数据中心;
    第一发送模块,用于将各个切片数据分别发送至各自对应的第一数据中心。
  9. 根据权利要求8所述的装置,其特征在于,所述第二确定模块包括:
    获得子模块,用于获得切片数据的数量;
    第一确定子模块,用于根据预设的第一分配策略,从所述存储系统中确定所述数量个第一数据中心;
    第一分配子模块,用于为每一切片数据分配一个第一数据中心,其中,每一切片数据所分配的第一数据中心不相同。
  10. 根据权利要求8所述的装置,其特征在于,所述第二确定模块包括:
    排序子模块,用于根据预设的第一分配策略,对所述存储系统所包括的数据中心进行排序;
    第二确定子模块,用于按照所述排序,确定第一数据中心,其中,所确定的第一数据中心的剩余存储空间之和大于所确定的切片数据的数据量之和;
    第二分配子模块,用于根据各第一数据中心的剩余存储空间,为各第一数据中心分配所确定的切片数据,以使得每一切片数据分配一个第一数据中心。
  11. 根据权利要求8所述的装置,其特征在于,所述数据存储方式为:纠删码存储方式。
  12. 根据权利要求8所述的装置,其特征在于,所述装置还包括:
    第一提取模块,用于当本地存储所述第一数据的存储设备故障的情况下,从各个第一数据中心提取所述第一数据的切片数据;
    第一重构模块,用于按照所述数据存储方式,利用所提取的切片数据重构所述第一数据。
  13. 根据权利要求8所述的装置,其特征在于,所述装置还包括:
    第一接收模块,用于接收第一客户端发送的提取所述第一数据的请求;
    检测模块,用于检测所述目标数据中心中存储所述第一数据的存储设备是否故障;
    第二发送模块,用于当所述检测模块的检测结果为否时,将本地存储的所述第一数据发送至所述第一客户端。
  14. 根据权利要求8所述的装置,其特征在于,本地存储第二数据的切片数据;
    所述装置还包括:
    第二接收模块,用于接收第二客户端发送的提取第二数据的请求;
    第三确定模块,用于确定存储有所述第二数据的切片数据的数据中心,作为第二数据中心;
    第二提取模块,用于从各个第二数据中心中提取所述第二数据的切片数据;
    第二重构模块,用于按照所述数据存储方式,利用所提取的所述第二数据的切片数据重构所述第二数据;
    第三发送模块,用于将重构得到的第二数据发送至所述第二客户端。
  15. 一种电子设备,其特征在于,包括处理器和存储器;
    存储器,用于存放计算机程序;
    处理器,用于执行存储器上所存放的程序时,实现权利要求1-7任一所述的方法步骤。
  16. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质内存储有计算机程序,所述计算机程序被处理器执行时实现权利要求1-7任一所述的方法步骤。
PCT/CN2019/077440 2018-03-09 2019-03-08 一种数据存储方法及装置 WO2019170133A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810193321.6A CN110244903B (zh) 2018-03-09 2018-03-09 一种数据存储方法及装置
CN201810193321.6 2018-03-09

Publications (1)

Publication Number Publication Date
WO2019170133A1 true WO2019170133A1 (zh) 2019-09-12

Family

ID=67845874

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/077440 WO2019170133A1 (zh) 2018-03-09 2019-03-08 一种数据存储方法及装置

Country Status (2)

Country Link
CN (1) CN110244903B (zh)
WO (1) WO2019170133A1 (zh)

Also Published As

Publication number Publication date
CN110244903A (zh) 2019-09-17
CN110244903B (zh) 2021-08-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19763277

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19763277

Country of ref document: EP

Kind code of ref document: A1
