CN110244903B - Data storage method and device - Google Patents

Publication number
CN110244903B
Authority
CN
China
Prior art keywords
data
slice
storage
data center
center
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810193321.6A
Other languages
Chinese (zh)
Other versions
CN110244903A (en)
Inventor
汪渭春
夏伟强
王伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision System Technology Co Ltd
Original Assignee
Hangzhou Hikvision System Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision System Technology Co Ltd filed Critical Hangzhou Hikvision System Technology Co Ltd
Priority to CN201810193321.6A priority Critical patent/CN110244903B/en
Priority to PCT/CN2019/077440 priority patent/WO2019170133A1/en
Publication of CN110244903A publication Critical patent/CN110244903A/en
Application granted granted Critical
Publication of CN110244903B publication Critical patent/CN110244903B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608Saving storage space on storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1048Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using arrangements adapted for a specific error detection or correction feature
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062Securing storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • G06F3/0631Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0644Management of space entities, e.g. partitions, extents, pools

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a data storage method and a data storage device, applied to a target data center in a storage system that comprises at least two data centers. The method comprises the following steps: acquiring first data and storing it locally; determining slice data of the first data; determining a first data center for storing each slice data; and sending each slice data to its corresponding first data center. In this scheme, the data is stored locally and is also sliced, and the resulting slice data are stored in the selected first data centers. This ensures the safety and reliability of data storage, and because slice data rather than full copies are stored in the other data centers, storage space is saved and the consumption of storage space in the data centers of the storage system is reduced.

Description

Data storage method and device
Technical Field
The present invention relates to the field of data storage technologies, and in particular, to a data storage method and apparatus.
Background
As computer applications grow in scale, the demand for data storage keeps increasing. A storage system may include a plurality of data centers, where a data center is a device that transmits, accelerates, displays, computes, and stores data on the basis of a computer network.
At present, a storage system stores data and backs it up by using a copy technology. That is, when data is stored in the storage system, the data and multiple copies of the data are stored in different data centers, so that the safety and reliability of data storage can be ensured.
However, a copy is the same size as the data itself, so each copy occupies as much storage space in a data center as the original data. Therefore, although copies improve data reliability, they occupy a large amount of storage space, which results in high consumption of the storage space of the data centers in the storage system.
Disclosure of Invention
The embodiment of the invention aims to provide a data storage method and a data storage device, which are used for solving the problem of large storage space consumption of a data center in a storage system. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides a data storage method, which is applied to a target data center in a storage system, where the storage system includes at least two data centers;
the method comprises the following steps:
acquiring first data and locally storing the first data;
determining slice data of the first data according to a preset data storage mode, wherein the data storage mode is as follows: a storage mode capable of reconstructing data from slice data of the data;
determining a first data center for storing each slice of data from the storage system according to a preset first allocation strategy;
and respectively sending each slice data to the corresponding first data center.
Optionally, the determining, from the storage system, a first data center for storing each slice of data according to a preset first allocation policy includes:
obtaining a number of slice data;
determining the number of first data centers from the storage system according to a preset first distribution strategy;
and allocating a first data center for each slice of data, wherein the first data centers allocated for each slice of data are different.
Optionally, the determining, from the storage system, a first data center for storing each slice of data according to a preset first allocation policy includes:
sequencing the data centers included in the storage system according to a preset first distribution strategy;
determining first data centers according to the sorting, wherein the sum of the remaining storage spaces of the determined first data centers is larger than the sum of the data amounts of the determined slice data;
and distributing the determined slice data to the first data centers according to the remaining storage space of the first data centers, so that each slice data is distributed to one first data center.
Optionally, the data storage manner is: erasure code storage.
Optionally, after the sending the slice data to the corresponding first data centers, the method further includes:
when a storage device for locally storing the first data fails, extracting slice data of the first data from each first data center;
reconstructing the first data using the extracted slice data in accordance with the data storage manner.
Optionally, after the sending the slice data to the corresponding first data centers, the method further includes:
receiving a request for extracting the first data sent by a first client;
detecting whether a storage device locally storing the first data fails;
and if not, sending the first data stored locally to the first client.
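The two optional read paths above (serve the local copy when the local storage device is healthy, otherwise extract the slice data and reconstruct) can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `read_first_data` and the callables `fetch_slices` and `reconstruct` are hypothetical stand-ins for the extraction and reconstruction steps.

```python
from typing import Callable, List, Optional


def read_first_data(
    local_copy: Optional[bytes],
    device_failed: bool,
    fetch_slices: Callable[[], List[bytes]],
    reconstruct: Callable[[List[bytes]], bytes],
) -> bytes:
    """Return the first data, preferring the locally stored copy."""
    # Healthy local storage device: answer directly from the local copy.
    if not device_failed and local_copy is not None:
        return local_copy
    # Local storage device failed: extract the slice data from the first
    # data centers and rebuild the first data from them.
    return reconstruct(fetch_slices())


# Usage with a trivial "reconstruction" (plain concatenation, an assumption
# made only to keep the example short).
healthy = read_first_data(b"video", False, lambda: [], b"".join)
recovered = read_first_data(None, True, lambda: [b"vi", b"deo"], b"".join)
```

Slice extraction is invoked only on the failure path, so the common case involves no traffic between data centers.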
Optionally, slice data of second data is stored locally;
the method further comprises the following steps:
receiving a request for extracting second data sent by a second client;
determining a data center storing slice data of the second data as a second data center;
extracting slice data of the second data from the respective second data centers;
reconstructing the second data by using the extracted slice data of the second data according to the data storage mode;
and sending the reconstructed second data to the second client.
In a second aspect, an embodiment of the present invention provides a data storage apparatus, which is applied to a target data center in a storage system, where the storage system includes at least two data centers;
the device comprises:
the acquisition module is used for acquiring first data and locally storing the first data;
the first determining module is configured to determine slice data of the first data according to a preset data storage manner, where the data storage manner is: a storage mode capable of reconstructing data from slice data of the data;
the second determining module is used for determining a first data center for storing each slice data from the storage system according to a preset first distribution strategy;
and the first sending module is used for sending each piece of slice data to the corresponding first data center.
Optionally, the second determining module includes:
an obtaining submodule for obtaining a number of slice data;
the first determining submodule is used for determining the number of first data centers from the storage system according to a preset first distribution strategy;
and the first allocating submodule is used for allocating a first data center for each slice of data, wherein the first data centers allocated to each slice of data are different.
Optionally, the second determining module includes:
the sorting submodule is used for sorting the data centers included in the storage system according to a preset first distribution strategy;
a second determining submodule, configured to determine a first data center according to the sorting, where a sum of remaining storage spaces of the determined first data center is greater than a sum of data amounts of the determined slice data;
and the second distribution submodule is used for distributing the determined slice data to the first data centers according to the residual storage space of the first data centers so that each slice data is distributed to one first data center.
Optionally, the data storage manner is: erasure code storage.
Optionally, the apparatus further comprises:
the first extraction module is used for extracting slice data of the first data from each first data center under the condition that a storage device for locally storing the first data fails;
a first reconstruction module for reconstructing the first data using the extracted slice data according to the data storage manner.
Optionally, the apparatus further comprises:
the first receiving module is used for receiving a request for extracting the first data sent by a first client;
the detection module is used for detecting whether the storage device that stores the first data in the target data center has failed, and if so, triggering the first extraction module;
and the second sending module is used for sending the first data stored locally to the first client side when the detection result of the detection module is negative.
Optionally, slice data of second data is stored locally;
the device further comprises:
the second receiving module is used for receiving a request for extracting second data sent by a second client;
a third determining module, configured to determine, as a second data center, a data center in which slice data of the second data is stored;
a second extraction module for extracting slice data of the second data from each second data center;
a second reconstruction module, configured to reconstruct the second data using the extracted slice data of the second data according to the data storage manner;
and the third sending module is used for sending the reconstructed second data to the second client.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor and a memory;
a memory for storing a computer program;
and the processor is used for realizing any one of the steps of the data storage method when executing the program stored in the memory.
In a fourth aspect, the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements any one of the above-mentioned data storage method steps.
According to the technical scheme provided by the embodiment of the invention, first data is acquired and stored locally; slice data of the first data is determined according to a preset data storage manner; a first data center for storing each slice data is determined from the storage system according to a preset first allocation strategy; and each slice data is sent to its corresponding first data center. In this scheme, the data is stored locally and is also sliced, and the resulting slice data are stored in the selected first data centers. This ensures the safety and reliability of data storage, and because slice data rather than full copies are stored in the other data centers, storage space is saved and the consumption of storage space in the data centers of the storage system is reduced.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a data storage method according to an embodiment of the present invention;
Fig. 2 is another flowchart of a data storage method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a data storage device according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of another structure of a data storage device according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to save the storage space of a data center in a storage system and further reduce the consumption of the storage space, the embodiment of the invention provides a data storage method and a data storage device, which are applied to a target data center in the storage system, wherein the storage system comprises at least two data centers;
the method comprises the following steps:
acquiring first data and locally storing the first data;
determining slice data of the first data according to a preset data storage mode, wherein the data storage mode is as follows: a storage mode capable of reconstructing data from slice data of the data;
determining a first data center for storing each slice of data from a storage system according to a preset first allocation strategy;
and respectively sending each slice data to the corresponding first data center.
According to the technical scheme provided by the embodiment of the invention, the data is stored locally and is also sliced, and the resulting slice data are stored in the selected first data centers. This ensures the safety and reliability of data storage, and because slice data rather than full copies are stored in the other data centers, storage space is saved and the consumption of storage space in the data centers of the storage system is reduced.
First, a data storage method provided in an embodiment of the present invention is described below, where the method is applied to a target data center in a storage system, where the storage system includes at least two data centers.
A data center may include one or more storage devices and may store data in those storage devices. A storage device may be regarded as a machine such as a storage server, where each storage server may include a plurality of hard disks; alternatively, a storage device may be regarded as a hard disk, where hard disks are mounted on machines and each machine may mount a plurality of hard disks.
In the case that a data center includes at least two storage devices, the data center itself may determine which storage device to store data in. The policy by which a data center determines the storage device may be set by a user, and the policies of different data centers may differ. In addition, the data center can record the storage location of each piece of data.
One of the at least two data centers included in the storage system may be determined as the target data center according to a preset policy. The preset policy may be determined according to at least one of the following information: the data centers that have previously been assigned as target data centers; the survival state of each data center in the storage system; the remaining storage space of each data center in the storage system; and the load pressure of each data center in the storage system.
The above preset policy is similar to the first allocation strategy described below and is not described in detail here.
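One way such a preset policy could combine the listed information is sketched below. The text does not fix how the criteria are weighted, so the ordering used here (least load, then most remaining space, then least often chosen) is purely an assumption, as are the names `DataCenterInfo` and `pick_target` and the 0.0 to 1.0 load scale.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCenterInfo:
    name: str
    alive: bool            # survival state
    free_bytes: int        # remaining storage space
    load: float            # load pressure, 0.0 (idle) to 1.0 (hypothetical scale)
    times_chosen: int = 0  # how often it was previously the target


def pick_target(centers: List[DataCenterInfo], needed_bytes: int) -> Optional[DataCenterInfo]:
    """Pick one live data center that can hold `needed_bytes`, or None."""
    candidates = [c for c in centers if c.alive and c.free_bytes >= needed_bytes]
    if not candidates:
        return None
    # Assumed priority: lowest load, then most free space, then least used.
    return min(candidates, key=lambda c: (c.load, -c.free_bytes, c.times_chosen))


centers = [
    DataCenterInfo("A", alive=True, free_bytes=100, load=0.9),
    DataCenterInfo("B", alive=False, free_bytes=500, load=0.1),
    DataCenterInfo("C", alive=True, free_bytes=200, load=0.2),
]
target = pick_target(centers, needed_bytes=50)
```

Here data center B is skipped because it is not alive, and C is preferred over A because of its lower load.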
As shown in Fig. 1, the method includes:
S101, acquiring first data and storing the first data locally.
The first data is the data to be stored. The storage system may adopt a direct storage manner, in which the target data center acquires the first data as follows: the client sends the first data to be stored directly to the target data center. After receiving the first data sent by the client, the target data center may store the first data locally, that is, in a storage device included in the target data center.
For example, the client is a monitoring device, the video data acquired by the monitoring device is first data, the monitoring device directly sends the video data to the target data center, and after receiving the video data, the target data center can store the video data in one storage device of the target data center, that is, the local storage device of the target data center.
In addition, the storage system may further adopt a dump mode, and in the dump mode, the storage system may further include an access node, where the access node is connected to a client outside the storage system, and is configured to receive data sent by the client.
In this mode, the target data center acquires the first data as follows: the access node receives the first data sent by the client and then forwards it to the target data center, and the target data center receives the first data forwarded by the access node.
Of course, when the target data center stores the first data locally, the target data center itself may select the storage device for storing the first data.
S102, determining the slice data of the first data according to a preset data storage manner.
The data storage manner is a storage manner in which data can be reconstructed from the slice data of the data. That is, slice data can be obtained by slicing data according to the requirements of the data storage manner.
Further, the data before slicing can be recovered by performing reconstruction processing on the plurality of slice data obtained under the data storage manner. Moreover, when some of the slice data are damaged, as long as the number of damaged slice data is less than or equal to the number preset in the data storage manner, the data before slicing can still be reconstructed from the undamaged slice data.
For example, if data A is sliced according to the requirements of the data storage manner to obtain slice data 1, slice data 2, slice data 3, and slice data 4, then slice data 1, slice data 2, slice data 3, and slice data 4 can be reconstructed to obtain data A. When the data storage manner is set to tolerate damaged slice data and slice data 4 is damaged, reconstruction processing can be performed on slice data 1, slice data 2, and slice data 3 to obtain data A.
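The four-slice example can be made concrete with a deliberately simple scheme: three data slices plus one XOR parity slice, which tolerates the loss of any single slice. This scheme is an illustrative assumption, not the data storage manner of the embodiment; the function names and the convention of passing the original length separately are likewise hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_slices(data: bytes, n_data: int = 3) -> List[bytes]:
    """Split data into n_data equal-length slices plus one XOR parity slice."""
    size = max(1, -(-len(data) // n_data))       # ceiling division
    padded = data.ljust(size * n_data, b"\x00")  # zero-pad the last slice
    parts = [padded[i * size:(i + 1) * size] for i in range(n_data)]
    return parts + [reduce(xor_bytes, parts)]


def reconstruct(slices: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the data when at most one slice is missing (given as None)."""
    missing = [i for i, s in enumerate(slices) if s is None]
    if len(missing) > 1:
        raise ValueError("XOR parity tolerates only one missing slice")
    restored = list(slices)
    if missing:
        present = [s for s in slices if s is not None]
        # XOR of all surviving slices equals the missing slice.
        restored[missing[0]] = reduce(xor_bytes, present)
    return b"".join(restored[:-1])[:original_len]


data_a = b"data A payload"
s1, s2, s3, s4 = make_slices(data_a)
```

With slice 4 damaged, `reconstruct([s1, s2, s3, None], len(data_a))` returns data A, mirroring the example; losing any other single slice is recovered the same way.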
The first data may be sliced into a preset number of slice data, where the preset number may be set in advance by a user.
For example, if the preset number is set to 5, slicing the first data according to the requirements of the data storage manner yields 5 slice data.
Alternatively, the preset number may be determined by the preset data storage manner itself. For example, if the preset number defined by the preset data storage manner is 3, slicing the data according to the requirements of the data storage manner yields 3 slice data.
The predetermined data storage method may be an erasure code storage method, which is described in detail in the following first embodiment and will not be described in detail here.
S103, determining a first data center for storing each slice data from the storage system according to a preset first allocation strategy.
Wherein the first allocation strategy may be to determine a first data center for storing the respective slice data according to at least one of the following information: the data center that has been previously assigned as the first data center, the survival status of each data center in the storage system, the remaining storage space of each data center in the storage system, and the load pressure of each data center in the storage system.
The first allocation strategy is described in detail in the following second embodiment and will not be described in detail here.
When the first data centers are determined from the storage system according to the preset first allocation strategy, both the number of first data centers and the first data center that is to store each slice data can be determined.
For example, the first data has 4 slice data: slice data 1, slice data 2, slice data 3, and slice data 4. According to the preset first allocation strategy, 4 first data centers may be determined: data center A, data center B, data center C, and data center D. Slice data 1 is allocated to data center A for storage, slice data 2 to data center B, slice data 3 to data center C, and slice data 4 to data center D.
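The one-to-one allocation in this example can be sketched as a simple mapping. The sketch assumes the data centers have already been ranked by the first allocation strategy; the function name `allocate_one_to_one` is hypothetical.

```python
from typing import Dict, List


def allocate_one_to_one(slice_ids: List[str], ranked_centers: List[str]) -> Dict[str, str]:
    """Give every slice its own data center, taken in ranking order."""
    if len(set(ranked_centers)) != len(ranked_centers):
        raise ValueError("data centers must be distinct")
    if len(ranked_centers) < len(slice_ids):
        raise ValueError("not enough data centers for a one-to-one allocation")
    # zip stops at the shorter list, so surplus data centers are left unused.
    return dict(zip(slice_ids, ranked_centers))


plan = allocate_one_to_one(
    ["slice data 1", "slice data 2", "slice data 3", "slice data 4"],
    ["A", "B", "C", "D", "E"],
)
```

Because no two slices share a data center, the failure of one data center damages at most one slice.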
The manner of determining the first data center from the storage system can be divided into two cases:
In the first case: the number of determined first data centers is the same as the number of slice data, each first data center stores one slice data, and the slice data stored in each first data center is different. This first case is described in detail in the following third embodiment and is not described in detail here.
In the second case: for each determined first data center, one or at least two slice data may be allocated for storage.
In this second case, the data centers comprised by the storage system are ordered according to a preset first allocation policy. For example, when the first allocation policy is to determine the first data center according to the remaining storage space, the data centers included in the storage system are sorted in the order from large to small of the remaining storage space of each data center: the data center with the largest residual storage space is ranked first, and the data center with the smallest residual space is ranked last.
First data centers are then determined from the storage system according to the number of slice data obtained and the data volume of each slice data, such that the determined first data centers together have enough space to store the slice data of the first data.
After the first data centers are determined, the first data center that is to store each slice data is determined according to the remaining storage space of each first data center.
In this way, the slice data can be distributed in a balanced manner according to the operating capacity of each data center, so that the data centers in the storage system run more evenly. A first data center with relatively large remaining storage space and relatively small load pressure can be allocated more slice data for storage, while a first data center with relatively small remaining storage space and relatively large load pressure can be allocated fewer slice data, or even only one.
For example, the first data has 4 slice data: slice data A, slice data B, slice data C, and slice data D, each 2 MB in size. According to the preset first allocation strategy, 3 first data centers are determined from the storage system: data center A, data center B, and data center C. Among them, data center A has the largest remaining storage space and the smallest load pressure and can store two slice data at the same time, while data center B and data center C can each store one slice data. Therefore, slice data A and slice data B are allocated to data center A for storage, slice data C is allocated to data center B, and slice data D is allocated to data center C.
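The second case (sort by remaining storage space, take data centers until their combined remaining space exceeds the total slice volume, then place each slice) can be sketched with a greedy rule. The greedy placement ("put the next slice on the selected data center with the most space left") and the name `allocate_by_capacity` are assumptions; load pressure is ignored to keep the sketch short, and sizes are in arbitrary units.

```python
from typing import Dict


def allocate_by_capacity(slice_sizes: Dict[str, int], free_space: Dict[str, int]) -> Dict[str, str]:
    """Map each slice to a data center, favouring the most remaining space."""
    ranked = sorted(free_space.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(slice_sizes.values())

    # Select data centers in ranking order until their combined
    # remaining space is larger than the total slice volume.
    chosen: Dict[str, int] = {}
    for name, free in ranked:
        chosen[name] = free
        if sum(chosen.values()) > total:
            break
    else:
        raise ValueError("insufficient remaining storage space")

    # Place the largest slices first, each on the roomiest selected center.
    assignment: Dict[str, str] = {}
    for slice_id, size in sorted(slice_sizes.items(), key=lambda kv: kv[1], reverse=True):
        name = max(chosen, key=chosen.get)
        if chosen[name] < size:
            raise ValueError(f"no selected data center can hold {slice_id}")
        assignment[slice_id] = name
        chosen[name] -= size
    return assignment


sizes = {"slice A": 2, "slice B": 2, "slice C": 2, "slice D": 2}
free = {"A": 5, "B": 3, "C": 3, "D": 1}
plan = allocate_by_capacity(sizes, free)
```

With these hypothetical free-space figures, three data centers are selected and the resulting plan matches the example: slices A and B go to data center A, slice C to B, and slice D to C.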
Of course, in this second case, several of the determined slice data may also be stored in the same first data center; when stored in the same first data center, they may be stored in different storage devices included in that first data center.
In both cases, the determined first data centers may include the target data center; that is, the target data center may also be used to store allocated slice data if it satisfies the first allocation strategy. In that case, the target data center stores locally both the first data and slice data of the first data.
S104, sending each slice data to the corresponding first data center.
The slice data may be sent over a network connecting the data centers in the storage system; the network may be a communication network, a Serial Attached SCSI (SAS) network, or the like, and is not limited herein. SCSI (Small Computer System Interface) is a common interface standard.
Each slice data is sent to its corresponding first data center, where "corresponding" means the first data center to which that slice data has been allocated for storage.
For example, if slice data 1 is assigned to first data center a and slice data 2 is assigned to first data center B, the target data center transmits slice data 1 to first data center a and slice data 2 to first data center B.
After receiving the slice data, each first data center stores the received slice data, and the storage location of the slice data may be determined by each first data center.
For example, first data center A includes 3 storage devices: storage device A, storage device B and storage device C. When first data center A receives slice data 1, it may decide to store slice data 1 in storage device B.
In the first embodiment, for step S102, the preset data storage manner may be an Erasure Coding (EC) storage manner. The first data is sliced according to the erasure code storage manner to obtain a preset number N of data pieces, and a preset number M of check pieces is then computed from the N data pieces. The sum M + N of the data pieces and the check pieces is the number of slice data obtained. N and M can be set by the user.
For example, the RS (Reed-Solomon) code storage manner is an erasure code storage manner commonly used in storage systems. In the RS code storage manner, the number N of data pieces may be set to 5 and the number M of check pieces may be set to 3; slicing the first data according to the RS code storage manner then yields 5 data pieces and 3 check pieces, for a total of 8 slice data.
Further, slice data obtained according to the erasure code storage manner can be reconstructed to recover the data as it was before slicing. Here M is the preset maximum number of slice data that may be damaged among the N data pieces and M check pieces obtained by slicing according to the erasure code storage manner.
In the obtained M + N slice data, when the number of the damaged slice data is less than or equal to M, the undamaged slice data may be subjected to reconstruction processing to obtain data before slicing. Therefore, the data is sliced according to the erasure code storage mode, so that the fault tolerance effect can be achieved, and the reliability of data storage can be improved.
For example, data A is sliced according to an erasure code storage manner, and the obtained slice data includes 5 data pieces and 3 check pieces. The 5 data pieces are data piece A, data piece B, data piece C, data piece D and data piece E; the 3 check pieces are check piece A, check piece B and check piece C. When data piece A and check piece A are damaged, data A can still be obtained by reconstructing from data piece B, data piece C, data piece D, data piece E, check piece B and check piece C.
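The slice-and-reconstruct idea can be illustrated with a much simpler code than Reed-Solomon: N data pieces plus a single XOR check piece (M = 1), which tolerates the loss of any one slice. The sketch below deliberately swaps in XOR parity for brevity and is not the RS code named in the text; the function names `encode` and `reconstruct` and the zero-padding scheme are assumptions made for illustration.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, n):
    """Split data into n equal data pieces and append one XOR check piece."""
    piece_len = -(-len(data) // n)                 # ceiling division
    padded = data.ljust(piece_len * n, b"\0")      # zero-pad to a multiple of n
    pieces = [padded[i * piece_len:(i + 1) * piece_len] for i in range(n)]
    return pieces + [reduce(xor_bytes, pieces)]


def reconstruct(slices, original_len):
    """Rebuild the data from slices in which at most one entry is None."""
    missing = [i for i, s in enumerate(slices) if s is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity tolerates only one lost slice")
    slices = list(slices)
    if missing:
        # XOR of all surviving slices equals the lost slice
        present = [s for s in slices if s is not None]
        slices[missing[0]] = reduce(xor_bytes, present)
    return b"".join(slices[:-1])[:original_len]    # drop check piece and padding


data = b"first data to be stored"
slices = encode(data, 5)      # 5 data pieces + 1 check piece
slices[2] = None              # simulate one damaged slice
restored = reconstruct(slices, len(data))
```

A real RS code with M = 3 check pieces would tolerate three lost slices instead of one, at the cost of finite-field arithmetic that is omitted here.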
Of course, the data storage method may not be limited to the erasure code storage method described above, and other storage methods that can reconstruct one piece of data from its slice data are possible, and are not limited herein.
In a second embodiment, the first allocation strategy may be to determine a first data center for storing the respective slice data according to at least one of the following information: the data center that has been previously assigned as the first data center, the survival status of each data center in the storage system, the remaining storage space of each data center in the storage system, and the load pressure of each data center in the storage system.
The following describes the four kinds of information.
Firstly, the basis for determining the first data center is: the data centers previously assigned as first data centers.
The target data center may record the identity of the first data center determined each time.
The first implementation manner is a centralized policy, that is, when the target data center determines the first data center, the target data center may select the same data center as the first data center determined last time.
For example, the storage system includes: the data center comprises a data center 1, a data center 2, a data center 3, a data center 4 and a data center 5, wherein the first data center determined last time recorded by the target data center is the data center 1 and the data center 2, and then when the target data center determines the first data center this time, the data center 1 and the data center 2 are still selected as the first data center.
Of course, when the first data center determined last time has failed, has a full storage space, or does not have enough remaining storage space to store the slice data, it cannot be used as the first data center again; the target data center can then re-determine the first data center on one of the other bases for determining the first data center.
A second implementation is a decentralized strategy, i.e. the target data center may choose a different data center than the previously determined first data center when determining the first data center. Two cases can be distinguished: in the first case, a data center different from the first data center determined last time may be selected; in the second case, a different data center may be selected than the first data center determined a number of times before.
The following description is made for the above two cases:
In the first case, when the target data center determines the first data center, a data center different from the first data center determined last time may be selected; that is, the determined first data center may be different from the first data center determined last time.
For example, the storage system includes: the data center 1, the data center 2, the data center 3, the data center 4, and the data center 5, the data center determined last time recorded by the target data center is the data center 1 and the data center 2, and then when the target data center determines the first data center this time, the data center can be selected from the data center 3, the data center 4, and the data center 5 as the first data center.
In the second case, the target data center may select a data center different from the first data center determined a plurality of times before when determining the first data center. The number of times of the previous times can be set by a user.
Based on the above implementation, further, in order to make the storage of the slice data in the data centers included in the storage system decentralized, the target data center may employ a round-robin strategy when determining the first data center.
An identifier ordering table may be preset for the data centers included in the storage system. The table records the order of the identifiers of the data centers, and the ordering rule may be user-defined. Each time the target data center determines the first data center, it allocates according to this table: the identifier of the newly determined first data center is the identifier that follows, in the table, the identifier of the first data center determined last time.
Of course, the first data center may be re-determined when the determined remaining storage space of the first data center is insufficient to store the slice data.
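The round-robin strategy over the identifier ordering table can be sketched as follows. The function name `next_first_data_centers`, the identifiers `dc1` to `dc5`, and the wrap-around from the end of the table back to its start are assumptions made for illustration; the text does not specify what happens at the end of the table.

```python
def next_first_data_centers(ordering, last_id, count):
    """Return `count` identifiers that follow `last_id` in the ordering table.

    ordering: list of data center identifiers in the preset order
    last_id:  identifier of the first data center determined last time,
              or None if nothing has been determined yet
    """
    if last_id in ordering:
        start = (ordering.index(last_id) + 1) % len(ordering)
    else:
        start = 0  # no history: begin at the top of the table
    # Assumed behavior: wrap around when the end of the table is reached
    return [ordering[(start + i) % len(ordering)] for i in range(count)]


table = ["dc1", "dc2", "dc3", "dc4", "dc5"]
first_round = next_first_data_centers(table, None, 2)
second_round = next_first_data_centers(table, first_round[-1], 2)
third_round = next_first_data_centers(table, second_round[-1], 2)
```

A caller would still need to skip any returned data center whose remaining storage space is insufficient, as noted above.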
Secondly, the basis for determining the first data center is: the survival state of each data center in the storage system.
The survival state of a data center is whether the data center is in a failed state or a healthy state.
Each data center in the storage system can obtain the survival state of every other data center, so the target data center knows the survival state of the other data centers in the storage system in real time. In this way, when determining the first data center, the target data center can exclude failed data centers in a timely manner and determine the first data center only from the healthy data centers.
Thirdly, the basis for determining the first data center is: the remaining storage space of each data center in the storage system.
The remaining storage space of each data center may be sorted, and for a data center with a larger remaining storage space, the data center may be preferentially determined as the first data center. In this way, the target data center may update the ranking of the remaining storage spaces of each data center after each determination of the first data center, and the obtained new ranking of the remaining storage spaces may be used as a basis for determining the first data center next time. In this way, the remaining storage space of each data center in the storage system may be equalized.
Fourthly, the basis for determining the first data center is: the load pressure of each data center in the storage system.
The load pressures of the data centers included in the storage system may be ranked, and a data center with a low load pressure may be preferentially determined as the first data center. In this way, the target data center may update the load pressure ranking of each data center after each determination of the first data center, and the obtained new load pressure ranking may be used as a basis for determining the first data center next time. In this way, the load pressure of the various data centers in the storage system may be equalized.
Of course, the determined first data center should have enough remaining storage space to store the corresponding slice data.
For the above four kinds of information, the information can be respectively and individually used as a basis for determining the first data center by the target data center, and any two, three or four kinds of combinations can also be used as a basis for determining the first data center by the target data center. And are not limited herein.
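One possible way to combine three of these bases (survival state, remaining storage space and load pressure) is sketched below. The `DataCenter` fields, the 0.0 to 1.0 load scale, and the tie-breaking order (more free space first, then lower load) are all assumptions chosen for illustration; the text leaves the combination open.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    alive: bool      # survival state: True = healthy, False = failed
    free_mb: int     # remaining storage space
    load: float      # hypothetical load metric, 0.0 (idle) to 1.0 (saturated)


def select_first_data_centers(centers, count, slice_mb):
    """Pick `count` healthy centers that can each hold one slice of `slice_mb`."""
    # Exclude failed centers and centers without room for a slice
    candidates = [c for c in centers if c.alive and c.free_mb >= slice_mb]
    # Prefer more remaining space; break ties by lower load pressure
    candidates.sort(key=lambda c: (-c.free_mb, c.load))
    if len(candidates) < count:
        raise ValueError("not enough eligible data centers")
    return [c.name for c in candidates[:count]]


centers = [
    DataCenter("dc1", True, 100, 0.2),
    DataCenter("dc2", False, 500, 0.1),   # failed, never selected
    DataCenter("dc3", True, 300, 0.7),
    DataCenter("dc4", True, 300, 0.3),
    DataCenter("dc5", True, 1, 0.0),      # too little space for a 2 MB slice
]
chosen = select_first_data_centers(centers, 3, 2)
```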
In a third embodiment, determining a first data center for storing each slice data from a storage system according to a preset first allocation strategy (S103), may include the following steps:
obtaining a number of slice data;
determining the obtained number of first data centers from the storage system according to a preset first distribution strategy;
and allocating a first data center for each slice of data, wherein the first data centers allocated for each slice of data are different.
This embodiment corresponds to the first case described above: the slice data and the first data centers are in a one-to-one relationship. Thus, the number of data centers determined from the storage system as first data centers equals the number of slices into which the first data is divided.
For the target data center, slicing processing is performed on the first data, so that the number of sliced data into which the first data is sliced can be acquired; the number of first data centers may then be determined.
For example, if the target data center performs slicing processing on the first data and divides the first data into 4 pieces of slice data, 4 data centers need to be determined from the storage system as the first data center in order to store the 4 pieces of slice data in a one-to-one manner.
Of course, determining the same number of first data centers as the slice data may be according to a preset first allocation strategy. Reference may be made to the second embodiment described above, which is not described in detail herein.
Because this embodiment uses one-to-one storage, the target data center may allocate one first data center to each slice data, and the first data center allocated to each slice data is different.
For example, the target data center slices the first data into 4 slice data: slice data 1, slice data 2, slice data 3 and slice data 4, and determines 4 data centers from the storage system as first data centers: first data center A, first data center B, first data center C and first data center D. Slice data 1 may then be assigned to first data center A, slice data 2 to first data center B, slice data 3 to first data center C, and slice data 4 to first data center D.
The strategy of one-to-one allocation can be allocation according to the data size of each slice data and the remaining storage space of each first data center.
In one implementation, the slice data is sorted according to the size of the data volume to serve as a first sequence, and the first data centers are sorted according to the size of the remaining storage space to serve as a second sequence; and carrying out one-to-one correspondence on the first sequence and the second sequence according to the sequence. Thus, the first-row slice data in the first sequence corresponds to the first data center of the first-row in the second sequence, the second-row slice data in the first sequence corresponds to the first data center of the second-row in the second sequence, and so on.
For example, the slice data is sorted in descending order of data volume, and the obtained first sequence is: slice data 1, slice data 2 and slice data 3. The first data centers are sorted in descending order of remaining storage space, and the obtained second sequence is: first data center C, first data center B and first data center A. Therefore, when allocating according to this correspondence, slice data 1 is allocated to first data center C, slice data 2 to first data center B, and slice data 3 to first data center A.
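The rank-by-rank pairing of the two sequences can be sketched in a few lines. The function name `pair_by_rank` and the concrete sizes and free-space figures are assumptions chosen so that the result matches the allocation in the example.

```python
def pair_by_rank(slice_sizes, free_space):
    """Pair the largest slice with the roomiest data center, and so on down."""
    if len(slice_sizes) != len(free_space):
        raise ValueError("one-to-one allocation needs equal counts")
    # First sequence: slices by data volume, largest first
    first_seq = sorted(slice_sizes, key=slice_sizes.get, reverse=True)
    # Second sequence: data centers by remaining space, largest first
    second_seq = sorted(free_space, key=free_space.get, reverse=True)
    return dict(zip(first_seq, second_seq))


assignment = pair_by_rank(
    {"slice data 1": 3, "slice data 2": 2, "slice data 3": 1},
    {"first data center A": 10, "first data center B": 20, "first data center C": 30},
)
```

Sorting on load pressure instead of remaining space would only require swapping the key and the sort direction used for the second sequence.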
The one-to-one allocation strategy may also be to allocate according to the data size of each slice data and the load pressure of each first data center, and the strategy is similar to the one-to-one allocation strategy described above and is not described herein again.
Of course, the allocation method for allocating one first data center to each slice of data is not limited to the above two allocation methods, and may include other allocation methods, which are not limited herein.
According to the technical scheme provided by the embodiment of the invention, first data are acquired and locally stored; determining slice data of the first data according to a preset data storage mode; determining a first data center for storing each slice of data from a storage system according to a preset first allocation strategy; and respectively sending each piece of slice data to the corresponding first data center, so that each first data center stores the received piece of slice data. According to the technical scheme provided by the embodiment of the invention, the data can be sliced to obtain the sliced data besides being stored locally, and the sliced data are respectively stored in the selected first data center, so that the safety and the reliability of data storage are ensured, the sliced data are respectively stored after being sliced, the storage space is saved, and the consumption of the storage space of the data center in the storage system is reduced.
On the basis of the embodiment corresponding to fig. 1, an embodiment of the present invention further provides a data storage method. As shown in fig. 2, after step S104 of the above embodiment, the method may further include the following steps:
S201, when the storage device that locally stores the first data fails, extracting the slice data of the first data from each first data center.
The storage device may be a device for storing data, such as a hard disk, a magnetic disk, and the like. For the data centers in the storage system, each data center may include a plurality of storage devices, and each data center may decide in which storage device to store data. Local storage in this application refers to storage of the target data center.
For the storage of the first data, the first data may be stored in one storage device of the target data center, or may be separately stored in a plurality of storage devices. When storing in a storage device, only need to detect whether this one storage device is out of order; when storing on multiple storage devices, the multiple storage devices need to be checked.
A storage device is considered to have failed when data cannot be read from it. If the target data center can still read data from the storage device, then upon receiving an instruction to extract the first data it reads the first data directly from the storage device storing the first data and sends the read first data to the location or device indicated by the instruction.
In addition, each first data center stores the slice data allocated to it, and all of the extracted slice data belong to the same data, namely the first data in this embodiment.
The target data center extracts the slice data from each first data center and stores the extracted slice data locally. The slice data may be extracted over a communication network or over an SAS network.
When the slice data is extracted over a communication network, that network connects the data centers in the storage system, and the extraction proceeds as follows: each first data center storing slice data sends its slice data to the target data center, and the target data center stores the received slice data locally.
When slice data is extracted through the SAS network, the SAS network is a network interconnecting data centers in the storage system, and due to the characteristics of the SAS network, each data center can directly read or write data from or into other data centers. Therefore, the process of extracting slice data by the target data center is as follows: the target data center reads each slice data from each first data center storing the slice data, and stores the read slice data locally.
S202, reconstructing the first data from the extracted slice data according to the data storage manner.
After extracting each slice data of the first data, the target data center may obtain the first data by reconstructing each extracted slice data.
The data storage mode may be an erasure code storage mode, that is, the extracted slice data may be reconstructed according to the erasure code storage mode to obtain the first data. Reference may be made to the first embodiment described above, which is not described in detail herein.
For example, the slice data of the first data includes slice data 1, slice data 2 and slice data 3. After the target data center has extracted slice data 1, slice data 2 and slice data 3, it reconstructs them according to the erasure code storage manner to obtain the first data.
In the fourth embodiment, on the basis of the above embodiment corresponding to fig. 1, after step S104, the following steps may be further included:
1. receiving a request for extracting first data sent by a first client;
2. detecting whether the storage equipment for locally storing the first data is in failure, and if not, executing the following step 3;
3. and sending the locally stored first data to the first client.
The following describes each step in the fourth embodiment.
For step 1, the first client may be a device external to the storage system, the first client may request to extract data from any data center in the storage system, and the data requested to be extracted may be stored in the data center that receives the extraction request, or may be stored in another data center other than the data center that receives the extraction request.
The target data center acts as the data center that receives the extraction request. It first determines whether the data requested by the first client is stored locally. If so, the following step 2 is executed; if not, there are two further cases:
In the first case, the data to be extracted is stored in another data center, and the target data center can extract the data from that data center;
In the second case, the data to be extracted is not stored in any other data center and only slice data of the data is stored; the target data center may extract the slice data from the data centers in which it is stored and reconstruct the slice data into the data to be extracted.
Of course, in this embodiment, the first data that the first client needs to extract is stored in the target data center.
For step 2, the target data center determines the location where the first data is stored locally, that is, in which storage device of the target data center the first data is stored. It then checks whether that storage device has failed, that is, whether the stored first data can be read from it.
If the storage device storing the first data in the target data center fails, the target data center cannot directly send the stored first data to the first client.
At this time, in one implementation, the first client may request any other first data center to extract the first data.
Specifically, in the storage system, the data centers are associated with each other. Therefore, for the first client, the identification corresponding to the data center storing the slice data of the first data, that is, the identification of the first data center, can be acquired from any data center. In this way, the first client can send a request to any one of the first data centers to extract the first data. The first data center receiving the request extracts slice data of the first data from other first data centers, reconstructs the extracted slice data according to a data storage mode to obtain the first data, and sends the reconstructed first data to the first client.
In another implementation, if the target data center can still extract data from other data centers and perform reconstruction, the target data center may extract the slice data of the first data from the first data centers. After extracting the slice data to its local storage, the target data center reconstructs the extracted slice data according to the data storage manner to obtain the first data, and then sends the reconstructed first data to the first client.
For step 3, if the storage device has not failed, the target data center can read the first data directly from local storage and send it to the first client.
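The request-handling flow of this embodiment, including the fallback to slice extraction and reconstruction, can be sketched as below. The function name `extract_first_data`, the use of `OSError` to signal an unreadable storage device, and the use of simple concatenation as a stand-in for real reconstruction are all assumptions for illustration.

```python
def extract_first_data(read_local, fetch_slices, reconstruct):
    """Serve an extraction request for the first data.

    read_local:   callable returning the locally stored data; assumed to raise
                  OSError when the storage device cannot be read (has failed)
    fetch_slices: callable returning the slices held by the first data centers
    reconstruct:  callable rebuilding the data from those slices
    """
    try:
        # Steps 2 and 3: device is healthy, answer from local storage
        return read_local()
    except OSError:
        # Device failed: extract the slices (S201) and reconstruct (S202)
        return reconstruct(fetch_slices())


def failed_device():
    raise OSError("storage device unreadable")


slices = [b"first ", b"data"]
# Concatenation stands in for erasure-code reconstruction in this sketch
recovered = extract_first_data(failed_device, lambda: slices, b"".join)
healthy = extract_first_data(lambda: b"first data", lambda: slices, b"".join)
```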
In the fifth embodiment, the target data center may locally store the slice data of the second data, that is, the target data center locally stores only the slice data of the second data, but does not store the second data.
The data storage method provided by the embodiment of the invention can further comprise the following steps:
firstly, receiving a request for extracting second data sent by a second client; determining a data center storing slice data of second data as a second data center;
secondly, slice data of the second data are extracted from each second data center; reconstructing the second data using the extracted slice data of the second data in a data storage manner; and sending the reconstructed second data to the second client.
The above steps in the fifth embodiment are described below.
For step one, the target data center receives a request to extract the second data. Because the second data itself is not stored locally at the target data center, it cannot be read directly from local storage, and it can only be obtained by extracting the slice data of the second data.
The target data center may determine, from the storage system, another data center in which the slice data of the second data is stored, and use the determined data center as the second data center.
For example, the slice data of the second data includes slice data 1, slice data 2, slice data 3 and slice data 4. In the storage system, slice data 1 is stored in the target data center, slice data 2 is stored in data center A, slice data 3 is stored in data center B, and slice data 4 is stored in data center C. When the target data center receives a request from the second client to extract the second data, it may determine that data center A, data center B and data center C store slice data of the second data, and use data center A, data center B and data center C as the second data centers.
For step two, after the target data center has determined the second data centers, it can extract the corresponding slice data of the second data from each second data center, reconstruct the second data from the extracted slice data according to the data storage manner, and send the reconstructed second data to the second client.
For the data storage manner, reference may be made to the first embodiment, and other portions may be similar to the portions of steps S201 and S202, which are not described herein again.
For example, the slice data of the second data includes slice data 1, slice data 2 and slice data 3, where slice data 1 is stored in the target data center, slice data 2 is stored in data center A, and slice data 3 is stored in data center B. When the target data center receives a request from the second client to extract the second data, it can extract slice data 2 from data center A and slice data 3 from data center B, reconstruct the locally stored slice data 1 together with the extracted slice data 2 and slice data 3 according to the erasure code storage manner to obtain the second data, and send the reconstructed second data to the second client.
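The two steps of this embodiment, determining the second data centers and then gathering the slices, can be sketched with plain dictionaries. The slice-location map, the in-memory `stores` structure and both function names are hypothetical; how a data center actually learns where slices are stored is not specified here.

```python
def locate_second_data_centers(slice_locations, target):
    """Return the data centers, other than the target, that hold slices."""
    return sorted({dc for dc in slice_locations.values() if dc != target})


def gather_slices(slice_locations, stores):
    """Collect all slices in slice-number order, local and remote alike."""
    return [stores[dc][num] for num, dc in sorted(slice_locations.items())]


# Hypothetical layout: slice number -> data center holding it
slice_locations = {1: "target", 2: "A", 3: "B"}
# Hypothetical in-memory stand-in for each data center's storage
stores = {
    "target": {1: b"sec"},
    "A": {2: b"ond "},
    "B": {3: b"data"},
}

second_centers = locate_second_data_centers(slice_locations, "target")
# Concatenation stands in for erasure-code reconstruction in this sketch
second_data = b"".join(gather_slices(slice_locations, stores))
```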
According to the technical scheme provided by the embodiment of the invention, the data can be sliced to obtain the sliced data besides being stored locally, and the sliced data are respectively stored in the selected first data center, so that the safety and the reliability of data storage are ensured, the sliced data are respectively stored after being sliced, the storage space is saved, and the consumption of the storage space of the data center in the storage system is reduced.
Corresponding to the embodiment shown in fig. 1, an embodiment of the present invention further provides a data storage device, which is applied to a target data center in a storage system, where the storage system includes at least two data centers;
as shown in fig. 3, the apparatus includes:
an obtaining module 310, configured to obtain first data and store the first data locally;
a first determining module 320, configured to determine slice data of the first data according to a preset data storage manner, where the data storage manner is: a storage mode capable of reconstructing data from slice data of the data;
a second determining module 330, configured to determine, according to a preset first allocation policy, a first data center for storing each slice data from the storage system;
and the first sending module 340 is configured to send each slice data to the corresponding first data center.
Optionally, in an embodiment, the second determining module 330 includes:
an obtaining submodule for obtaining a number of slice data;
the first determining submodule is used for determining the obtained number of first data centers from the storage system according to a preset first allocation strategy;
and the first allocating submodule is used for allocating a first data center for each slice of data, wherein the first data centers allocated to each slice of data are different.
Optionally, in an embodiment, the second determining module 330 includes:
the sorting submodule is used for sorting the data centers included in the storage system according to a preset first distribution strategy;
a second determining submodule, configured to determine a first data center according to the sorting, where a sum of remaining storage spaces of the determined first data center is greater than a sum of data amounts of the determined slice data;
and the second distribution submodule is used for distributing the determined slice data to the first data centers according to the residual storage space of the first data centers so that each slice data is distributed to one first data center.
Optionally, in an embodiment, the data storage manner may be: erasure code storage.
According to the technical scheme provided by the embodiment of the invention, the data can be sliced to obtain the sliced data besides being stored locally, and the sliced data are respectively stored in the selected first data center, so that the safety and the reliability of data storage are ensured, the sliced data are respectively stored after being sliced, the storage space is saved, and the consumption of the storage space of the data center in the storage system is reduced.
Corresponding to the embodiment shown in fig. 2, an embodiment of the present invention further provides a data storage device. As shown in fig. 4, on the basis of the embodiment shown in fig. 3, the data storage device may further include:
a first extracting module 410, configured to extract slice data of the first data from each first data center in the case that a storage device that locally stores the first data fails;
a first reconstruction module 420 for reconstructing the first data using the extracted slice data according to the data storage manner.
Optionally, in an embodiment, the apparatus may further include:
a first receiving module, configured to receive a request, sent by a first client, for extracting the first data;
a detection module, configured to detect whether the storage device storing the first data in the target data center has failed and, if so, to trigger the first extracting module 410;
and a second sending module, configured to send the locally stored first data to the first client when the detection result of the detection module is negative.
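The interaction between the receiving, detection, extraction, reconstruction, and sending modules amounts to a read path with a fallback. The Python sketch below shows that control flow under stated assumptions: the function name and all parameters (local_store, local_device_failed, fetch_slices, rebuild) are hypothetical stand-ins for the modules described above, not interfaces defined by the patent.

```python
from typing import Callable, Dict, List


def read_first_data(
    key: str,
    local_store: Dict[str, bytes],
    local_device_failed: Callable[[str], bool],
    fetch_slices: Callable[[str], List[bytes]],
    rebuild: Callable[[List[bytes]], bytes],
) -> bytes:
    """Serve a client's request for first data at the target data center."""
    if not local_device_failed(key):
        # Detection result is negative: return the locally stored copy.
        return local_store[key]
    # The local storage device has failed: extract the slice data from the
    # first data centers and reconstruct according to the data storage manner.
    return rebuild(fetch_slices(key))
```

Passing the failure check and the slice retrieval as callables keeps the sketch independent of any particular failure detector or network transport.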
Optionally, in one embodiment, the slice data of the second data is stored locally;
the apparatus may further include:
a second receiving module, configured to receive a request, sent by a second client, for extracting the second data;
a third determining module, configured to determine each data center storing slice data of the second data as a second data center;
a second extraction module, configured to extract the slice data of the second data from each second data center;
a second reconstruction module, configured to reconstruct the second data from the extracted slice data of the second data according to the data storage manner;
and a third sending module, configured to send the reconstructed second data to the second client.
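When the target data center holds only a slice of the second data, serving a request involves locating every data center that holds a slice, gathering the slices, and reconstructing. The following Python sketch illustrates this under assumptions that are not in the patent: a slice_locations index mapping a data identifier to an ordered list of data-center names, a remote_get callable for fetching a slice from another data center, and the convention that the target data center itself appears in that list.

```python
from typing import Callable, Dict, List


def read_second_data(
    data_id: str,
    self_name: str,
    local_slices: Dict[str, bytes],
    slice_locations: Dict[str, List[str]],
    remote_get: Callable[[str, str], bytes],
    rebuild: Callable[[List[bytes]], bytes],
) -> bytes:
    """Reconstruct second data whose slices are spread over several data centers."""
    # Determine the second data centers: every center holding a slice.
    second_centers = slice_locations[data_id]
    slices = []
    for dc in second_centers:
        if dc == self_name:
            slices.append(local_slices[data_id])    # slice stored locally
        else:
            slices.append(remote_get(dc, data_id))  # slice stored remotely
    # Reconstruct according to the data storage manner.
    return rebuild(slices)
```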
According to the technical solution provided by this embodiment of the present invention, in addition to being stored locally, the data is sliced, and the resulting slice data are stored in the selected first data centers, which ensures the security and reliability of data storage. Because the data is sliced and the slices are stored separately, storage space is saved and the consumption of data-center storage space in the storage system is reduced.
An embodiment of the present invention further provides an electronic device, as shown in fig. 5, including a processor 510 and a memory 520;
a memory 520 for storing a computer program;
the processor 510, when executing the program stored in the memory 520, implements the following steps:
acquiring first data and storing the first data locally;
determining slice data of the first data according to a preset data storage manner, wherein the data storage manner is a storage manner in which data can be reconstructed from the slice data of the data;
determining, from the storage system and according to a preset first allocation strategy, a first data center for storing each piece of slice data;
and sending each piece of slice data to its corresponding first data center.
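The four steps above form a simple write pipeline, sketched below in Python. The function name and its parameters (make_slices, choose_centers, send) are hypothetical placeholders for the slicing, allocation, and transmission steps; the patent does not prescribe these interfaces.

```python
from typing import Callable, Dict, List


def store_first_data(
    key: str,
    data: bytes,
    local_store: Dict[str, bytes],
    make_slices: Callable[[bytes], List[bytes]],
    choose_centers: Callable[[List[bytes]], List[str]],
    send: Callable[[str, str, int, bytes], None],
) -> List[str]:
    """Store data locally, slice it, and send each slice to its data center."""
    local_store[key] = data            # step 1: store the first data locally
    slices = make_slices(data)         # step 2: determine the slice data
    centers = choose_centers(slices)   # step 3: one data-center name per slice
    if len(centers) != len(slices):
        raise ValueError("allocation must name one data center per slice")
    for index, (dc, piece) in enumerate(zip(centers, slices)):
        send(dc, key, index, piece)    # step 4: send each slice
    return centers
```

Returning the list of chosen data centers lets the caller record where each slice went, which is needed later to extract the slices for reconstruction.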
According to the technical solution provided by this embodiment of the present invention, in addition to being stored locally, the data is sliced, and the resulting slice data are stored in the selected first data centers, which ensures the security and reliability of data storage. Because the data is sliced and the slices are stored separately, storage space is saved and the consumption of data-center storage space in the storage system is reduced.
Of course, the electronic device provided in this embodiment of the present invention may also execute the data storage method described in any of the above embodiments. For details, see the embodiments corresponding to fig. 1 and fig. 2, which are not repeated here.
In another embodiment of the present invention, a computer-readable storage medium is further provided, which stores instructions that, when executed on a computer, cause the computer to perform a data storage method as described in any one of the embodiments corresponding to fig. 1 and fig. 2.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, such as a Central Processing Unit (CPU) or a Network Processor (NP); it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A data storage method, characterized in that the method is applied to a target data center in a storage system, wherein the storage system comprises at least two data centers;
the method comprises the following steps:
acquiring first data and locally storing the first data;
determining slice data of the first data according to a preset data storage manner, wherein the data storage manner is a storage manner in which data can be reconstructed from the slice data of the data;
determining a first data center for storing each slice of data from the storage system according to a preset first allocation strategy;
respectively sending each slice data to a corresponding first data center;
extracting slice data of first data from each first data center under the condition that a storage device for locally storing the first data fails;
reconstructing the first data using the extracted slice data in the data storage manner;
wherein determining, from the storage system and according to the preset first allocation strategy, a first data center for storing each piece of slice data comprises:
sorting all data centers in the storage system in descending order of remaining storage space, and determining the highest-ranked data centers as the first data centers, wherein the sum of the remaining storage space of the determined first data centers is greater than the sum of the data amounts of the determined slice data, and the number of determined first data centers is smaller than the number of pieces of slice data;
and allocating the slice data to the first data centers according to the remaining storage space and the load pressure of each first data center, wherein a first data center with more remaining storage space and lower load pressure is allocated more slice data.
2. The method according to claim 1, wherein the data storage manner is: erasure code storage.
3. The method according to claim 1, wherein, after each piece of slice data is sent to its corresponding first data center, the method further comprises:
receiving a request for extracting the first data sent by a first client;
detecting whether a storage device locally storing the first data fails;
and if not, sending the first data stored locally to the first client.
4. The method of claim 1, wherein the slice data of the second data is stored locally;
the method further comprises the following steps:
receiving a request for extracting second data sent by a second client;
determining a data center storing slice data of the second data as a second data center;
extracting slice data of the second data from the respective second data centers;
reconstructing the second data by using the extracted slice data of the second data according to the data storage mode;
and sending the reconstructed second data to the second client.
5. A data storage device, characterized in that the device is applied to a target data center in a storage system, wherein the storage system comprises at least two data centers;
the device comprises:
the acquisition module is used for acquiring first data and locally storing the first data;
the first determining module is configured to determine slice data of the first data according to a preset data storage manner, wherein the data storage manner is a storage manner in which data can be reconstructed from the slice data of the data;
the second determining module is configured to determine, from the storage system and according to a preset first allocation strategy, a first data center for storing each piece of slice data;
the first sending module is used for sending each piece of slice data to the corresponding first data center;
the first extraction module is used for extracting slice data of the first data from each first data center under the condition that a storage device for locally storing the first data fails;
a first reconstruction module for reconstructing the first data using the extracted slice data according to the data storage manner;
wherein determining, from the storage system and according to the preset first allocation strategy, a first data center for storing each piece of slice data comprises:
sorting all data centers in the storage system in descending order of remaining storage space, and determining the highest-ranked data centers as the first data centers, wherein the sum of the remaining storage space of the determined first data centers is greater than the sum of the data amounts of the determined slice data, and the number of determined first data centers is smaller than the number of pieces of slice data;
and allocating the slice data to the first data centers according to the remaining storage space and the load pressure of each first data center, wherein a first data center with more remaining storage space and lower load pressure is allocated more slice data.
6. The apparatus of claim 5, wherein the data storage manner is: erasure code storage.
7. The apparatus of claim 5, further comprising:
a first receiving module, configured to receive a request, sent by a first client, for extracting the first data;
a detection module, configured to detect whether the storage device storing the first data in the target data center has failed;
and a second sending module, configured to send the locally stored first data to the first client when the detection result of the detection module is negative.
8. The apparatus of claim 5, wherein slice data of the second data is stored locally;
the device further comprises:
the second receiving module is used for receiving a request for extracting second data sent by a second client;
a third determining module, configured to determine, as a second data center, a data center in which slice data of the second data is stored;
a second extraction module for extracting slice data of the second data from each second data center;
a second reconstruction module, configured to reconstruct the second data using the extracted slice data of the second data according to the data storage manner;
and the third sending module is used for sending the reconstructed second data to the second client.
9. An electronic device comprising a processor and a memory;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1 to 4 when executing a program stored in the memory.
10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 4.
CN201810193321.6A 2018-03-09 2018-03-09 Data storage method and device Active CN110244903B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810193321.6A CN110244903B (en) 2018-03-09 2018-03-09 Data storage method and device
PCT/CN2019/077440 WO2019170133A1 (en) 2018-03-09 2019-03-08 Data storage method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810193321.6A CN110244903B (en) 2018-03-09 2018-03-09 Data storage method and device

Publications (2)

Publication Number Publication Date
CN110244903A CN110244903A (en) 2019-09-17
CN110244903B true CN110244903B (en) 2021-08-13

Family

ID=67845874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810193321.6A Active CN110244903B (en) 2018-03-09 2018-03-09 Data storage method and device

Country Status (2)

Country Link
CN (1) CN110244903B (en)
WO (1) WO2019170133A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113268523B (en) * 2021-05-14 2022-02-01 刘伟铭 Product multi-process industrial data equal-division slice alignment storage and search system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102279777A (en) * 2011-08-18 2011-12-14 成都市华为赛门铁克科技有限公司 Method and device for processing data redundancy and distributed storage system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8656253B2 (en) * 2011-06-06 2014-02-18 Cleversafe, Inc. Storing portions of data in a dispersed storage network
MX2013005303A (en) * 2013-05-10 2013-08-07 Fondo De Informacion Y Documentacion Para La Ind Infotec High-performance system and process for treating and storing data, based on affordable components, which ensures the integrity and availability of the data for the handling thereof.
CN105573680A (en) * 2015-12-25 2016-05-11 北京奇虎科技有限公司 Storage method and device for replicated data
CN107436725B (en) * 2016-05-25 2019-12-20 杭州海康威视数字技术股份有限公司 Data writing and reading methods and devices and distributed object storage cluster
CN106686095A (en) * 2016-12-30 2017-05-17 郑州云海信息技术有限公司 Data storage method and device based on erasure code technology
CN106909470A (en) * 2017-01-20 2017-06-30 深圳市中博科创信息技术有限公司 Distributed file system storage method and device based on correcting and eleting codes
CN106649891A (en) * 2017-02-24 2017-05-10 深圳市中博睿存信息技术有限公司 Distributed data storage method and system
CN107273048B (en) * 2017-06-08 2020-08-04 浙江大华技术股份有限公司 Data writing method and device

Also Published As

Publication number Publication date
WO2019170133A1 (en) 2019-09-12
CN110244903A (en) 2019-09-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant