CN113297005A - Data processing method, device and equipment - Google Patents


Publication number
CN113297005A
Authority
CN
China
Prior art keywords
copy
storage node
storage
copies
storage nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010733462.XA
Other languages
Chinese (zh)
Other versions
CN113297005B (en)
Inventor
陈云星
陈希
张包峰
潘岳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010733462.XA
Publication of CN113297005A
Application granted
Publication of CN113297005B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1446: Point-in-time backing up or restoration of persistent data
    • G06F11/1448: Management of the data involved in backup or backup restore

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data processing method comprising: acquiring target data to be stored, wherein the target data comprises data fragments, and each data fragment comprises a first copy and a second copy; acquiring storage nodes for storing the target data; placing the first copies on the storage nodes in sequence according to the numbers of the first copies and the numbers of the storage nodes, to obtain a first number of first copy storage node rows, wherein each first copy storage node row is the storage space generated by placing first copies on all of the storage nodes in order of storage node number; and placing the second copies on the storage nodes in sequence according to a specified rule, to obtain a first number of second copy storage node rows; wherein a first copy storage node row and the second copy storage node row with the same ordinal contain the same set of copy numbers, and the numbers of the first copies are staggered relative to the numbers of the second copies.

Description

Data processing method, device and equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus, an electronic device, and a storage device.
Background
In a database product, copies (replicas) are conventionally organized in groups so that an application system keeps running smoothly, with little or no impact, while a damaged copy is being repaired; for example, 3 copies are managed as one group across 3 physical nodes.
In the prior art, copies are placed as follows. Ceph places copies by hashing, and a hash-based scheme achieves a balanced distribution only when the number of objects is large.
Therefore, the prior-art method of placing copies fails to balance the load when the number of copies is small.
Disclosure of Invention
The application provides a data processing method, electronic equipment and storage equipment, and aims to solve the problem that a method for placing copies in the prior art has no balancing effect when the number of copies is small.
The application provides a data processing method, which comprises the following steps:
acquiring target data to be stored, wherein the target data comprises data fragments, and each data fragment comprises a first copy and a second copy;
acquiring storage nodes for storing the target data;
placing the first copies on the storage nodes in sequence according to the numbers of the first copies and the numbers of the storage nodes, to obtain a first number of first copy storage node rows, wherein each first copy storage node row is the storage space generated by placing first copies on all of the storage nodes in order of storage node number;
placing the second copies on the storage nodes in sequence according to a specified rule, to obtain a first number of second copy storage node rows; wherein a first copy storage node row and the second copy storage node row with the same ordinal contain the same set of copy numbers, and the numbers of the first copies are staggered relative to the numbers of the second copies.
Optionally, the staggering of the numbers of the first copies relative to the numbers of the second copies includes:
on the same storage node, the number of the first copy in a first copy storage node row differs from the number of the second copy in the second copy storage node row with the same ordinal.
Optionally, the first copy is a copy providing read-write service, and the second copy performs data synchronization with the first copy.
Optionally, the placing the first copies on the storage nodes in sequence according to the numbers of the first copies and the numbers of the storage nodes, to obtain a first number of first copy storage node rows, wherein each first copy storage node row is the storage space generated by placing first copies on all of the storage nodes in order of storage node number, includes:
numbering the first copy to obtain the number of the first copy;
numbering the storage nodes to obtain the numbers of the storage nodes;
and sequentially placing the first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows.
Optionally, the placing the second copies on the storage nodes in sequence according to a specified rule, to obtain a first number of second copy storage node rows, includes:
obtaining the ordinal number of a designated first copy storage node row;
obtaining the number of a first copy in the designated first copy storage node row;
obtaining the number of the second copy according to the ordinal number of the designated first copy storage node row and the number of the first copy in the designated first copy storage node row;
and placing the corresponding second copy on the storage node according to the number of the second copy, to obtain a first number of second copy storage node rows.
Optionally, the obtaining the number of the second copy according to the ordinal number of the designated first copy storage node row and the number of the first copy in the designated first copy storage node row includes:
performing a remainder calculation on the ordinal number of the designated first copy storage node row and the number of storage nodes, to obtain a first calculation result;
obtaining, according to the first calculation result, the offset by which the second copy is staggered from the first copy;
and obtaining the number of the data fragment corresponding to the second copy according to the offset and the number of the first copy.
Optionally, each data fragment further includes a third copy;
the method further comprises the following steps:
when placing the third copies, determining, according to the number of storage nodes, whether a uniform placement solution exists;
and if not, placing the third copies of the data fragments by using a heuristic algorithm.
Optionally, the method further includes:
receiving failure information of a storage node reported by a failed storage node;
and executing failover operation on the first copy of the data fragment on the failed storage node according to the failure information of the storage node.
Optionally, the storage node failure information includes at least one of:
input/output exception information;
and timeout information indicating that the storage node exceeded a preset duration when reading a copy of a data fragment stored on it.
Optionally, the performing, according to the storage node failure information, a failover operation on the first copies of the data fragments on the failed storage node includes:
obtaining the numbers of the first copies on the failed storage node;
obtaining, according to the numbers of the first copies on the failed storage node, the target positions on other storage nodes of the second copies corresponding to those first copies;
obtaining new first copies according to the target positions;
and using the new first copies to take over the work of the first copies on the failed storage node.
The present application also provides a data processing apparatus, comprising:
a target data acquisition unit, configured to acquire target data to be stored, wherein the target data comprises data fragments, and each data fragment comprises a first copy and a second copy;
a storage node acquisition unit configured to acquire a storage node for storing the target data;
a first copy placing unit, configured to place the first copies on the storage nodes in sequence according to the numbers of the first copies and the numbers of the storage nodes, to obtain a first number of first copy storage node rows, wherein each first copy storage node row is the storage space generated by placing first copies on all of the storage nodes in order of storage node number;
a second copy placing unit, configured to place the second copies on the storage nodes in sequence according to a specified rule, to obtain a first number of second copy storage node rows; wherein a first copy storage node row and the second copy storage node row with the same ordinal contain the same set of copy numbers, and the numbers of the first copies are staggered relative to the numbers of the second copies.
The present application further provides an electronic device, comprising:
a processor; and
a memory for storing a program of a data processing method, the apparatus performing the following steps after being powered on and running the program of the data processing method by the processor:
acquiring target data to be stored, wherein the target data comprises data fragments, and each data fragment comprises a first copy and a second copy;
acquiring storage nodes for storing the target data;
placing the first copies on the storage nodes in sequence according to the numbers of the first copies and the numbers of the storage nodes, to obtain a first number of first copy storage node rows, wherein each first copy storage node row is the storage space generated by placing first copies on all of the storage nodes in order of storage node number;
placing the second copies on the storage nodes in sequence according to a specified rule, to obtain a first number of second copy storage node rows; wherein a first copy storage node row and the second copy storage node row with the same ordinal contain the same set of copy numbers, and the numbers of the first copies are staggered relative to the numbers of the second copies.
The present application also provides a storage device storing a program of a data processing method, the program being executed by a processor to perform the steps of:
acquiring target data to be stored, wherein the target data comprises data fragments, and each data fragment comprises a first copy and a second copy;
acquiring storage nodes for storing the target data;
placing the first copies on the storage nodes in sequence according to the numbers of the first copies and the numbers of the storage nodes, to obtain a first number of first copy storage node rows, wherein each first copy storage node row is the storage space generated by placing first copies on all of the storage nodes in order of storage node number;
placing the second copies on the storage nodes in sequence according to a specified rule, to obtain a first number of second copy storage node rows; wherein a first copy storage node row and the second copy storage node row with the same ordinal contain the same set of copy numbers, and the numbers of the first copies are staggered relative to the numbers of the second copies.
Compared with the prior art, the invention has the following advantages:
The data processing method and apparatus, electronic device, and storage device provided by the present application acquire target data to be stored, wherein the target data comprises data fragments, and each data fragment comprises a first copy and a second copy; acquire storage nodes for storing the target data; place the first copies on the storage nodes in sequence according to the numbers of the first copies and the numbers of the storage nodes, to obtain a first number of first copy storage node rows; and place the second copies on the storage nodes in sequence according to a specified rule, to obtain a first number of second copy storage node rows. A first copy storage node row and the second copy storage node row with the same ordinal contain the same set of copy numbers, and the numbers of the first copies are staggered relative to the numbers of the second copies. As a result, the first copies and the second copies are distributed evenly across the storage nodes regardless of how many copies there are, which solves the prior-art problem that copy placement is unbalanced when the number of copies is small.
Drawings
Fig. 1 is a flowchart of a data processing method according to a first embodiment of the present application.
Fig. 2 is a schematic diagram of a data distribution effect provided in the first embodiment of the present application.
Fig. 3 is a schematic diagram of a data processing apparatus according to a second embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.
A first embodiment of the present application provides a data processing method, and an execution subject of the data processing method may be a control node server. The following description will be made with reference to fig. 1 and 2.
As shown in fig. 1, in step S101, target data to be stored is obtained, where the target data includes data fragments, and the data fragments include a first copy and a second copy.
A data fragment, i.e., a shard: the distributed storage system divides data into a plurality of shards (e.g., 1024) and manages each shard as a unit, and each shard has a plurality of copies. A data fragment may also be referred to as a data management unit.
Copies (replicas) are redundant physical units of data that a distributed storage system uses to tolerate failures; the failure probabilities of the copies are generally independent of one another. Data is synchronized between copies through a protocol, which keeps the data consistent across the copies. Generally, at least 3 copies are configured to improve reliability and consistency, and data consistency is ensured through a voting mechanism.
The first copy may be a copy providing read-write service, and the second copy performs data synchronization with the first copy.
The copy that provides read-write service, i.e., the Leader copy, serves external read and write requests.
As shown in FIG. 2, the first copy and the second copy with the same number are not placed on the same storage node.
As shown in fig. 1, in step S102, a storage node for storing the target data is acquired.
In fig. 2, the distributed storage system has storage nodes Node0-Node17, for a total of 18 storage nodes. These storage nodes are numbered 0-17.
As shown in fig. 1, in step S103, the first copies are placed on the storage nodes in sequence according to the numbers of the first copies and the numbers of the storage nodes, to obtain a first number of first copy storage node rows, wherein each first copy storage node row is the storage space generated by placing first copies on all of the storage nodes in order of storage node number.
The first number is the number of first copy storage node rows. In fig. 2, the first number is 34.
Placing the first copies on the storage nodes in sequence according to the numbers of the first copies and the numbers of the storage nodes, to obtain a first number of first copy storage node rows, includes:
numbering the first copy to obtain the number of the first copy;
numbering the storage nodes to obtain the numbers of the storage nodes;
and sequentially placing the first copies on the storage nodes according to the serial numbers of the first copies and the storage nodes to obtain a first number of first copy storage node rows.
As shown in FIG. 2, the first copies are numbered 0-611, and the storage nodes in the distributed storage system are numbered 0-17. The first copies numbered 0-17 are placed in the first row on storage nodes 0-17, the first copies numbered 18-35 are placed in the fourth row, and so on, until the first copies numbered 594-611 are placed in row 100.
As can be seen in fig. 2, placement wraps to a new row after first copy 17 is placed on Node17. The next row of first copies starts at the fourth row, because the second and third rows hold the second copies and the third copies, respectively.
The storage space in which first copies 0-17 are located is referred to as the first of the first copy storage node rows.
The storage space in which first copies 18-35 are located is referred to as the second of the first copy storage node rows, and so on.
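The row-by-row placement of the first copies can be sketched as follows. This is a minimal illustration only, assuming Python and the Fig. 2 figures (612 first copies, 18 storage nodes, three copies per data fragment); the names `place_first_copies` and `physical_row` are hypothetical and do not appear in the application.

```python
def place_first_copies(num_copies: int, num_nodes: int) -> list[list[int]]:
    """Return the first copy storage node rows.

    rows[g][j] is the number of the first copy placed on storage node j
    in the g-th (0-based) first copy storage node row.
    """
    # Assumption of this sketch: the copies fill whole rows exactly.
    if num_nodes <= 0 or num_copies % num_nodes != 0:
        raise ValueError("this sketch assumes whole rows only")
    return [
        list(range(g * num_nodes, (g + 1) * num_nodes))
        for g in range(num_copies // num_nodes)
    ]


def physical_row(group: int, copies_per_fragment: int = 3) -> int:
    """1-based row of the Fig. 2 layout that holds the first copies of a group.

    Rows are interleaved: first copies, second copies, third copies, repeat.
    """
    return group * copies_per_fragment + 1


rows = place_first_copies(612, 18)
```

With these inputs the sketch yields 34 rows, and the last group of first copies (594-611) lands in physical row 100, as described for Fig. 2.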
As shown in fig. 1, in step S104, the second copies are placed on the storage nodes in sequence according to a specified rule, to obtain a first number of second copy storage node rows, wherein a first copy storage node row and the second copy storage node row with the same ordinal contain the same set of copy numbers, and the numbers of the first copies are staggered relative to the numbers of the second copies.
The staggering of the numbers of the first copies relative to the numbers of the second copies includes:
on the same storage node, the number of the first copy in a first copy storage node row differs from the number of the second copy in the second copy storage node row with the same ordinal.
As shown in fig. 2, the second copies occupy the second row (numbered 1-17, 0), the fifth row (numbered 20-35, 18, 19), the eighth row (numbered 39-53, 36-38), and so on, with one row of second copies in every three rows.
The data fragment may also include a third copy. As shown in FIG. 2, the third copies occupy the third row (numbered 2-17, 0, 1), the sixth row (numbered 22-35, 18-21), the ninth row (numbered 42-53, 36-41), and so on, with one row of third copies in every three rows.
The first copy, the second copy and the third copy with the same number store the same data. Among them, the Leader copy provides read-write service externally.
In fig. 2, each set of storage node rows includes a first copy storage node row, a second copy storage node row, and a third copy storage node row, and each set constitutes a group. As shown in FIG. 2, Group0 includes the first copies of the first row, the second copies of the second row, and the third copies of the third row. Group1 includes the first copies of the fourth row, the second copies of the fifth row, and the third copies of the sixth row, and so on. There are 34 groups in total.
As shown in FIG. 2, the first copies of the first row are numbered 0-17 and the second copies of the second row are numbered 1-17, 0. Both rows contain the same set of numbers; they differ only in that the numbers are staggered.
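The two properties just described (same set of numbers, staggered positions) can be checked with a short sketch. It assumes Python and treats a staggered row as a rotation of the first copy row by a given offset; the offsets passed in below are read off the Fig. 2 rows quoted above, and the names `staggered_row` and `is_valid_stagger` are hypothetical.

```python
def staggered_row(group: int, num_nodes: int, offset: int) -> list[int]:
    """Copy numbers held by nodes 0..num_nodes-1 in a row rotated by `offset`.

    `group` is the 0-based group index; the first copy row of that group
    holds the numbers group*num_nodes .. group*num_nodes + num_nodes - 1.
    """
    base = group * num_nodes
    return [base + (node + offset) % num_nodes for node in range(num_nodes)]


def is_valid_stagger(first_row: list[int], other_row: list[int]) -> bool:
    """True if both rows hold the same set of numbers and no storage node
    holds the same number in both rows."""
    same_set = set(first_row) == set(other_row)
    no_collision = all(a != b for a, b in zip(first_row, other_row))
    return same_set and no_collision


first_row = list(range(18))            # Group0 first copies: 0-17
second_row = staggered_row(0, 18, 1)   # Group0 second copies: 1-17, 0
```

An offset of 0 would fail the check, because every node would then hold both copies of the same data fragment.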
Placing the second copies on the storage nodes in sequence according to a specified rule, to obtain a first number of second copy storage node rows, wherein a first copy storage node row and the second copy storage node row with the same ordinal contain the same set of copy numbers, and on the same storage node the number of the first copy in the first copy storage node row differs from the number of the second copy in the second copy storage node row with the same ordinal, includes:
obtaining the ordinal number of a designated first copy storage node row;
obtaining the number of a first copy in the designated first copy storage node row;
obtaining the number of the second copy according to the ordinal number of the designated first copy storage node row and the number of the first copy in the designated first copy storage node row;
and placing the corresponding second copy on the storage node according to the number of the second copy, to obtain a first number of second copy storage node rows.
Obtaining the number of the second copy according to the ordinal number of the designated first copy storage node row and the number of the first copy in the designated first copy storage node row includes:
performing a remainder calculation on the ordinal number of the designated first copy storage node row and the number of storage nodes, to obtain a first calculation result;
obtaining, according to the first calculation result, the offset by which the second copy is staggered from the first copy;
and obtaining the number of the data fragment corresponding to the second copy according to the offset and the number of the first copy.
Specifically, performing a remainder calculation on the ordinal number of the designated first copy storage node row and the number of storage nodes, to obtain a first calculation result, includes:
when the ordinal number of the designated first copy storage node row is less than or equal to the number of storage nodes, taking the ordinal number of the designated first copy storage node row modulo the number of storage nodes to obtain the first calculation result;
when the ordinal number of the designated first copy storage node row is greater than the number of storage nodes, subtracting 1 from the number of storage nodes to obtain a second calculation result;
obtaining a third calculation result according to the ordinal number of the designated first copy storage node row and the second calculation result;
and taking the third calculation result modulo the number of storage nodes to obtain the first calculation result.
The third calculation result may be obtained by subtracting a multiple of the second calculation result from the ordinal number of the designated first copy storage node row.
Obtaining, according to the first calculation result, the offset by which the second copy is staggered from the first copy includes:
taking the difference obtained by subtracting the first calculation result from the number of storage nodes as the offset by which the second copy is staggered from the first copy; or,
taking the first calculation result as the offset by which the second copy is staggered from the first copy.
As shown in FIG. 2, for the first group of storage node rows, i.e., group0, the ordinal number of the first copy storage node row is 1, the number of storage nodes is 18, and the first row holds the first copies numbered 0-17. First, the ordinal number 1 is taken modulo 18, which gives a first calculation result of 1, and 1 is used as the offset by which the second copies are staggered from the first copies. For the first copy numbered 0, the number of the data fragment corresponding to the second copy is 0+1=1, so the first copy numbered 0 corresponds to the second copy numbered 1, and the second copy numbered 1 is placed in the row directly below the first copy numbered 0. For the first copy numbered 1, the number of the data fragment corresponding to the second copy is 1+1=2, so the first copy numbered 1 corresponds to the second copy numbered 2, and the second copy numbered 2 is placed in the row directly below the first copy numbered 1.
For example, for the fourth group of storage node rows, i.e., group3, the ordinal number of the first copy storage node row is 4 and the number of storage nodes is 18. First, the ordinal number 4 is taken modulo 18, which gives a first calculation result of 4, and 4 is used as the offset by which the second copies are staggered from the first copies. For the first copy numbered 54, the number of the data fragment corresponding to the second copy is 54+4=58, so the first copy numbered 54 corresponds to the second copy numbered 58, and the second copy numbered 58 is placed in the row directly below the first copy numbered 54.
For another example, for the nineteenth group of storage node rows, i.e., group18, the ordinal number of the first copy storage node row is 19 and the number of storage nodes is 18. First, 1 is subtracted from the number of storage nodes to obtain 17; 17 is then subtracted from the ordinal number 19 to obtain a third calculation result of 2; the third calculation result 2 is taken modulo the number of storage nodes to obtain a first calculation result of 2, and 2 is used as the offset by which the second copies are staggered from the first copies. For the first copy numbered 325, the number of the data fragment corresponding to the second copy is 325+2=327, so the first copy numbered 325 corresponds to the second copy numbered 327, and the second copy numbered 327 is placed in the row directly below the first copy numbered 325.
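The offset rule and the three worked examples above can be reproduced with the sketch below. It is an illustration under stated assumptions, not the application's definitive formula: the closed form `(ordinal - 1) % (num_nodes - 1) + 1` is one way to condense "subtract a multiple of the number of storage nodes minus 1, then take the remainder", and the function names are hypothetical. One point is an explicit assumption: for an ordinal equal to the number of storage nodes, the literal "less than or equal" branch would yield 0, whereas this sketch yields 1, which is what the Node0 failover example suggests (the second copy of data fragment 306 is on Node17).

```python
def stagger_offset(ordinal: int, num_nodes: int) -> int:
    """Offset of the second copies for a first copy row.

    `ordinal` is the 1-based ordinal of the first copy storage node row.
    The result cycles through 1 .. num_nodes-1 and is never 0, so a second
    copy never lands on the node that holds its first copy.
    """
    if ordinal < 1 or num_nodes < 2:
        raise ValueError("ordinal must be >= 1 and num_nodes >= 2")
    return (ordinal - 1) % (num_nodes - 1) + 1


def second_copy_below(first_copy: int, num_nodes: int) -> int:
    """Number of the second copy placed directly below `first_copy`,
    i.e. on the same storage node in the next row."""
    group, position = divmod(first_copy, num_nodes)
    offset = stagger_offset(group + 1, num_nodes)
    # Wrap within the group so the row keeps the same set of numbers.
    return group * num_nodes + (position + offset) % num_nodes
```

For instance, `second_copy_below(54, 18)` returns 58 and `second_copy_below(325, 18)` returns 327, matching the group3 and group18 examples.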
As an implementation manner, the first embodiment of the present application may further include:
when placing the third copies, determining, according to the number of storage nodes, whether a uniform placement solution exists;
and if not, placing the third copies of the data fragments by using a heuristic algorithm.
In the system shown in fig. 2, when all storage nodes are working normally and the first copies serve as Leader copies, each node hosts 34 working Leader copies, together with 34 second copies and 34 third copies that perform data synchronization. The workload on the storage nodes is therefore well balanced.
As an implementation manner, the first embodiment of the present application may further include:
receiving failure information of a storage node reported by a failed storage node;
and executing failover operation on the first copy on the failed storage node according to the failure information of the storage node.
The storage node failure information includes at least one of the following:
input/output exception information;
and timeout information indicating that the storage node exceeded a preset duration when reading a copy of a data fragment stored on it.
The failure of the storage node may be represented as an input/output exception, or as timeout information exceeding a preset time when the storage node reads a copy of the data fragment thereon.
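As a rough illustration of the two kinds of failure information, the sketch below classifies the outcome of a single read on a storage node. Everything in it is an assumption made for illustration: the application does not define a message format, and the names `FailureKind`, `NodeFailureInfo`, and `classify_read` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureKind(Enum):
    IO_ERROR = "io_error"          # input/output exception
    READ_TIMEOUT = "read_timeout"  # read exceeded the preset duration


@dataclass(frozen=True)
class NodeFailureInfo:
    node: int
    kind: FailureKind
    detail: str = ""


def classify_read(
    node: int,
    elapsed_s: float,
    timeout_s: float,
    io_error: Optional[Exception] = None,
) -> Optional[NodeFailureInfo]:
    """Return failure information for one read of a copy, or None if healthy."""
    if io_error is not None:
        return NodeFailureInfo(node, FailureKind.IO_ERROR, str(io_error))
    if elapsed_s > timeout_s:
        detail = f"read took {elapsed_s:.3f}s, limit {timeout_s:.3f}s"
        return NodeFailureInfo(node, FailureKind.READ_TIMEOUT, detail)
    return None
```

A storage node could report the returned object to the control node server, which would then trigger the failover steps described next.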
Performing, according to the storage node failure information, a failover operation on the first copies of the data fragments on the failed storage node includes:
obtaining the numbers of the first copies on the failed storage node;
obtaining, according to the numbers of the first copies on the failed storage node, the target positions on other storage nodes of the second copies corresponding to those first copies;
obtaining new first copies according to the target positions;
and using the new first copies to take over the work of the first copies on the failed storage node.
In the system shown in fig. 2, when storage node 0, i.e., Node0, fails, the Leader copies on Node0 (numbered 0, 306, 18, 324, 36, 342, 54, 360, 72, 378, 90, 396, 108, 414, 126, 432, 144, 450, 162, 468, 180, 486, 198, 504, 216, 522, 234, 540, 252, 558, 270, 576, 288, 594) cannot work properly. At this time, the corresponding second copies on Node17 (second copy 0, second copy 306), Node16 (second copy 18, second copy 324), Node15 (second copy 36, second copy 342), Node14 (second copy 54, second copy 360), Node13 (second copy 72, second copy 378), Node12 (second copy 90, second copy 396), Node11 (second copy 108, second copy 414), Node10 (second copy 126, second copy 432), Node9 (second copy 144, second copy 450), Node8 (second copy 162, second copy 468), Node7 (second copy 180, second copy 486), Node6 (second copy 198, second copy 504), Node5 (second copy 216, second copy 522), Node4 (second copy 234, second copy 540), Node3 (second copy 252, second copy 558), Node2 (second copy 270, second copy 576), and Node1 (second copy 288, second copy 594) take over as the new Leader copies, and work continues. The load on the storage nodes remains balanced.
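The failover mapping for the Fig. 2 system can be sketched as below. This is a minimal illustration assuming Python; the offset formula is an assumption condensed from the worked examples of the offset calculation, and the names `stagger_offset`, `second_copy_node`, and `failover_targets` are hypothetical.

```python
from collections import Counter


def stagger_offset(ordinal: int, num_nodes: int) -> int:
    """Assumed offset rule: cycles through 1 .. num_nodes-1 (never 0)."""
    return (ordinal - 1) % (num_nodes - 1) + 1


def second_copy_node(fragment: int, num_nodes: int) -> int:
    """Storage node that holds the second copy of data fragment `fragment`."""
    group, position = divmod(fragment, num_nodes)
    return (position - stagger_offset(group + 1, num_nodes)) % num_nodes


def failover_targets(failed_node: int, num_fragments: int, num_nodes: int) -> dict[int, int]:
    """Map each first copy on the failed node to the node whose second copy
    becomes the new first copy."""
    # First copy n sits on node n % num_nodes, so the failed node holds
    # fragments failed_node, failed_node + num_nodes, and so on.
    return {
        fragment: second_copy_node(fragment, num_nodes)
        for fragment in range(failed_node, num_fragments, num_nodes)
    }


targets = failover_targets(failed_node=0, num_fragments=612, num_nodes=18)
load = Counter(targets.values())  # new Leader copies gained per surviving node
```

Under these assumptions, the 34 Leader copies of Node0 are spread over Node1 to Node17, two per node, which is the balance property the example describes.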
The data processing method provided by the first embodiment of the present application distributes the system load evenly across all healthy nodes both when there is no fault and when one node has failed. Because the mapping is fixed, a single node failure does not cause severe performance degradation. In addition, the method places no requirement on the number of data management units.
Corresponding to the data processing method provided by the first embodiment of the present application, a second embodiment of the present application also provides a data processing apparatus.
As shown in fig. 3, the data processing apparatus includes:
a target data obtaining unit 301, configured to obtain target data to be stored, where the target data includes data fragments that include a first copy and a second copy;
a storage node acquisition unit 302 configured to acquire a storage node for storing the target data;
a first copy placing unit 303, configured to sequentially place first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes, to obtain a first number of first copy storage node rows, where each first copy storage node row is generated by sequentially placing first copies in the storage spaces of all the storage nodes in order of the serial numbers of the storage nodes;
a second copy placing unit 304, configured to place second copies on the storage nodes in sequence according to a specified rule, where a first number of second copy storage node rows are obtained; the first copy storage node row and the second copy storage node row in the same order contain the same number set of the first copy and the second copy, and the number of the first copy and the number of the second copy are arranged in a staggered mode.
Optionally, the staggering of the numbers of the first copies and the numbers of the second copies includes:
on the same storage node, the number of the first copy in the first copy storage node row differs from the number of the second copy in the second copy storage node row of the same ordinal.
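The staggering condition can be stated as a small check. This is an illustrative Python sketch only; `is_staggered` is a hypothetical helper, and each row is represented as a list indexed by storage node number.

```python
from typing import Sequence


def is_staggered(first_row: Sequence[int], second_row: Sequence[int]) -> bool:
    """Check one pair of rows of the same ordinal.

    first_row[n] / second_row[n] are the numbers of the first / second
    copy stored on node n in that row.
    """
    if len(first_row) != len(second_row):
        return False
    # Both rows must contain the same set of copy numbers ...
    same_set = sorted(first_row) == sorted(second_row)
    # ... but no node may hold both copies of the same data fragment.
    no_overlap = all(a != b for a, b in zip(first_row, second_row))
    return same_set and no_overlap


print(is_staggered([0, 1, 2, 3], [1, 2, 3, 0]))  # True
print(is_staggered([0, 1, 2, 3], [0, 2, 3, 1]))  # False: node 0 holds fragment 0 twice
```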
Optionally, the first copy is a copy providing read-write service, and the second copy performs data synchronization with the first copy.
Optionally, the first copy placing unit is specifically configured to:
numbering the first copy to obtain the number of the first copy;
numbering the storage nodes to obtain the numbers of the storage nodes;
and sequentially placing the first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows.
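The first-copy placement above can be sketched as follows. This is a minimal Python illustration under the assumption that both first copies and storage nodes are numbered consecutively from 0; `place_first_copies` is a hypothetical name.

```python
from typing import List


def place_first_copies(num_shards: int, num_nodes: int) -> List[List[int]]:
    """Return rows where rows[r][n] is the first copy placed on node n in row r.

    First copies 0, 1, 2, ... are laid out across nodes 0..num_nodes-1;
    when every node has received one copy, a new row is started.
    The last row may be shorter if num_shards is not a multiple of num_nodes.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    return [
        list(range(start, min(start + num_nodes, num_shards)))
        for start in range(0, num_shards, num_nodes)
    ]


rows = place_first_copies(num_shards=12, num_nodes=4)
print(rows)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
```

With this layout, the node holding first copy `k` is simply `k % num_nodes` and its row ordinal is `k // num_nodes`.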
Optionally, the second copy placing unit is specifically configured to:
obtaining the ordinal number of a designated first copy storage node row;
obtaining the number of a first copy in the designated first copy storage node row;
obtaining the number of the second copy according to the ordinal number of the designated first copy storage node row and the number of the first copy in that row;
and according to the serial number of the second copy, placing the corresponding second copy on the storage node to obtain a first number of second copy storage node rows.
Optionally, the second copy placing unit is specifically configured to:
performing a remainder calculation on the ordinal number of the designated first copy storage node row and the number of storage nodes to obtain a first calculation result;
obtaining the offset of the staggered placement of the second copy and the first copy according to the first calculation result;
and acquiring the number of the data fragment corresponding to the second copy according to the offset and the number of the first copy.
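The three steps above can be sketched in Python as follows. The exact remainder formula is not given in this passage, so the modulus `num_nodes - 1` and the `+ 1` are assumptions chosen so that the offset is never zero; `second_copy_number` is a hypothetical name.

```python
def second_copy_number(first_copy: int, row: int, num_nodes: int) -> int:
    """Number of the fragment whose second copy shares a node with first_copy.

    first_copy is the number of the first copy stored on some node in the
    given row; the result is the second copy placed on that same node.
    """
    if num_nodes < 2:
        raise ValueError("staggering needs at least two storage nodes")
    remainder = row % (num_nodes - 1)      # first calculation result (assumed form)
    offset = remainder + 1                 # stagger, always in 1..num_nodes-1
    column = first_copy - row * num_nodes  # node index of the first copy
    return row * num_nodes + (column + offset) % num_nodes


# Row 1 of a 4-node layout holds first copies 4, 5, 6, 7.
print([second_copy_number(f, row=1, num_nodes=4) for f in (4, 5, 6, 7)])
# [6, 7, 4, 5]
```

Because the offset changes from row to row, the second copies of the fragments led by one node are spread over different nodes in different rows.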
Optionally, the data slice includes a third copy;
the device further comprises a third copy placing unit, configured to:
determine, when placing the third copy, whether a uniform placement solution exists according to the number of storage nodes;
and, if not, place the third copy of the data fragment by using a heuristic algorithm.
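This passage names neither the existence test nor the heuristic, so the following Python sketch shows only one possible fallback, chosen here as an assumption: a greedy rule that puts each third copy on the least-loaded node not already holding a copy of the same fragment. `place_third_copies` and its arguments are hypothetical.

```python
from typing import Dict, Tuple


def place_third_copies(placements: Dict[int, Tuple[int, int]],
                       num_nodes: int) -> Dict[int, int]:
    """Greedy fallback: fragment number -> node chosen for its third copy.

    placements maps each fragment number to the nodes that already hold
    its first and second copies. Only third-copy load is balanced here.
    """
    load = [0] * num_nodes
    third: Dict[int, int] = {}
    for shard in sorted(placements):
        taken = set(placements[shard])
        candidates = [n for n in range(num_nodes) if n not in taken]
        if not candidates:
            raise ValueError("need a node that holds no copy of the fragment")
        # Least third-copy load first; ties broken by lowest node number.
        best = min(candidates, key=lambda n: (load[n], n))
        third[shard] = best
        load[best] += 1
    return third


existing = {0: (0, 3), 1: (1, 0), 2: (2, 1), 3: (3, 2)}
third_nodes = place_third_copies(existing, num_nodes=4)
```

A greedy choice like this does not guarantee a perfectly even result, which is why it is only a fallback for cases where no uniform solution exists.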
Optionally, the apparatus further comprises:
the storage node failure information receiving unit is used for receiving the storage node failure information reported by the failed storage node;
and the failover operation unit is used for executing the failover operation on the first copy of the data fragment on the failed storage node according to the storage node failure information.
Optionally, the storage node failure information includes at least one of:
input/output exception information;
and timeout information indicating that the storage node took longer than a preset duration to read a copy of a data fragment stored on it.
Optionally, the failover operating unit is specifically configured to:
obtaining the number of the first copy on the failed storage node;
obtaining, according to the numbers of the first copies on the failed storage node, the target positions on other storage nodes of the second copies corresponding to those first copies;
obtaining a new first copy according to the target position;
and executing the work of the first copy on the failed storage node by using the new first copy.
It should be noted that, for the detailed description of the apparatus provided in the second embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not described here again.
Corresponding to the data processing method provided in the first embodiment of the present application, a third embodiment of the present application also provides an electronic device.
The electronic device includes:
a processor; and
a memory for storing a program of the data processing method, wherein after the device is powered on and the processor runs the program, the device performs the following steps:
acquiring target data to be stored, wherein the target data comprises data fragments which comprise a first copy and a second copy;
acquiring a storage node for storing the target data;
sequentially placing first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows, wherein each first copy storage node row is generated by sequentially placing first copies in the storage spaces of all the storage nodes in order of the serial numbers of the storage nodes;
sequentially placing the second copies on the storage nodes according to a specified rule to obtain a first number of second copy storage node rows; the first copy storage node row and the second copy storage node row in the same order contain the same number set of the first copy and the second copy, and the number of the first copy and the number of the second copy are arranged in a staggered mode.
Optionally, the staggering of the numbers of the first copies and the numbers of the second copies includes:
on the same storage node, the number of the first copy in the first copy storage node row differs from the number of the second copy in the second copy storage node row of the same ordinal.
Optionally, the first copy is a copy providing read-write service, and the second copy performs data synchronization with the first copy.
Optionally, sequentially placing the first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows includes:
numbering the first copy to obtain the number of the first copy;
numbering the storage nodes to obtain the numbers of the storage nodes;
and sequentially placing the first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows.
Optionally, the sequentially placing the second copies on the storage nodes according to a specified rule to obtain a first number of second copy storage node rows includes:
obtaining the ordinal number of a designated first copy storage node row;
obtaining the number of a first copy in the designated first copy storage node row;
obtaining the number of the second copy according to the ordinal number of the designated first copy storage node row and the number of the first copy in that row;
and according to the serial number of the second copy, placing the corresponding second copy on the storage node to obtain a first number of second copy storage node rows.
Optionally, obtaining the number of the second copy according to the ordinal number of the designated first copy storage node row and the number of the first copy in that row includes:
performing a remainder calculation on the ordinal number of the designated first copy storage node row and the number of storage nodes to obtain a first calculation result;
obtaining the offset of the staggered placement of the second copy and the first copy according to the first calculation result;
and acquiring the number of the data fragment corresponding to the second copy according to the offset and the number of the first copy.
Optionally, the data slice includes a third copy;
the electronic device further performs the steps of:
when placing the third copy, determining whether a uniform placement solution exists according to the number of storage nodes;
and, if not, placing the third copy of the data fragment by using a heuristic algorithm.
Optionally, the electronic device further performs the following steps:
receiving failure information of a storage node reported by a failed storage node;
and executing failover operation on the first copy of the data fragment on the failed storage node according to the failure information of the storage node.
Optionally, the storage node failure information includes at least one of:
input/output exception information;
and timeout information indicating that the storage node took longer than a preset duration to read a copy of a data fragment stored on it.
Optionally, performing the failover operation on the first copy of the data fragment on the failed storage node according to the storage node failure information includes:
obtaining the number of the first copy on the failed storage node;
obtaining, according to the numbers of the first copies on the failed storage node, the target positions on other storage nodes of the second copies corresponding to those first copies;
obtaining a new first copy according to the target position;
and executing the work of the first copy on the failed storage node by using the new first copy.
It should be noted that, for the detailed description of the electronic device provided in the third embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not repeated here.
In correspondence with the data processing method provided in the first embodiment of the present application, a fourth embodiment of the present application also provides a storage device storing a program of the data processing method, the program being executed by a processor to perform the steps of:
acquiring target data to be stored, wherein the target data comprises data fragments which comprise a first copy and a second copy;
acquiring a storage node for storing the target data;
sequentially placing first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows, wherein each first copy storage node row is generated by sequentially placing first copies in the storage spaces of all the storage nodes in order of the serial numbers of the storage nodes;
sequentially placing the second copies on the storage nodes according to a specified rule to obtain a first number of second copy storage node rows; wherein the first copy storage node row and the second copy storage node row of the same ordinal contain the same set of copy numbers, and the numbers of the first copies and the numbers of the second copies are staggered.
It should be noted that, for the detailed description of the storage device provided in the fourth embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not described here again.
Although the present application has been described with reference to the preferred embodiments, these embodiments are not intended to limit it. Those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application; the scope of protection should therefore be determined by the claims that follow.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (13)

1. A method of data processing, comprising:
acquiring target data to be stored, wherein the target data comprises data fragments which comprise a first copy and a second copy;
acquiring a storage node for storing the target data;
sequentially placing first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows, wherein each first copy storage node row is generated by sequentially placing first copies in the storage spaces of all the storage nodes in order of the serial numbers of the storage nodes;
sequentially placing the second copies on the storage nodes according to a specified rule to obtain a first number of second copy storage node rows; the first copy storage node row and the second copy storage node row in the same order contain the same number set of the first copy and the second copy, and the number of the first copy and the number of the second copy are arranged in a staggered mode.
2. The method of claim 1, wherein the numbers of the first copies being staggered from the numbers of the second copies comprises:
on the same storage node, the number of the first copy in the first copy storage node row differs from the number of the second copy in the second copy storage node row of the same ordinal.
3. The method of claim 1, wherein the first copy is a copy that provides read-write services, and the second copy performs data synchronization with the first copy.
4. The method according to claim 1, wherein sequentially placing the first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows comprises:
numbering the first copy to obtain the number of the first copy;
numbering the storage nodes to obtain the numbers of the storage nodes;
and sequentially placing the first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows.
5. The method of claim 1, the placing the second copies on the storage nodes in order according to a specified rule, obtaining a first number of rows of second copy storage nodes, comprising:
obtaining the ordinal number of a designated first copy storage node row;
obtaining the number of a first copy in the designated first copy storage node row;
obtaining the number of the second copy according to the ordinal number of the designated first copy storage node row and the number of the first copy in that row;
and according to the serial number of the second copy, placing the corresponding second copy on the storage node to obtain a first number of second copy storage node rows.
6. The method of claim 5, wherein obtaining the number of the second copy according to the ordinal number of the designated first copy storage node row and the number of the first copy in that row comprises:
performing a remainder calculation on the ordinal number of the designated first copy storage node row and the number of storage nodes to obtain a first calculation result;
obtaining the offset of the staggered placement of the second copy and the first copy according to the first calculation result;
and acquiring the number of the data fragment corresponding to the second copy according to the offset and the number of the first copy.
7. The method of claim 1, the data slice comprising a third copy;
the method further comprises the following steps:
when the third copy is placed, whether a uniform placement solution exists is determined according to the number of the storage node;
and if not, placing the third copy of the data fragment by adopting a heuristic algorithm.
8. The method of any of claims 2-7, further comprising:
receiving failure information of a storage node reported by a failed storage node;
and executing failover operation on the first copy of the data fragment on the failed storage node according to the failure information of the storage node.
9. The method of claim 8, the storage node failure information comprising at least one of:
input/output exception information;
and timeout information indicating that the storage node took longer than a preset duration to read a copy of a data fragment stored on it.
10. The method of claim 8, wherein performing the failover operation on the first copy of the data fragment on the failed storage node according to the storage node failure information comprises:
obtaining the number of the first copy on the failed storage node;
obtaining, according to the numbers of the first copies on the failed storage node, the target positions on other storage nodes of the second copies corresponding to those first copies;
obtaining a new first copy according to the target position;
and executing the work of the first copy on the failed storage node by using the new first copy.
11. A data processing apparatus comprising:
a target data acquisition unit, configured to acquire target data to be stored, wherein the target data comprises data fragments, and the data fragments comprise a first copy and a second copy;
a storage node acquisition unit configured to acquire a storage node for storing the target data;
a first copy placing unit, configured to sequentially place first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes, to obtain a first number of first copy storage node rows, where each first copy storage node row is generated by sequentially placing first copies in the storage spaces of all the storage nodes in order of the serial numbers of the storage nodes;
a second copy placing unit, configured to place second copies on the storage nodes in sequence according to a specified rule, so as to obtain a first number of second copy storage node rows; the first copy storage node row and the second copy storage node row in the same order contain the same number set of the first copy and the second copy, and the number of the first copy and the number of the second copy are arranged in a staggered mode.
12. An electronic device, comprising:
a processor; and
a memory for storing a program of the data processing method, wherein after the device is powered on and the processor runs the program, the device performs the following steps:
acquiring target data to be stored, wherein the target data comprises data fragments which comprise a first copy and a second copy;
acquiring a storage node for storing the target data;
sequentially placing first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows, wherein each first copy storage node row is generated by sequentially placing first copies in the storage spaces of all the storage nodes in order of the serial numbers of the storage nodes;
sequentially placing the second copies on the storage nodes according to a specified rule to obtain a first number of second copy storage node rows; the first copy storage node row and the second copy storage node row in the same order contain the same number set of the first copy and the second copy, and the number of the first copy and the number of the second copy are arranged in a staggered mode.
13. A storage device storing a program of a data processing method, the program being executed by a processor, and performing the steps of:
acquiring target data to be stored, wherein the target data comprises data fragments which comprise a first copy and a second copy;
acquiring a storage node for storing the target data;
sequentially placing first copies on the storage nodes according to the serial numbers of the first copies and the serial numbers of the storage nodes to obtain a first number of first copy storage node rows, wherein each first copy storage node row is generated by sequentially placing first copies in the storage spaces of all the storage nodes in order of the serial numbers of the storage nodes;
sequentially placing the second copies on the storage nodes according to a specified rule to obtain a first number of second copy storage node rows; the first copy storage node row and the second copy storage node row in the same order contain the same number set of the first copy and the second copy, and the number of the first copy and the number of the second copy are arranged in a staggered mode.
CN202010733462.XA 2020-07-27 2020-07-27 Data processing method, device and equipment Active CN113297005B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010733462.XA CN113297005B (en) 2020-07-27 2020-07-27 Data processing method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010733462.XA CN113297005B (en) 2020-07-27 2020-07-27 Data processing method, device and equipment

Publications (2)

Publication Number Publication Date
CN113297005A true CN113297005A (en) 2021-08-24
CN113297005B CN113297005B (en) 2024-01-05

Family

ID=77318207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010733462.XA Active CN113297005B (en) 2020-07-27 2020-07-27 Data processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN113297005B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170115884A1 (en) * 2015-10-26 2017-04-27 SanDisk Technologies, Inc. Data Folding in 3D Nonvolatile Memory
CN107219997A (en) * 2016-03-21 2017-09-29 阿里巴巴集团控股有限公司 A kind of method and device for being used to verify data consistency
WO2018028229A1 (en) * 2016-08-10 2018-02-15 华为技术有限公司 Data shard storage method, device and system
CN110209345A (en) * 2018-12-27 2019-09-06 中兴通讯股份有限公司 The method and device of data storage

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
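王岩 (Wang Yan); 汪晋宽 (Wang Jinkuan): "Research on a dynamic replica placement mechanism in cloud storage" (云存储中动态副本放置机制研究), 计算机工程与科学 (Computer Engineering & Science), no. 09 *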

Also Published As

Publication number Publication date
CN113297005B (en) 2024-01-05

Similar Documents

Publication Publication Date Title
US10324812B2 (en) Error recovery in a storage cluster
AU2018202944B2 (en) Efficient data reads from distributed storage systems
US8677063B2 (en) Parity declustered storage device array with partition groups
US9311179B2 (en) Threshold decoding of data based on trust levels
US9794337B2 (en) Balancing storage node utilization of a dispersed storage network
US9696914B2 (en) System and method for transposed storage in RAID arrays
US9582363B2 (en) Failure domain based storage system data stripe layout
US7590672B2 (en) Identification of fixed content objects in a distributed fixed content storage system
US9606867B2 (en) Maintaining data storage in accordance with an access metric
US20170063402A1 (en) Configuring storage resources of a dispersed storage network
US8266475B2 (en) Storage management device, storage management method, and storage system
US10761934B2 (en) Reconstruction of data of virtual machines
CN106293492B (en) Storage management method and distributed file system
CN108540315A (en) Distributed memory system, method and apparatus
WO2013080299A1 (en) Data management device, data copy method, and program
CN112543920A (en) Data reconstruction method, device, computer equipment, storage medium and system
US20190227872A1 (en) Method, apparatus and computer program product for managing data storage in data storage systems
CN113297005B (en) Data processing method, device and equipment
CN106991029A (en) A kind of acquisition methods and device of sequence data
US11860798B2 (en) Data access path optimization
Lohmann XRootD Erasure Coding Repair Tool
WO2022196104A1 (en) Information processing device, storage system, information processing method, and information processing program
US20150215404A1 (en) Replication device, replication method, and replication system
CN113326232A (en) Data updating method and device
Frahm et al. A Block Device Driver for Parallel and Fault-tolerant Storage

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40058603; Country of ref document: HK)
GR01 Patent grant