CN113157660A - Data unit copy placement method and device, electronic equipment and system - Google Patents


Info

Publication number
CN113157660A
CN113157660A
Authority
CN
China
Prior art keywords
data unit
copy
parameter
distributed system
available
Prior art date
Legal status
Granted
Application number
CN202110089492.6A
Other languages
Chinese (zh)
Other versions
CN113157660B (en)
Inventor
汪翔
沈春辉
Current Assignee
Taobao China Software Co Ltd
Original Assignee
Taobao China Software Co Ltd
Priority date
Filing date
Publication date
Application filed by Taobao China Software Co Ltd
Priority to CN202110089492.6A
Publication of CN113157660A
Application granted
Publication of CN113157660B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10 File systems; File servers
    • G06F 16/18 File system types
    • G06F 16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10 File systems; File servers
    • G06F 16/11 File system administration, e.g. details of archiving or snapshots

Abstract

A data unit copy placement method and apparatus in a distributed system, and an electronic device, are provided. The method comprises the following steps: receiving a write request for a data unit, wherein the write request at least comprises the number of copies of the data unit to be written and a placement parameter; the placement parameter comprises a height parameter of an available group and a minimum available group parameter, where an available group is the range over which a single node failure affects the availability of the data unit, the height parameter describes the availability level of the data unit, and the minimum available group parameter describes the number of available groups over which copies are placed; determining nodes in the distributed system for writing the data unit copies according to the placement parameter; and writing the data unit copies to the determined nodes and recording the distribution information of the data unit copies in the distributed system. In the embodiments of the present specification, the data unit copy placement policy is expressed in a parameterized manner and may be applied to a distributed system with any network topology.

Description

Data unit copy placement method and device, electronic equipment and system
Technical Field
The embodiments of the present specification relate to the field of computer technologies, and in particular to a data unit copy placement method in a distributed system, a data unit copy placement apparatus in a distributed system, an electronic device, a computer-readable storage medium, and a system.
Background
A distributed system integrates the computing and storage resources of multiple nodes through a network by means of distributed system software, and provides services such as transparent underlying data storage, databases, and big data computing. Such distributed systems are now commonly used in fields such as the internet, the internet of things, smart industry, business intelligence, and information management.
A core issue in the field of distributed systems is availability. For a single node or device, failure is a low-probability event. In a large distributed system, however, a large number of nodes are connected through network devices to cooperatively provide services, and with so many nodes and devices, the failure of some node or network device somewhere is a frequent occurrence. How to recover from failures and reduce their impact is an essential consideration when designing a distributed system. The following are several common faults:
1. Single-machine failure: a single node goes down because of hardware or power problems, making that node unavailable;
2. Rack failure: multiple nodes connect to the network through a rack; if the rack's power supply or network becomes unavailable, all nodes in the rack become unavailable;
3. Core switch failure: multiple racks access the main network through core switches; when a core switch fails, all nodes that join the main network through that switch may become unavailable;
4. Data center failure: when the power supply or network connection equipment of a data center has a problem, all nodes in that data center may become unavailable.
Multi-copy replication of data units is one of the most common designs in distributed systems. The core idea is to place the same data unit on multiple nodes. When a distributed system fails, copies of a data unit on several nodes may become unavailable, but as long as the number of available copies of the data unit is greater than 0, the data unit itself is still available.
A distributed system determines how to place copies of data units on its nodes by means of a data unit copy placement method. To withstand different levels of failure, the data unit copy placement strategy also varies.
In general, a data unit has a higher availability level when the nodes its copies are placed on span a higher level of the network topology. For example, when data unit copies are placed across multiple racks, the data unit can tolerate rack-level failures; when placed across multiple core switches, it can tolerate core-switch-level failures; when placed across multiple data centers, it can tolerate data-center-level failures. However, placing data unit copies across a higher-level network topology usually implies higher network overhead and read/write latency: traffic across data centers is more expensive and slower than traffic across racks.
In the Hadoop Distributed File System (HDFS), there are generally two data unit copy placement methods. The first is not rack-aware: multiple copies of a data unit are placed on randomly chosen nodes, so when all copies of a data unit land in the same rack and that rack fails, the data unit becomes unavailable. The second is rack-aware: one layer of racks can be perceived, and copies of a data unit are preferentially placed locally and then distributed as far as possible to other racks, so the data unit is still available when any single rack fails. However, when a device at a higher layer fails, such as a core switch, the copies may all be distributed among racks under that same core switch, and the data unit is still unavailable.
Both of these data unit copy placement methods share the same drawbacks: they apply one placement method to all data units in a distributed system, so they cannot meet the individual availability-guarantee requirements of different data units in the same system; they cannot be applied to a variable multilayer network topology; and they only protect against rack-level faults.
Disclosure of Invention
The embodiment of the specification provides a new technical scheme for placing data unit copies in a distributed system.
According to a first aspect of the present specification, there is provided a data unit copy placement method in a distributed system, including:
receiving a write request of a data unit, wherein the write request at least comprises the number of data unit copies to be written and a data unit copy placement parameter, the data unit copy placement parameter comprises a height parameter of an available group and a minimum available group parameter, the available group refers to an influence range of a single node failure on the availability of the data unit when a failure occurs, the height parameter of the available group is used for describing an availability level of the data unit, and the minimum available group parameter is used for describing the number of available groups for placing the data unit copies;
determining a node used for writing the data unit copy in the distributed system according to the data unit copy placement parameter;
and writing the data unit copy into the determined node, and recording the distribution information of the data unit copy in the distributed system.
Optionally, the availability levels include at least one of a single node level, a rack level, a core switch level, and a data center level.
Optionally, the data unit copy placement parameter further includes a remainder data unit copy affinity parameter, where the remainder data unit copy affinity parameter is used to describe that a remainder data unit copy of the data unit copy is placed in a remotely available group or a locally available group; the local available group is an available group which is closer to the writer network; the remote available group refers to an available group that is a greater distance from the writer network.
Optionally, wherein the minimum available group parameter is used to describe the number of available groups to place the copy of the data unit.
Optionally, the determining, according to the data unit copy placement parameter, a node in the distributed system for writing the data unit copy includes:
determining, from the available set height parameter, a network topology level to be spanned by nodes for writing the copies of the data units;
determining whether a node for writing the data unit copy is located in the locally available group or a remotely available group according to the remainder data unit copy affinity parameter;
determining the number of available groups to be spanned by the nodes for writing the copies of the data units according to the minimum available group parameter;
determining a node in the distributed system for writing the copy of the unit of data based on the determined network topology hierarchy to be spanned by the node for writing the copy of the unit of data, whether it is in the locally available group or a remotely available group, and the number of available groups to be spanned.
Optionally, if the determined network topology level to be crossed by the node for writing the data unit copy exceeds the number of layers of the network topology of the distributed system, the number of layers of the network topology of the distributed system is reduced by one to serve as the network topology level to be crossed by the node for writing the data unit copy.
Optionally, when the minimum available group parameter is greater than the number of copies of the data unit, determining that the number of available groups to be spanned by the node for writing the copy of the data unit is the same as the number of copies of the data unit.
According to a second aspect of the present specification, there is provided a data unit copy placement apparatus in a distributed system, comprising:
a receiving module, configured to receive a write request for a data unit, where the write request at least includes a number of data unit copies to be written and a data unit copy placement parameter, where the data unit copy placement parameter includes a height parameter and a minimum available group parameter of an available group, where the available group refers to a range of influence of a single node failure on availability of the data unit when a failure occurs, the height parameter of the available group is used to describe a level of availability of the data unit, and the minimum available group parameter is used to describe a number of available groups for placing the data unit copies;
a determining module, configured to determine, according to the data unit copy placement parameter, a node in the distributed system, where the data unit copy is written in;
and the writing module is used for writing the data unit copy into the determined node and recording the distribution information of the data unit copy in the distributed system.
According to a third aspect of the present specification, there is also provided an electronic apparatus, including:
a data unit copy placement apparatus in a distributed system as described in the second aspect of the present specification; or,
a processor and a memory for storing instructions for controlling the processor to perform a method according to any one of the first aspects of the present description.
According to a fourth aspect of the present description, there is also provided a computer-readable storage medium storing executable instructions that, when executed by a processor, perform the method of any one of the first aspects of the present description.
According to a fifth aspect of the present specification, there is also provided a distributed system including the data unit copy placement apparatus in the distributed system according to the second aspect.
In one embodiment, a write request for a data unit is received, wherein the write request at least comprises the number of data unit copies to be written and a data unit copy placement parameter; the data unit copy placement parameter includes the available group height parameter, a remainder data unit copy affinity parameter, and a minimum available group parameter; an available group is the range over which a single node failure affects the availability of the data unit; nodes in the distributed system for writing the data unit copies are determined according to the data unit copy placement parameter; and the data unit copies are written to the determined nodes, with the distribution information of the data unit copies recorded in the distributed system. In the embodiments of this specification, the data unit copy placement strategy is expressed in a parameterized way. The parameterized strategy can describe different availability-guarantee capabilities and thus meet individualized availability requirements, and because it is independent of any specific network topology, it can be applied to a distributed system with any network topology, which makes it highly practical.
Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a block diagram of an electronic device that may be used to implement the data unit copy placement method in a distributed system of one embodiment;
FIG. 2 is a flow chart illustrating a method for placing copies of a data unit in a distributed system according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a distributed system network topology;
FIGS. 4 and 5 are schematic diagrams of different available group partitions in the network topology for the distributed system shown in FIG. 3;
FIG. 6 is a schematic diagram of the placement of remainder data unit copies in accordance with an embodiment of the present description;
FIG. 7 is a schematic diagram of copy placement under different MGRs, in accordance with an embodiment of the present description;
FIG. 8 is a functional block diagram of a data unit replica placement device in a distributed system that can be used with embodiments of the present description;
FIG. 9 is a functional block diagram of an electronic device that may be used to implement embodiments of the present description.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
< hardware configuration >
Multi-copy replication of data units is one of the common designs in distributed systems. The core idea is to place the same data unit on multiple nodes. When a distributed system fails, copies of a data unit on several nodes may become unavailable, but as long as the number of available copies of the data unit is greater than 0, the data unit itself is still available.
A distributed system determines how to place copies of data units on its nodes by means of a data unit copy placement method. To withstand different levels of failure, the data unit copy placement strategy also varies.
In general, a data unit has a higher availability level when the nodes its copies are placed on span a higher level of the network topology. For example, when data unit copies are placed across multiple racks, the data unit can tolerate rack-level failures; across multiple core switches, it can tolerate core-switch-level failures; across multiple data centers, it can tolerate data-center-level failures. However, placing data unit copies across a higher-level network topology usually implies higher network overhead and read/write latency: placement across racks incurs cross-rack write and replication traffic, and placement across data centers incurs cross-data-center write and replication traffic, which is more expensive and slower than cross-rack traffic.
The availability level of a data unit is proportional to its network overhead and read/write latency, so in practice the data unit copy placement strategy of a distributed system is designed according to actual requirements. The present specification proposes a parameterized data unit copy placement strategy with the following characteristics: it is applicable to any number of data unit copies and to network topologies with any number of layers, and it allows a personalized placement strategy, described in a parameterized language, for each data unit.
Fig. 1 is a schematic block diagram of an electronic device that can be used to implement the data unit copy placement method in the distributed system according to an embodiment.
As shown in fig. 1, the electronic device 1000 of the present embodiment may include a processor 1100, a memory 1200, an interface device 1300, a communication device 1400, a display device 1500, an input device 1600, a speaker 1700, a microphone 1800, and the like.
Processor 1100 is configured to execute program instructions, which may be in the instruction set of architectures such as x86, Arm, RISC, MIPS, SSE, and the like. The memory 1200 includes, for example, a ROM (read only memory), a RAM (random access memory), a nonvolatile memory such as a hard disk, and the like. The interface device 1300 includes, for example, a USB interface, a headphone interface, and the like. Communication device 1400 is capable of wired or wireless communication, for example. The display device 1500 is, for example, a liquid crystal display panel, a touch panel, or the like. The input device 1600 may include, for example, a touch screen, a keyboard, and the like. The speaker 1700 is used to output voice information. The microphone 1800 is used to collect voice information.
The electronic device 1000 may be any device such as a smart phone, a laptop, a desktop computer, and a tablet computer.
In this embodiment, the memory 1200 of the electronic device 1000 is configured to store instructions for controlling the processor 1100 to operate so as to support implementation of the data unit copy placement method in the distributed system according to any embodiment of the present disclosure. The skilled person can design the instructions according to the solution disclosed in the present specification. How the instructions control the operation of the processor is well known in the art and will not be described in detail herein.
It should be understood by those skilled in the art that although a plurality of devices of the electronic apparatus 1000 are illustrated in fig. 1, the electronic apparatus 1000 of the embodiments of the present specification may refer to only some of the devices, for example, the processor 1100, the memory 1200, the display device 1500, the input device 1600, and the like.
The electronic device 1000 shown in fig. 1 is merely illustrative and is in no way intended to limit the description, its applications, or uses.
< method examples >
Fig. 2 is a flowchart illustrating a method for placing data unit copies in a distributed system according to an embodiment of the present disclosure, where the method may be implemented by an electronic device, such as the electronic device 1000 shown in fig. 1.
As shown in FIG. 2, the method for placing copies of data units in a distributed system according to this embodiment may include the following steps 2100 to 2300:
step 2100, receiving a write request of a data unit, where the write request at least includes a number of data unit copies to be written and a data unit copy placement parameter.
The data unit copy placement parameters include a height parameter and a minimum available set parameter for the available set. An available group refers to the extent to which a single node failure affects the availability of the data unit in question when a failure occurs. The height parameter of the available set is used to describe the availability level of the data unit. The minimum available set parameter is used to describe the number of available sets to place the copy of the data unit.
Step 2200 determines, according to the data unit copy placement parameter, a node in the distributed system for writing the data unit copy.
Specifically, the electronic device 1000 may determine, according to the available group height parameter, a network topology level to be spanned by the nodes for writing the copies of the data unit; determining whether a node for writing the data unit copy is located in the locally available group or a remotely available group according to the remainder data unit copy affinity parameter; determining the number of available groups to be spanned by the nodes for writing the copies of the data units according to the minimum available group parameter; determining a node in the distributed system for writing the copy of the unit of data based on the determined network topology hierarchy to be spanned by the node for writing the copy of the unit of data, whether it is in the locally available group or a remotely available group, and the number of available groups to be spanned.
Step 2300, writing the data unit copy into the determined node, and recording the distribution information of the data unit copy in the distributed system.
As shown in fig. 3, a plurality of nodes under different data centers and different switches are connected through a network. Each node is assigned a network path attribute that identifies the node's location in the system-wide network. The network path is generated top-down from the network devices the node is connected through. For example, if Node2 accesses the system through Switch1 in data center IDC0, its network path is /idc0/switch1/node2. The network paths of the other nodes follow by analogy.
The distributed system shown in fig. 3 has a three-layer network topology. In it, when Switch0 fails, both Node0 and Node1 become unavailable; when IDC0 fails, Node0, Node1, and Node2 all become unavailable. The method of this embodiment can be applied to a distributed system whose network topology has any number of layers; for simplicity, the following description takes the three-layer network topology of fig. 3 as an example.
In this embodiment, the identical prefix shared by the network paths of a set of nodes is called their common network path. For example, the common network path of Node0 and Node1 is /idc0/switch0; the common network path of Node0 and Node2 is /idc0; and the common network path of Node0 and Node3 is /.
The number of elements contained in a network path is called its depth, and the number of layers in the distributed system's network topology minus the depth is called its height (level). For example, /idc0/switch1/node2 has depth 3 and level 0; /idc0/switch1 has depth 2 and level 1; and /idc0 has depth 1 and level 2.
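The depth, level, and common network path notions above can be sketched in a few lines of Python (a hypothetical illustration only; the function names and the hard-coded three-layer topology are assumptions, not part of the patent):

```python
# Illustrative sketch for a 3-layer topology like fig. 3.
TOPOLOGY_LAYERS = 3  # data center / switch / node (an assumption)

def depth(path: str) -> int:
    """Number of elements in a network path such as '/idc0/switch1/node2'."""
    return len([p for p in path.split("/") if p])

def level(path: str) -> int:
    """Height (level) = number of topology layers minus depth."""
    return TOPOLOGY_LAYERS - depth(path)

def common_network_path(a: str, b: str) -> str:
    """Longest shared element-wise prefix of two network paths."""
    pa = [p for p in a.split("/") if p]
    pb = [p for p in b.split("/") if p]
    common = []
    for x, y in zip(pa, pb):
        if x != y:
            break
        common.append(x)
    return "/" + "/".join(common) if common else "/"
```

With these helpers, /idc0/switch1/node2 has depth 3 and level 0, and the common network path of nodes in different data centers is /, matching the examples above.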
Further, this embodiment proposes the concept of an available group (Availability Group). An available group describes the availability impact range of a single device failure under a given failure scenario. Taking the network topology shown in fig. 3 as an example: for single-node failures, each node forms an available group, and the failure of any single node renders only that node unavailable. For switch failures, the nodes contained in each box in fig. 4 form an available group; the failure of any one switch makes all nodes in one of the available groups of fig. 4 unavailable. For data center failures, the nodes contained in each box in fig. 5 form an available group; the failure of any one data center makes all nodes in one of the available groups of fig. 5 unavailable.
Based on the available group concept, this embodiment proposes an available group height (AGL) parameter. The AGL parameter is defined as the height of the common network path of all nodes in an available group, and it describes the availability level of a data unit. The availability levels include at least one of a single node level, a rack level, a core switch level, and a data center level. For a single-node fault (a), the AGL parameter is 0; for a switch failure (b), it is 1; for a data center failure (c), it is 2.
Obviously, the larger the AGL parameter, the higher the availability level, but also the larger the write latency and the more network bandwidth consumed. Different data units can be given different AGL parameters and thereby different availability-guarantee capabilities.
It should be noted that if the network topology level to be spanned by the nodes for writing the data unit copies exceeds the number of layers of the distributed system's network topology, that is, if the AGL parameter is greater than the number of topology layers, then the availability guarantee level requested for the data exceeds the system's upper limit. The system then uses the highest-level available groups it can, taking the number of network topology layers minus one as the network topology level to be spanned by the nodes for writing the data unit copies.
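The division of nodes into available groups for a given AGL, including the clamping rule just described, might look like the following sketch (hypothetical Python; the function name and the prefix-based grouping convention are assumptions):

```python
from collections import defaultdict

def available_groups(node_paths, agl, topology_layers=3):
    """Group node paths into available groups for a given AGL.

    If AGL is not smaller than the number of topology layers, it is
    clamped to layers - 1, mirroring the rule described above.
    A group is identified by the path prefix whose height equals AGL,
    i.e. the first (topology_layers - agl) path elements.
    """
    agl = min(agl, topology_layers - 1)
    prefix_len = topology_layers - agl
    groups = defaultdict(list)
    for path in node_paths:
        parts = [p for p in path.split("/") if p]
        key = "/" + "/".join(parts[:prefix_len])
        groups[key].append(path)
    return dict(groups)
```

For the topology of fig. 3, AGL = 1 groups nodes per switch and AGL = 2 groups them per data center; an out-of-range AGL such as 5 behaves like AGL = 2.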
Further, this embodiment proposes a Remainder Affinity (RA) parameter, which describes whether the remainder data unit copies of a data unit are placed in a remote available group or a local available group. A local available group is an available group closer to the writer in the network; a remote available group is one farther from the writer in the network.
In practice, the number of available groups selected according to a data unit's AGL may be smaller than the number of copies of the data unit, or may not divide it evenly. As shown in fig. 6, the writer's network path is /idc0/switch1/writer, the data unit to be written has 5 copies, and its AGL is 2. The first 4 copies are placed in turn in the two available groups /idc0 and /idc1, 2 copies per available group. The 5th copy, which cannot be divided evenly, is called the remainder data unit copy. How remainder data unit copies are placed is defined by the RA parameter.
Optionally, the RA parameter takes one of two values, local and remote; in one example, RA = 0 may represent RA = local and RA = 1 may represent RA = remote.
When the RA parameter is local, the remainder data unit copy is preferentially placed in the available group closer to the writer's network, i.e., a node in /idc0 is preferred. When the RA parameter is remote, the remainder data unit copy is preferentially placed in an available group farther from the writer's network, i.e., a node in /idc1 is preferred.
It should be noted that if the number of copies of a data unit is evenly divisible by the number of available groups, the copies are placed equally across the available groups, and RA does not affect the placement.
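The even split plus RA-directed remainder placement described above can be sketched as follows (hypothetical Python; the function name and the convention that groups arrive ordered nearest-to-farthest from the writer are assumptions):

```python
def split_copies(num_copies, groups_by_distance, ra="local"):
    """Distribute copies evenly over the selected available groups.

    `groups_by_distance` is ordered nearest-to-farthest from the writer.
    Remainder copies go to the groups nearest the writer when
    RA = 'local', or to the farthest groups when RA = 'remote'.
    """
    n = len(groups_by_distance)
    base, rem = divmod(num_copies, n)
    counts = {g: base for g in groups_by_distance}
    targets = (groups_by_distance if ra == "local"
               else list(reversed(groups_by_distance)))
    for g in targets[:rem]:
        counts[g] += 1
    return counts
```

With the fig. 6 scenario (5 copies, groups /idc0 and /idc1, writer in /idc0), RA = local yields 3 copies in /idc0 and 2 in /idc1, while RA = remote yields 2 and 3; with 4 copies the split is 2 and 2 regardless of RA.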
Further, this embodiment proposes a Minimum Groups Requirement (MGR) parameter, which describes the number of available groups over which the data unit copies must be placed. The MGR parameter defines how many available groups the copies of a data unit must span before their availability is considered reliable. It will be appreciated that when the current placement of data unit copies does not satisfy the MGR parameter, a data unit copy migration is triggered, i.e., a process of copying a data unit copy to another node and deleting the original copy, to ensure that the placement meets the availability required by the MGR parameter.
As shown in fig. 7, when AGL is 1, the nodes under each switch constitute an available group, represented by a box. A data unit has 3 data unit copies, which are placed on [Node 3], [Node 4], and [Node 5] respectively. The influence of different MGR values on system behavior is explained below.
When MGR is 1, the data unit copies are considered reliable as long as they occupy 1 available group. Obviously, no adjustment of the placement is needed: the copies already occupy one available group and are already in a reliable state.
When MGR is 2, the data is considered reliable only when the data unit copies occupy at least 2 available groups. The current placement occupies only 1 available group, so 1 data unit copy is migrated such that the data occupies at least two available groups.
When MGR is 3, the data is considered reliable only when the data unit copies occupy at least 3 available groups. The current placement occupies only 1 available group, so 2 data unit copies are migrated to two other available groups, such that the data occupies at least 3 available groups.
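The migration count implied by the three cases above can be sketched as follows. This is a minimal illustration under the stated assumptions; the function name is hypothetical:

```python
def migrations_needed(mgr, occupied_groups, num_copies):
    """Number of data unit copies to migrate so that the placement
    spans at least `mgr` available groups.

    The span can never exceed the copy count, so the target is
    capped at `num_copies`; a placement already meeting the target
    needs no migration.
    """
    target = min(mgr, num_copies)
    return max(0, target - occupied_groups)
```

For the fig. 7 example (3 copies, all in one available group): MGR = 1 needs 0 migrations, MGR = 2 needs 1, and MGR = 3 needs 2, as described above.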
It should be noted that when the minimum available group parameter is zero, i.e., MGR is 0, the data unit copies should occupy as many available groups as possible. When the minimum available group parameter is greater than the number of data unit copies, the number of available groups to be spanned by the nodes for writing the data unit copies is determined to be the same as the number of data unit copies. When the MGR parameter is greater than the number of available groups in the distributed system, the number of available groups in the distributed system is used instead.
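The three boundary rules above amount to clamping the configured MGR. A minimal sketch, with a hypothetical function name:

```python
def effective_mgr(mgr, num_copies, num_groups):
    """Clamp the configured MGR to what the system can satisfy.

    MGR == 0 means "spread over as many available groups as possible";
    an MGR above the copy count or above the group count is capped.
    """
    if mgr == 0:
        mgr = num_groups
    return min(mgr, num_copies, num_groups)
```

For example, with 3 copies and 5 available groups, both MGR = 0 and MGR = 5 resolve to an effective span of 3 groups.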
Thus, AGL/RA/MGR together constitute the data unit copy placement parameter described in this embodiment. Each data unit may be configured with different AGL/RA/MGR parameters. The parameterized data unit copy placement policy is persisted in meta-information in the distributed system. Data unit copy placement is performed according to these 3 parameters of the data unit such that the availability requirements of the data unit are met. Moreover, AGL/RA/MGR is independent of the number of layers in the specific network topology: it is a placement policy capable of describing any multilayer network topology, and any AGL/RA/MGR combination can be applied to any network topology.
In this embodiment, a write request for a data unit is received, where the write request includes at least the number of data unit copies to be written and a data unit copy placement parameter; the data unit copy placement parameter includes the available group height parameter, a remainder data unit copy affinity parameter, and a minimum available group parameter; an available group refers to the range over which a single node failure affects the availability of the data unit when a failure occurs. A node in the distributed system for writing the data unit copies is determined according to the data unit copy placement parameter; the data unit copies are written into the determined nodes, and the distribution information of the data unit copies in the distributed system is recorded. In the embodiment of this specification, the data unit copy placement strategy is expressed in parameterized form; the parameterized strategy describes different availability guarantee capabilities and can meet individualized availability requirements. Because the parameterized data unit copy placement strategy is independent of the specific network topology structure, it can be applied to a distributed system with any network topology structure and therefore has high practicability.
< apparatus embodiment >
In this embodiment, a data unit copy placement apparatus in a distributed system is also provided; this apparatus may be disposed, for example, in the electronic device 1000 shown in fig. 1.
As shown in fig. 8, the data unit copy placing apparatus 3000 in the distributed system includes: receiving module 3100, determining module 3200, and writing module 3300.
The receiving module 3100 is configured to receive a write request for a data unit, where the write request includes at least the number of data unit copies to be written and a data unit copy placement parameter. The data unit copy placement parameter includes a height parameter of the available group and a minimum available group parameter. An available group refers to the range over which a single node failure affects the availability of the data unit when a failure occurs. The height parameter of the available group is used to describe the availability level of the data unit. The minimum available group parameter is used to describe the number of available groups in which to place the data unit copies.
A determining module 3200, configured to determine, according to the data unit copy placement parameter, a node in the distributed system to write the data unit copy.
A writing module 3300, configured to write the data unit copy into the determined node, and record distribution information of the data unit copy in the distributed system.
Specifically, the availability levels include at least one of a single node level, a rack level, a core switch level, and a data center level.
In particular, the data unit copy placement parameter may also include a remainder data unit copy affinity parameter, which describes whether remainder data unit copies among the data unit copies are placed in a remotely available group or a locally available group. The locally available group is an available group that is closer to the writer in the network; the remotely available group is an available group that is farther from the writer in the network.
In one example, the determining module 3200 may be specifically configured to: determine, according to the available group height parameter, the network topology level to be spanned by the nodes for writing the data unit copies; determine, according to the remainder data unit copy affinity parameter, whether the nodes for writing the data unit copies are located in the locally available group or a remotely available group; determine, according to the minimum available group parameter, the number of available groups to be spanned by the nodes for writing the data unit copies; and determine the nodes in the distributed system for writing the data unit copies based on the determined network topology level to be spanned, whether the nodes are in the locally available group or a remotely available group, and the number of available groups to be spanned.
If the determined network topology level to be spanned by the nodes for writing the data unit copies exceeds the number of layers of the network topology of the distributed system, the number of layers of the network topology of the distributed system minus one is used as the network topology level to be spanned by the nodes for writing the data unit copies.
When the minimum available group parameter is greater than the number of data unit copies, the number of available groups to be spanned by the nodes for writing the data unit copies is determined to be the same as the number of data unit copies.
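The determining steps above, together with the AGL and MGR clamping rules, can be sketched end to end as follows. All names, the pre-ordering of groups by network distance, the round-robin fill order, and the tuple return are illustrative assumptions rather than the embodiment's normative algorithm:

```python
def plan_placement(num_copies, agl, ra, mgr, groups_by_distance, topo_layers):
    """Return (clamped topology level, one available-group name per copy)."""
    # Clamp AGL when it exceeds the depth of the network topology.
    level = min(agl, topo_layers - 1)
    # MGR == 0 means "as many groups as possible"; the span can never
    # exceed the copy count or the number of available groups.
    span = min(mgr or len(groups_by_distance), num_copies, len(groups_by_distance))
    chosen = groups_by_distance[:span]
    per_group, remainder = divmod(num_copies, span)
    # Even split first, then RA decides where remainder copies go
    # (ra == 0: nearest group to the writer; otherwise: farthest).
    plan = [g for g in chosen for _ in range(per_group)]
    target = chosen[0] if ra == 0 else chosen[-1]
    plan.extend([target] * remainder)
    return level, plan
```

Applied to the FIG. 6 scenario (5 copies, AGL = 2, RA = local, MGR = 2, groups /idc0 and /idc1), this yields two copies per group plus a remainder copy in /idc0.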
The data unit copy placement device in the distributed system of this embodiment may be used to implement the technical solution of the foregoing method embodiment, and the implementation principle and technical effect thereof are similar, and are not described here again.
< device embodiment >
In this embodiment, an electronic device is further provided. The electronic device includes the data unit copy placement apparatus 3000 in a distributed system described in the apparatus embodiment of this specification; alternatively, the electronic device is the electronic device 4000 shown in fig. 9, which includes:
a memory 4100 for storing executable commands.
A processor 4200 configured to perform the method described in any of the method embodiments of this specification under the control of the executable commands stored in the memory 4100.
The implementation subject of the embodiment of the method executed in the electronic device may be a server or an electronic device.
< computer-readable storage Medium embodiment >
The present embodiments provide a computer-readable storage medium having stored therein an executable command that, when executed by a processor, performs a method described in any of the method embodiments of the present specification.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims (11)

1. A data unit copy placement method in a distributed system comprises the following steps:
receiving a write request of a data unit, wherein the write request at least comprises the number of data unit copies to be written and a data unit copy placement parameter, the data unit copy placement parameter comprises a height parameter of an available group and a minimum available group parameter, the available group refers to an influence range of a single node failure on the availability of the data unit when a failure occurs, the height parameter of the available group is used for describing an availability level of the data unit, and the minimum available group parameter is used for describing the number of available groups for placing the data unit copies;
determining a node used for writing the data unit copy in the distributed system according to the data unit copy placement parameter;
and writing the data unit copy into the determined node, and recording the distribution information of the data unit copy in the distributed system.
2. The method of claim 1, wherein the availability level comprises at least one of a single node level, a rack level, a core switch level, and a data center level.
3. The method of claim 1, wherein the data unit copy placement parameters further include a remainder data unit copy affinity parameter describing placement of remainder data unit copies of the data unit copies in a remotely available group or a locally available group, the locally available group being an available group that is closer to a writer network, the remotely available group being an available group that is further from the writer network.
4. The method of claim 1, wherein the minimum available group parameter describes the number of available groups in which to place the data unit copies.
5. The method according to any one of claims 2 to 4, wherein the determining a node in the distributed system for writing the copy of the data unit according to the data unit copy placement parameter comprises:
determining, according to the available group height parameter, a network topology level to be spanned by nodes for writing the copies of the data units;
determining whether a node for writing the data unit copy is located in the locally available group or a remotely available group according to the remainder data unit copy affinity parameter;
determining the number of available groups to be spanned by the nodes for writing the copies of the data units according to the minimum available group parameter;
determining a node in the distributed system for writing the copy of the unit of data based on the determined network topology hierarchy to be spanned by the node for writing the copy of the unit of data, whether it is in the locally available group or a remotely available group, and the number of available groups to be spanned.
6. The method of claim 5, wherein if the determined network topology level to be spanned by the nodes for writing the copy of the data unit exceeds the number of layers of the network topology of the distributed system, the number of layers of the network topology of the distributed system is reduced by one as the network topology level to be spanned by the nodes for writing the copy of the data unit.
7. The method of claim 5, wherein determining that the number of available groups to be spanned by the node for writing the copy of the data unit is the same as the number of copies of the data unit when the minimum available group parameter is greater than the number of copies of the data unit.
8. A data unit copy placement apparatus in a distributed system, comprising:
a receiving module, configured to receive a write request for a data unit, where the write request at least includes a number of data unit copies to be written and a data unit copy placement parameter, where the data unit copy placement parameter includes a height parameter and a minimum available group parameter of an available group, where the available group refers to a range of influence of a single node failure on availability of the data unit when a failure occurs, the height parameter of the available group is used to describe a level of availability of the data unit, and the minimum available group parameter is used to describe a number of available groups for placing the data unit copies;
a determining module, configured to determine, according to the data unit copy placement parameter, a node in the distributed system, where the data unit copy is written in;
and the writing module is used for writing the data unit copy into the determined node and recording the distribution information of the data unit copy in the distributed system.
9. An electronic device, comprising:
the distributed system of claim 8 wherein the data unit replica placement means; alternatively, the first and second electrodes may be,
a processor and a memory for storing instructions for controlling the processor to perform the method of any of claims 1 to 7.
10. A computer readable storage medium storing executable instructions that, when executed by a processor, perform the method of any one of claims 1 to 7.
11. A distributed system comprising the data unit copy placement apparatus in a distributed system according to claim 8.
CN202110089492.6A 2021-01-22 2021-01-22 Data unit copy placement method, device, electronic equipment and system Active CN113157660B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110089492.6A CN113157660B (en) 2021-01-22 2021-01-22 Data unit copy placement method, device, electronic equipment and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110089492.6A CN113157660B (en) 2021-01-22 2021-01-22 Data unit copy placement method, device, electronic equipment and system

Publications (2)

Publication Number Publication Date
CN113157660A true CN113157660A (en) 2021-07-23
CN113157660B CN113157660B (en) 2023-06-16

Family

ID=76879242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110089492.6A Active CN113157660B (en) 2021-01-22 2021-01-22 Data unit copy placement method, device, electronic equipment and system

Country Status (1)

Country Link
CN (1) CN113157660B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150088827A1 (en) * 2013-09-26 2015-03-26 Cygnus Broadband, Inc. File block placement in a distributed file system network
CN107241430A (en) * 2017-07-03 2017-10-10 国家电网公司 A kind of enterprise-level disaster tolerance system and disaster tolerant control method based on distributed storage
CN107968809A (en) * 2016-10-20 2018-04-27 北京金山云网络技术有限公司 A kind of Replica placement method and device
CN110677306A (en) * 2019-10-25 2020-01-10 上海交通大学 Network topology replica server configuration method and device, storage medium and terminal


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Qin Yao, "Research on HDFS Replica Placement Management Strategy and Retrieval Algorithm for Heterogeneous Storage Environments", China Master's Theses Full-text Database, Information Science and Technology *
Chen Wei, "An Improved HDFS Replica Placement Strategy", Journal of Changchun Normal University *

Also Published As

Publication number Publication date
CN113157660B (en) 2023-06-16

Similar Documents

Publication Publication Date Title
WO2019092530A1 (en) Dynamic selection of deployment configurations of software applications
US10997127B2 (en) Preventing inefficient recalls in a hierarchical storage management (HSM) system
US10255140B2 (en) Fully distributed intelligent rebuild
US10983822B2 (en) Volume management by virtual machine affiliation auto-detection
CN102938784A (en) Method and system used for data storage and used in distributed storage system
US20130007091A1 (en) Methods and apparatuses for storing shared data files in distributed file systems
WO2019033949A1 (en) Data migration method, apparatus and device
CN108319618B (en) Data distribution control method, system and device of distributed storage system
US11308043B2 (en) Distributed database replication
US11163464B1 (en) Method, electronic device and computer program product for storage management
US11895087B2 (en) Adjusting firewall parameters based on node characteristics
US10581668B2 (en) Identifying performance-degrading hardware components in computer storage systems
CN110162429A (en) System repair, server and storage medium
US11151093B2 (en) Distributed system control for on-demand data access in complex, heterogenous data storage
US9037762B2 (en) Balancing data distribution in a fault-tolerant storage system based on the movements of the replicated copies of data
US11256584B2 (en) One-step disaster recovery configuration on software-defined storage systems
CN111857557B (en) Method, apparatus and computer program product for RAID type conversion
CN113157660B (en) Data unit copy placement method, device, electronic equipment and system
US11226743B2 (en) Predicting and preventing events in a storage system using copy services
US10712959B2 (en) Method, device and computer program product for storing data
US11336723B1 (en) Replicating data volume updates from clients accessing the data volume across fault tolerance zones
US20150302189A1 (en) Efficient modification and creation of authorization settings for user accounts
US10884874B1 (en) Federated restore of availability group database replicas
US11200256B2 (en) Record replication for multi-column partitioning on distributed database systems
CN115982101B (en) Machine room data migration method and device based on multi-machine room copy placement strategy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40056502

Country of ref document: HK

GR01 Patent grant