CN115955488B - Distributed storage copy cross-machine room placement method and device based on copy redundancy - Google Patents

Info

Publication number
CN115955488B
Authority
CN
China
Prior art keywords
copy
deployed
file
machine room
placement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310225524.XA
Other languages
Chinese (zh)
Other versions
CN115955488A (en
Inventor
胡梦宇
贾承昆
张俊杰
陈曦
赵兵
李大海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhizhe Sihai Beijing Technology Co ltd
Original Assignee
Zhizhe Sihai Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a distributed storage copy cross-machine room placement method and device based on copy redundancy. The method is applied to a distributed storage system and comprises the following steps: deploying a plurality of metadata nodes into a first machine room in the distributed storage system, and determining a main metadata node; setting a copy placement strategy for a file to be deployed; in response to a write request initiated by a client, completing, by the main metadata node, the cross-machine-room placement of the file to be deployed according to its copy placement strategy; and acquiring access data of the file to be deployed, and modifying the copy placement strategy of the file to be deployed based on that access data. By setting the copy placement strategy, the distributed storage system can sense the copy distribution, which addresses the cross-machine-room traffic problem in the scenarios of file reading, file writing, copy recovery, copy deletion and computing node task traffic.

Description

Distributed storage copy cross-machine room placement method and device based on copy redundancy
Technical Field
The application relates to the technical field of computers, in particular to a distributed storage copy cross-machine room placement method and device based on copy redundancy.
Background
With the advent of the cloud computing era, the multi-cloud architecture has become a mainstream architecture for large government bodies and enterprises, technology companies and internet companies. Distributed storage is the base of the whole data architecture and provides stable and reliable data read-write service. How to provide stable and reliable distributed storage read-write service in a multi-cloud environment is therefore crucial: it directly determines whether data analysis can make full use of multi-cloud resources and whether data-driven decisions can be delivered in time.
At present, distributed storage based on copy redundancy handles multi-cloud deployment in one of two ways. In the first case, multi-cloud deployment is not supported. For example, HDFS does not provide a multi-cloud solution; if the distributed storage service is forcibly deployed in a multi-cloud mode, its read-write traffic far exceeds the dedicated-line bandwidth between machine rooms, and the multi-cloud services may even be unable to communicate normally. In the second case, limited support is provided: a user can deploy multiple sets of distributed storage and access the corresponding set in each machine room, or copies can be placed across machine rooms so that a user reads the copy located in the user's own machine room. However, deploying multiple sets of distributed storage is costly, and cross-machine-room copy placement only covers reading the local copy after placement; other scenarios are not covered. Therefore, the cross-machine-room traffic problem of distributed storage needs to be solved to achieve stable and reliable multi-cloud deployment.
Disclosure of Invention
The application provides a method and a device for placing distributed storage copies across machine rooms based on copy redundancy, which are used for solving the cross-machine-room traffic problem of distributed storage, so that stable and reliable multi-cloud deployment is realized.
In a first aspect, the present application provides a method for placing a distributed storage copy across a machine room based on copy redundancy, where the method is applied to a distributed storage system, and the method includes:
deploying a plurality of metadata nodes into a first machine room in the distributed storage system, and determining a main metadata node;
setting a copy placement strategy for a file to be deployed, wherein the copy placement strategy comprises the following fields: mainAz, nativeWrite, azPolicy, disableRedundantDelete and disableBlockRecovery; the mainAz field is used for determining the main machine room, the nativeWrite field is used for determining whether a copy is written into a second machine room, the azPolicy field is used for determining copy distribution, the disableRedundantDelete field is used for temporarily disabling the redundant copy deletion mechanism under a certain directory, and the disableBlockRecovery field is used for temporarily disabling the lost copy recovery mechanism under the certain directory;
responding to a writing request initiated by a client, and completing the placement of the files to be deployed across machine rooms by the main metadata node according to the copy placement strategy of the files to be deployed;
and acquiring access data of the file to be deployed, and modifying the copy placement strategy of the file to be deployed based on the access data of the file to be deployed.
According to the distributed storage copy cross-machine room placement method based on copy redundancy, the main metadata node completes cross-machine room placement of the files to be deployed according to the copy placement strategy of the files to be deployed, and the method comprises the following steps:
obtaining a copy placement strategy and the current deployment copy number of a file to be deployed, and recording the copy placement strategy and the current deployment copy number in the main metadata node;
determining the initial copy number and the copy distribution position of the file to be deployed based on the azPolicy field, wherein the azPolicy field consists of machine room identifiers, the same machine room identifiers are adjacent, and the initial copy number is the number of the machine room identifiers;
determining the main machine room based on the mainAz field, wherein the main machine room is used for deploying redundant copies of the file to be deployed;
if the current deployment copy number is less than or equal to the initial copy number, sequentially deploying the copies of the file to be deployed based on the azPolicy field;
if the current deployment copy number is greater than the initial copy number, calculating the redundant copy number, sequentially deploying the initial copy number of copies of the file to be deployed based on the azPolicy field, and deploying the redundant copy number of copies of the file to be deployed in the main machine room.
According to the distributed storage copy cross-machine room placement method based on copy redundancy provided by the application, modifying the copy placement policy of the file to be deployed includes adding copies, and modifying the copy placement policy of the file to be deployed based on access data of the file to be deployed includes: based on the access data of the files to be deployed, the client initiates a copy adding request, wherein the copy adding request comprises the copy number to be added; responding to a copy adding request initiated by a client, and determining the copy number to be added by the main metadata node; and modifying the azPolicy field of the current copy placement strategy according to the order from left to right based on the number of copies to be added and the number of the current deployment copies, so as to obtain a modified copy placement strategy.
According to the distributed storage copy cross-machine room placement method based on copy redundancy provided by the application, modifying the copy placement policy of the file to be deployed includes deleting copies, and modifying the copy placement policy of the file to be deployed based on access data of the file to be deployed includes: based on the access data of the files to be deployed, a client initiates a copy deletion request, wherein the copy deletion request contains the number of copies to be deleted; responding to a request of deleting copies initiated by a client, and determining the copy number to be deleted by the main metadata node; and modifying the azPolicy field of the current copy placement strategy according to the order from right to left based on the copy number to be deleted and the current deployment copy number, so as to obtain a modified copy placement strategy.
According to the distributed storage copy cross-machine room placement method based on copy redundancy, the method further comprises the following steps:
responding to a calculation task request initiated by a client, and determining an original file matched with the calculation task request by the main metadata node; the calculation task request comprises reading an original file, writing an intermediate result, reading the intermediate result and writing a final result;
obtaining a copy placement strategy of the original file;
and modifying the copy placement strategy of the original file to enable the intermediate result generated after the client reads the original file to be stored in the second machine room.
According to the distributed storage copy cross-machine room placement method based on copy redundancy provided by the application, modifying the copy placement strategy of the original file, so that an intermediate result generated after the client reads the original file is stored in the second machine room, comprises: adding a temporary file placement strategy, wherein the temporary file placement strategy comprises the fields mainAz, nativeWrite, azPolicy, disableRedundantDelete and disableBlockRecovery, and the nativeWrite field is set to true; the temporary file placement strategy is used for storing the intermediate result generated after the client reads the original file.
According to the distributed storage copy cross-machine room placement method based on copy redundancy, the first machine room is any one of the multiple machine rooms of the distributed storage system, and the second machine room is the machine room where the client initiating the write request or the computing task request is located.
In a second aspect, the present application further provides a device for placing a distributed storage copy across a machine room based on copy redundancy, where the device is applied to a distributed storage system, and the device includes:
the node deployment module is used for deploying a plurality of metadata nodes into a first machine room in the distributed storage system and determining a main metadata node;
the copy setting module is used for setting a copy placement strategy for a file to be deployed, wherein the copy placement strategy comprises the following fields: mainAz, nativeWrite, azPolicy, disableRedundantDelete and disableBlockRecovery; the mainAz field is used for determining the main machine room, the nativeWrite field is used for determining whether a copy is written into a second machine room, the azPolicy field is used for determining copy distribution, the disableRedundantDelete field is used for temporarily disabling the redundant copy deletion mechanism under a certain directory, and the disableBlockRecovery field is used for temporarily disabling the lost copy recovery mechanism under the certain directory;
the copy placement module is used for responding to a write request initiated by the client, whereupon the main metadata node completes the cross-machine-room placement of the file to be deployed according to the copy placement strategy of the file to be deployed;
the copy modification module is used for acquiring access data of the files to be deployed and modifying a copy placement strategy of the files to be deployed based on the access data of the files to be deployed.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes a memory and a processor, where the memory stores a computer program, and when the processor runs the computer program, the processor executes the steps in any implementation manner of the distributed storage copy placement method based on copy redundancy across a machine room.
In a fourth aspect, embodiments of the present application further provide a readable storage medium having a computer program stored therein, where the computer program, when executed on a processor, performs the steps in any implementation of the above-described distributed storage copy placement method based on copy redundancy across a machine room.
In summary, the method and the device for placing distributed storage copies across machine rooms based on copy redundancy provided by the embodiments of the application analyze the scenarios that generate cross-machine-room traffic and set a copy placement strategy with multiple fields, so that the distributed storage system can sense the copy distribution through the copy placement strategy. The copy placement strategy also addresses cross-machine-room traffic in the scenarios of file reading, file writing, copy recovery, copy deletion and computing node task traffic, thereby greatly reducing cross-machine-room traffic and providing strong support for stable and reliable multi-cloud deployment.
Drawings
To describe the technical solutions of the present application or of the prior art more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a distributed storage copy cross-machine room placement method based on copy redundancy provided by the application;
FIG. 2 is a flow chart of a method for optimizing task traffic of a computing node provided by the present application;
FIG. 3 is a schematic structural diagram of a distributed storage copy cross-machine room placement device based on copy redundancy provided in the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Reference numerals: 300-network topology system; 310-node deployment module; 320-copy setting module; 330-copy placement module; 340-copy modification module; 400-electronic device; 410-memory; 420-processor; 430-bus.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the present application, and it is apparent that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
Distributed storage system: a system that provides a massive file storage service, organizes files in the form of a directory tree, and gives users an access mode similar to a Linux path. Common open source implementations are HDFS (Hadoop Distributed File System), Alluxio, etc.
Distributed storage based on copy redundancy: distributed file storage in which a file has multiple copies and each copy is independently usable; as long as one copy is available, the file is considered available.
Metadata node: the node in distributed storage that stores metadata and receives users' metadata requests, such as viewing, creating or deleting a directory. There are generally several metadata nodes, which provide a highly available service in master-slave mode and do not serve actual data reads or writes.
Data node: the node in distributed storage that actually stores the data and receives users' requests to read and write data, such as reading a file or writing a file.
Region: mainly a physical area, such as a city or county. One Region may have multiple machine rooms. Dedicated lines across Regions have relatively high cost and latency.
Machine room (Zone): a physical machine room, typically made up of thousands to tens of thousands of machines.
Availability Zone (AZ for short): an available area composed of a plurality of Zones. AZs are connected to one another by dedicated lines, and the distance between AZs is generally not too great (< 100 km). In general, cross-machine-room traffic refers to the traffic on the dedicated lines between AZs.
An analysis of cross-machine-room traffic shows that it generally occurs in the following five scenarios: file reading, file writing, copy recovery, copy deletion and computing node task traffic.
(1) Scenario one: file reading
A client initiates a file read request, for example a request to read file 1, but no copy of file 1 exists in the machine room where the client is located. The client then has to read a copy of file 1 from another machine room, that is, the read request is completed across machine rooms, which generates cross-machine-room traffic.
(2) Scenario two: file writing
A client initiates a file write request. For example, the client is located in machine room 1 and the write request is to write two copies of file 1, one in machine room 1 and one in machine room 2. The copy written to machine room 2 has to be written across machine rooms, which generates cross-machine-room traffic.
(3) Scenario three: copy recovery
A distributed storage system generally has a lost copy recovery mechanism. When a copy is damaged or a data node goes down, the actual number of copies on the data nodes is smaller than the number of copies recorded by the metadata node, and the distributed storage system triggers the lost copy recovery mechanism. During copy recovery, copy data is generally copied between data nodes to restore the lost copy; if the data nodes copying from each other do not belong to the same machine room, cross-machine-room traffic is generated. In extreme cases, such as a large number of data nodes in one machine room going down, the lost copy recovery mechanism generates even larger cross-machine-room traffic.
(4) Scenario four: copy deletion
A distributed storage system generally has a redundant copy deletion mechanism. When copies are added in response to a copy adjustment requirement, the actual number of copies on the data nodes is greater than the number of copies recorded by the metadata node, and the distributed storage system triggers the redundant copy deletion mechanism. However, there is a time lag between adding the copy and updating the copy number recorded by the metadata node. If a newly added copy was copied across machine rooms and is then treated as a redundant copy and deleted, it has to be copied across machine rooms again, which generates cross-machine-room traffic.
(5) Scenario five: computing node task traffic
When a user performs computation based on distributed storage, the computation is generally divided into multiple steps, so the computation carried out by the client may involve multiple data nodes in the same machine room, and each of them may need to read related files across machine rooms. If reading the original file, writing an intermediate result, reading the intermediate result or writing the final result involves cross-machine-room reads or writes, cross-machine-room traffic is generated.
Therefore, in order to address the cross-machine-room traffic generated in the above five scenarios, a copy placement strategy can be set and modified so that the distributed storage system is aware of the copy distribution and cross-machine-room traffic is avoided, thereby realizing stable and reliable multi-cloud deployment.
Fig. 1 is a flow chart of a method for placing a distributed storage copy across a machine room, which is provided in the present application and is applied to a distributed storage system, as shown in fig. 1, and the method includes:
Step 101, deploying a plurality of metadata nodes into a first machine room in the distributed storage system, and determining a main metadata node.
The first machine room is any one of a plurality of machine rooms of the distributed storage system.
Specifically, one machine room is randomly selected from the plurality of machine rooms in the distributed storage system as the first machine room, and a plurality of metadata nodes are deployed in it. One metadata node is selected as the main metadata node and the remaining metadata nodes serve as slave metadata nodes, so that master-slave synchronization does not cross machine rooms and consistency and availability across the machine rooms are maintained. In some examples, after the main metadata node goes down, one of the remaining metadata nodes takes over and continues to provide service. The setup may follow the HDFS dual-node approach: of two nodes, one is in the active state and the other in the standby (backup) state, and when the active metadata node goes down, the standby node is immediately switched to active.
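As a non-limiting illustration, the active/standby arrangement of step 101 can be sketched in Python as follows. The class names MetadataNode and MetadataGroup, the node names nn1 and nn2, and the rule of promoting the first live standby are assumptions made for this sketch only; the application does not prescribe an election procedure.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataNode:
    name: str
    room: str
    alive: bool = True


@dataclass
class MetadataGroup:
    """Hypothetical group of metadata nodes that all live in one machine room."""
    nodes: list
    active: MetadataNode = field(init=False)

    def __post_init__(self):
        # Keeping every metadata node in the same room means that
        # master-slave synchronization never crosses machine rooms.
        rooms = {node.room for node in self.nodes}
        if len(rooms) != 1:
            raise ValueError("metadata nodes must share one machine room")
        self.active = self.nodes[0]

    def failover(self):
        """Keep the active node if it is alive, else promote the first live standby."""
        if self.active.alive:
            return self.active
        for node in self.nodes:
            if node.alive:
                self.active = node
                return node
        raise RuntimeError("no live metadata node")


group = MetadataGroup([MetadataNode("nn1", "az1"), MetadataNode("nn2", "az1")])
group.nodes[0].alive = False   # simulate the active node going down
promoted = group.failover()    # the standby node takes over
```

In this sketch the group refuses to start if its nodes span more than one machine room, which mirrors the requirement that all metadata nodes are deployed in the first machine room.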
Step 102, setting a copy placement strategy for a file to be deployed, wherein the copy placement strategy comprises the following fields: mainAz, nativeWrite, azPolicy, disableRedundantDelete and disableBlockRecovery.
The mainAz field is used for determining the main machine room, the nativeWrite field is used for determining whether a copy is written into the second machine room, the azPolicy field is used for determining copy distribution, the disableRedundantDelete field is used for temporarily disabling the redundant copy deletion mechanism under a certain directory, and the disableBlockRecovery field is used for temporarily disabling the lost copy recovery mechanism under a certain directory.
Specifically, different copy placement policies can be set for different files or directories. A path key is set in the main metadata node for each file or directory, and the corresponding copy placement policy is stored under that path key, so that the copy placement policy of a file to be deployed can later be queried by path key. The copy placement policy may be described in the form of a JSON string, for example,
{
  "/":
  {
    "mainAz": "az1",
    "nativeWrite": false,
    "azPolicy": "az1,az1,az2",
    "disableRedundantDelete": false,
    "disableBlockRecovery": false
  }
}
wherein the value az1 of the mainAz field indicates that the main machine room is az1; the value "az1,az1,az2" of the azPolicy field indicates that the copy distribution when writing a file is az1, az1, az2, i.e., a first data node is selected from az1 to place the first copy, a second data node is selected from az1 to place the second copy, and a third data node is selected from az2 to place the third copy; the nativeWrite, disableRedundantDelete and disableBlockRecovery fields are all configured as false, indicating that the functions of these fields are not enabled. Conversely, if the nativeWrite, disableRedundantDelete or disableBlockRecovery field is configured as true, the function of the corresponding field is enabled.
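As a non-limiting illustration, querying a copy placement policy by path key can be sketched in Python as follows. The function name find_policy, the "/tmp" entry and the rule of falling back to the nearest ancestor directory are assumptions made for this sketch; the application only states that the policy is stored under a path key and queried by that key.

```python
import json
import posixpath

# The "/" entry is the example policy from the text; the "/tmp" entry is hypothetical.
POLICIES_JSON = """
{
  "/": {"mainAz": "az1", "nativeWrite": false, "azPolicy": "az1,az1,az2",
        "disableRedundantDelete": false, "disableBlockRecovery": false},
  "/tmp": {"mainAz": "az1", "nativeWrite": true, "azPolicy": "az1,az1,az2",
           "disableRedundantDelete": false, "disableBlockRecovery": false}
}
"""


def find_policy(policies, path):
    """Return (path_key, policy) for the nearest path key at or above `path`."""
    current = posixpath.normpath(path)
    while True:
        if current in policies:
            return current, policies[current]
        if current == "/":
            raise KeyError(f"no copy placement policy covers {path}")
        current = posixpath.dirname(current)  # walk up one directory level


policies = json.loads(POLICIES_JSON)
key, policy = find_policy(policies, "/warehouse/table1/part-0")
room_list = policy["azPolicy"].split(",")  # ordered machine room identifiers
```

Here a file under /warehouse has no path key of its own, so the sketch resolves it to the policy stored under "/".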
It should be noted that, when multiple copies of a file to be deployed are deployed to a certain machine room according to a copy placement policy of the file to be deployed, the distributed storage system itself has a corresponding mechanism to ensure that the multiple copies are placed at different nodes of the machine room.
Specifically, the copy placement strategy can also address the cross-machine-room traffic generated in scenario three or scenario four. The main principle is as follows. Setting the disableRedundantDelete field to true disables the redundant copy deletion mechanism during copy recovery, so that a recovered copy is not identified as a redundant copy and deleted, which prevents additional cross-machine-room traffic during copy recovery. In the extreme case where a large number of data nodes are down, the copy recovery mechanism would cause a large amount of cross-machine-room traffic in the cluster; setting the disableBlockRecovery field to true temporarily disables the copy recovery mechanism, which gives the distributed storage system enough time to repair the down data nodes. Once the nodes are repaired, the number of actually deployed copies again matches the number of copies recorded by the metadata node, so no copy recovery is performed and no cross-machine-room traffic is generated.
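As a non-limiting illustration, the effect of the two disabling fields can be sketched in Python as a decision made when the recorded and actual copy numbers differ. The function name reconcile_action and the returned action labels are assumptions made for this sketch and do not correspond to any existing storage system API.

```python
def reconcile_action(recorded, actual, policy):
    """Decide what to do for one file given recorded vs. actual copy numbers.

    `policy` is a dict holding the copy placement policy fields of the directory.
    """
    if actual < recorded:
        # Copies were lost: recover them unless recovery is temporarily disabled.
        if policy.get("disableBlockRecovery", False):
            return "skip_recovery"
        return "recover_copy"
    if actual > recorded:
        # Extra copies exist: delete them unless deletion is temporarily disabled.
        if policy.get("disableRedundantDelete", False):
            return "skip_delete"
        return "delete_redundant"
    return "none"


normal = {"disableRedundantDelete": False, "disableBlockRecovery": False}
outage = {"disableRedundantDelete": True, "disableBlockRecovery": True}
actions = [
    reconcile_action(3, 2, normal),  # lost copy, mechanisms enabled
    reconcile_action(3, 2, outage),  # lost copy, recovery disabled
    reconcile_action(3, 4, outage),  # extra copy, deletion disabled
]
```

With both fields set to true, neither mechanism moves data, so a mass node outage in one machine room produces no recovery traffic across machine rooms while the nodes are being repaired.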
It should be noted that each field in the copy placement policy may be understood as a configuration item whose function is implemented in code; in practical applications, the function of each field may also be modified. In some embodiments, the copy placement policy may also be described in a properties file, a YAML file or an XML file.
Step 103, in response to a write request initiated by the client, the main metadata node completes the cross-machine-room placement of the file to be deployed according to the copy placement strategy of the file to be deployed.
The write-in request comprises files to be deployed and the current deployment copy number.
Specifically, it may be understood that the main metadata node completes the placement of the files to be deployed across machine rooms according to the copy placement policy of the files to be deployed, including the following steps:
Step 1031, obtaining the copy placement strategy and the current deployment copy number of the file to be deployed, and recording the copy placement strategy and the current deployment copy number in the main metadata node.
Step 1032, determining an initial copy number and a copy distribution position of the file to be deployed based on the azPolicy field, where the azPolicy field is composed of machine room identifiers, the same machine room identifiers are adjacent, and the initial copy number is the number of the machine room identifiers.
Specifically, the machine room identifiers may be az1, az2 and az3, and "the same machine room identifiers are adjacent" means that identical machine rooms occupy adjacent positions. For example, when the azPolicy field is "az1,az1,az2" and a file is written in pipeline form, a first data node is selected from az1 to write the first copy; a second data node is selected from az1 and copies the data from the first data node, and since the first and second nodes both belong to machine room az1, no dedicated-line traffic is generated; a third data node is selected from az2 and copies the data from the second data node. For the file writing of scenario two, the azPolicy field arranges the copy positions so that identical machine rooms are adjacent, which reduces the number of machine room crossings during a write and therefore the cross-machine-room traffic. For the copy recovery of scenario three, recovery follows the copy distribution order of the azPolicy field; because identical machine rooms are adjacent, a copy lost on a down data node can be restored by copying from a data node in the same machine room, which avoids cross-machine-room traffic.
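As a non-limiting illustration, the benefit of keeping identical machine room identifiers adjacent can be checked by counting how often a write pipeline crosses machine rooms. The function name cross_room_hops and the idea of starting the pipeline from the client's machine room are assumptions made for this Python sketch.

```python
def cross_room_hops(client_room, az_policy):
    """Count machine room crossings when copies are written one after another.

    The pipeline is assumed to start at the client's machine room and then
    follow the azPolicy order, each node copying from the previous one.
    """
    rooms = [room.strip() for room in az_policy.split(",") if room.strip()]
    hops = 0
    previous = client_room
    for room in rooms:
        if room != previous:
            hops += 1  # this copy travels over a dedicated line
        previous = room
    return hops


adjacent = cross_room_hops("az1", "az1,az1,az2")     # identical rooms adjacent
interleaved = cross_room_hops("az1", "az1,az2,az1")  # identical rooms split up
```

Under these assumptions the adjacent ordering crosses machine rooms once, while the interleaved ordering crosses twice for the same set of copies.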
Step 1033, determining the main machine room based on the mainAz field, wherein the main machine room is used for deploying redundant copies of the file to be deployed.
Specifically, the main machine room is configured through the mainAz field: a mainAz value of az1 indicates that the main machine room is az1, and a mainAz value of az2 indicates that the main machine room is az2. If the current deployment copy number exceeds the initial copy number configured by azPolicy, the extra copies are placed in the main machine room, which avoids write failures caused by the copy placement policy and ensures the write success rate.
Step 1034, if the current deployment copy number is less than or equal to the initial copy number, sequentially deploying copies of the file to be deployed based on the azPolicy field; if the current deployment copy number is greater than the initial copy number, calculating the redundant copy number, sequentially deploying the initial copy number of copies of the file to be deployed based on the azPolicy field, and deploying the redundant copy number of copies of the file to be deployed in the main machine room.
Specifically, for example, the file to be deployed in the write request is file 1, and in the copy placement policy of file 1 the mainAz field is az1 and the azPolicy field is "az1, az2, az2", so the initial copy number is 3. If the current deployed copy number of file 1 is 2, one copy of file 1 is deployed in machine room az1 and one in machine room az2, in that order. If the current deployed copy number of file 1 is 5, the redundant copy number must be calculated; it is the difference between the current deployed copy number of the file to be deployed and the initial copy number, that is, 5 - 3 = 2. Copies of file 1 are then deployed in machine rooms az1, az2 and az2 in sequence based on the azPolicy field, the main machine room is determined to be az1 through the mainAz field, and the 2 redundant copies are deployed in machine room az1.
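The rule of step 1034 can be sketched as follows. This is a minimal illustration under the assumption that the azPolicy field is a comma-separated string; the function name place_copies is hypothetical and does not appear in the present application.

```python
def place_copies(main_az: str, az_policy: str, current_copies: int) -> list[str]:
    """Return the machine room of each copy, in deployment order."""
    if current_copies < 0:
        raise ValueError("current_copies must be non-negative")
    rooms = [az.strip() for az in az_policy.split(",") if az.strip()]
    initial_copies = len(rooms)  # the number of machine room identifiers
    if current_copies <= initial_copies:
        # Deploy sequentially, following the azPolicy order.
        return rooms[:current_copies]
    # Extra copies beyond the initial copy number go to the main machine room.
    redundant_copies = current_copies - initial_copies
    return rooms + [main_az] * redundant_copies
```

For file 1 above, place_copies("az1", "az1, az2, az2", 2) gives az1, az2, and place_copies("az1", "az1, az2, az2", 5) gives az1, az2, az2, az1, az1.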
Step 104, acquiring access data of the file to be deployed, and modifying the copy placement policy of the file to be deployed based on the access data of the file to be deployed.
The access data of the files to be deployed can be monitored in real time or collected periodically. In some embodiments, the access data of the files to be deployed may be read through an audit log of the metadata node, where the audit log records information such as files accessed by each user, request IP/machine room, and the like.
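As a rough sketch of how such access data might be aggregated, the following assumes a purely hypothetical audit log format of one tab-separated record per access (user, requesting machine room, file path); the actual audit log format of the metadata node is not specified here. It counts accesses per file and machine room, and lists the machine rooms that access a file without holding a copy of it.

```python
from collections import Counter


def reads_per_room(audit_lines):
    """Count accesses per (file path, machine room) from hypothetical audit records."""
    counts = Counter()
    for line in audit_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip records that do not match the assumed format
        _user, client_az, path = parts
        counts[(path, client_az)] += 1
    return counts


def rooms_without_local_copy(counts, path, copy_rooms):
    """List machine rooms that access `path` but are not among `copy_rooms`."""
    return sorted({az for (p, az) in counts if p == path and az not in copy_rooms})
```

A machine room returned by rooms_without_local_copy is a candidate for receiving an additional copy, subject to whatever thresholds an operator chooses.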
Specifically, modifying the copy placement policy of the file to be deployed may include adding copies, in which case modifying the copy placement policy of the file to be deployed based on the access data of the file to be deployed includes:
Step a1, based on the access data of the file to be deployed, the client initiates an add-copy request, wherein the add-copy request contains the number of copies to be added.
In some embodiments, when monitoring shows that the access demand for a file has increased, or that a machine room reads a file of which it holds no local copy, an add-copy request may be initiated to adjust the copy placement policy and ensure file utilization. In addition, data that may be read across machine rooms is configured to be written into multiple machine rooms, so that the data has a copy in each of those machine rooms, which greatly reduces the cross-machine-room traffic when the file is read.
Step a2, in response to the add-copy request initiated by the client, the master metadata node determines the number of copies to be added.
Step a3, based on the number of copies to be added and the current deployed copy number, modifying the azPolicy field of the current copy placement policy in left-to-right order to obtain a modified copy placement policy.
For example, the azPolicy field of the copy placement policy of a file is "az1, az2, az2, az3", but the current deployed copy number recorded by the master metadata node is 2 (that is, the copies actually deployed are distributed in az1 and az2). If the add-copy request specifies 1 copy to be added, the azPolicy field is modified in left-to-right order, giving the modified azPolicy field "az1, az2, az2"; if the add-copy request specifies 2 copies to be added, the azPolicy field is modified in left-to-right order, giving the modified azPolicy field "az1, az2, az2, az3".
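The left-to-right modification of step a3 can be sketched as below. The function name add_copies is hypothetical, and the behavior when the new copy number exceeds the length of the configured azPolicy field (the field is left at its full length, with the surplus treated as redundant copies) is an assumption made for this sketch.

```python
def add_copies(az_policy: str, current_copies: int, copies_to_add: int) -> tuple[str, int]:
    """Return the modified azPolicy field and the updated deployed copy number."""
    if current_copies < 0 or copies_to_add < 0:
        raise ValueError("copy numbers must be non-negative")
    rooms = [az.strip() for az in az_policy.split(",") if az.strip()]
    new_count = current_copies + copies_to_add
    # Take entries from left to right; slicing stops at the end of the field.
    modified = rooms[:new_count]
    return ", ".join(modified), new_count
```

With the field "az1, az2, az2, az3" and 2 copies deployed, adding 1 copy gives "az1, az2, az2" and adding 2 copies gives "az1, az2, az2, az3", matching the example above.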
Specifically, modifying the copy placement policy of the file to be deployed may also include deleting copies, in which case modifying the copy placement policy of the file to be deployed based on the access data of the file to be deployed includes:
Step b1, based on the access data of the file to be deployed, the client initiates a delete-copy request, wherein the delete-copy request contains the number of copies to be deleted.
In some embodiments, when monitoring shows that the access demand for a file has decreased, a delete-copy request may be initiated to adjust the copy placement policy and ensure file utilization.
Step b2, in response to the delete-copy request initiated by the client, the master metadata node determines the number of copies to be deleted.
Step b3, based on the number of copies to be deleted and the current deployed copy number, modifying the azPolicy field of the current copy placement policy in right-to-left order to obtain a modified copy placement policy.
For example, the azPolicy field of the copy placement policy of a file is "az1, az2, az2, az3", and the current deployed copy number recorded by the master metadata node is 4. If the delete-copy request specifies 1 copy to be deleted, the azPolicy field is modified in right-to-left order, giving the modified azPolicy field "az1, az2, az2"; if the delete-copy request specifies 2 copies to be deleted, the azPolicy field is modified in right-to-left order, giving the modified azPolicy field "az1, az2".
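The right-to-left modification of step b3 can be sketched in the same style. The function name delete_copies is hypothetical; the sketch assumes that any redundant copies beyond the length of the azPolicy field are removed before entries of the field itself, which is an assumption rather than something stated in the present application.

```python
def delete_copies(az_policy: str, current_copies: int, copies_to_delete: int) -> tuple[str, int]:
    """Return the modified azPolicy field and the updated deployed copy number."""
    if copies_to_delete < 0 or copies_to_delete > current_copies:
        raise ValueError("copies_to_delete must be between 0 and current_copies")
    rooms = [az.strip() for az in az_policy.split(",") if az.strip()]
    new_count = current_copies - copies_to_delete
    # Keeping the leftmost new_count entries drops entries from right to left.
    modified = rooms[:new_count]
    return ", ".join(modified), new_count
```

With the field "az1, az2, az2, az3" and 4 copies deployed, deleting 1 copy gives "az1, az2, az2" and deleting 2 copies gives "az1, az2". The second element of the result is the value the master metadata node would record as the new deployed copy number.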
It should be noted that, after the copy placement policy of the file to be deployed has been modified based on the access data of the file to be deployed, the master metadata node must update the current deployed copy number that it records, according to the number of copies added or the number of copies deleted.
According to the distributed storage copy cross-machine-room placement method based on copy redundancy, the scenarios that generate cross-machine-room traffic are analyzed and a multi-field copy placement policy is set. The mainAz field determines the main machine room that stores redundant copies, which guarantees the write success rate of file copies. The nativeWrite field determines the second machine room, in which temporary files are stored so that computing task requests do not generate unnecessary cross-machine-room traffic. The azPolicy field determines the copy distribution, and copies are added or deleted strictly in the copy distribution order of the azPolicy field, which reduces cross-machine-room traffic during file writing and copy recovery. The disableRedundantDelete field disables the redundant copy deletion mechanism, which prevents recovered copies from being deleted by mistake during copy recovery and thus avoids generating cross-machine-room traffic. The disableBlockRecovery field disables the lost copy recovery mechanism, which avoids generating a large amount of cross-machine-room traffic during copy recovery.
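For orientation, the five fields of the copy placement policy can be modeled as a small record type. This is a sketch only: the class name, the snake_case attribute names and the from_dict helper are hypothetical, and only the five field names themselves come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyPlacementPolicy:
    main_az: str                    # mainAz: main machine room for redundant copies
    native_write: bool              # nativeWrite: write copies where the client is
    az_policy: tuple[str, ...]      # azPolicy: ordered copy distribution
    disable_redundant_delete: bool = False  # disableRedundantDelete
    disable_block_recovery: bool = False    # disableBlockRecovery

    @classmethod
    def from_dict(cls, raw: dict) -> "CopyPlacementPolicy":
        """Build a policy from a mapping keyed by the field names used in the text."""
        rooms = tuple(az.strip() for az in raw["azPolicy"].split(",") if az.strip())
        return cls(
            main_az=raw["mainAz"],
            native_write=bool(raw["nativeWrite"]),
            az_policy=rooms,
            disable_redundant_delete=bool(raw.get("disableRedundantDelete", False)),
            disable_block_recovery=bool(raw.get("disableBlockRecovery", False)),
        )

    @property
    def initial_copies(self) -> int:
        """The initial copy number is the number of machine room identifiers."""
        return len(self.az_policy)
```

Defaulting the two disable fields to false when they are absent is a design choice of this sketch, made so that deletion and recovery stay enabled unless explicitly switched off.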
In summary, by setting the multi-field copy placement policy, the distributed storage system can perceive the copy distribution through the policy; the policy also addresses cross-machine-room traffic in the file reading, file writing, copy recovery and copy deletion scenarios, greatly reduces cross-machine-room traffic, and provides strong support for stable and reliable multi-cloud deployment.
Fig. 2 is a flow chart of the computing node task flow optimization method provided by the present application. As shown in Fig. 2, the method further includes optimizing the cross-machine-room traffic generated in scenario five, and specifically includes the following steps:
Step 201, in response to a computing task request initiated by a client, the master metadata node determines the original file that matches the computing task request.
The computing task request includes reading the original file, writing an intermediate result, reading the intermediate result and writing a final result.
Step 202, obtaining the copy placement policy of the original file.
Step 203, modifying the copy placement policy of the original file, so that the intermediate result generated after the client reads the original file is stored in the second machine room.
Specifically, the second machine room is the machine room where the client initiating the write request is located, or the machine room where the client initiating the computing task request is located; in some embodiments, the second machine room may also be the machine room where the client initiating the read request is located.
Specifically, modifying the copy placement policy of the original file includes adding a temporary file placement policy. The temporary file placement policy includes the mainAz, nativeWrite, azPolicy, disableRedundantDelete and disableBlockRecovery fields, with the nativeWrite field set to true; the temporary file placement policy is used for storing the intermediate result generated after the client reads the original file.
An example of the temporary file placement policy is:
{
  "/tmp": {
    "mainAz": "az1",
    "nativeWrite": true,
    "azPolicy": "az1,az1,az1",
    "disableRedundantDelete": false,
    "disableBlockRecovery": false
  }
}
Notably, the nativeWrite field has the highest configuration priority: when the nativeWrite field is configured as true, the azPolicy field and the mainAz field are ignored. The nativeWrite field is intended specifically for paths that do not participate in downstream computation; once it is enabled, copies are written into whichever machine room the client initiating the request is located in. Therefore, if the client initiating the computing task request enables the nativeWrite field through the temporary file placement policy, all temporary files are stored in the machine room where that client is located (that is, the second machine room), and the temporary files are deleted after the computing task request is completed.
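The priority of the nativeWrite field can be sketched as follows. The function name target_rooms and its parameters are hypothetical; the sketch assumes the policy is available as a mapping with the field names used above and that the number of copies to write is supplied by the caller.

```python
def target_rooms(policy: dict, client_az: str, copies: int) -> list[str]:
    """Return the machine room of each copy for a write issued from client_az."""
    if copies < 0:
        raise ValueError("copies must be non-negative")
    if policy.get("nativeWrite", False):
        # Highest priority: azPolicy and mainAz are ignored, and every copy
        # is written to the machine room of the requesting client.
        return [client_az] * copies
    rooms = [az.strip() for az in policy["azPolicy"].split(",") if az.strip()]
    if copies <= len(rooms):
        return rooms[:copies]
    # Copies beyond the initial copy number go to the main machine room.
    return rooms + [policy["mainAz"]] * (copies - len(rooms))
```

With the "/tmp" policy above and a client located in az2, three copies are all placed in az2; with nativeWrite set to false, the same request would follow the azPolicy field and place them in az1.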
According to the computing node task flow optimization method, a temporary file placement policy is added to modify the copy placement policy, and the nativeWrite field is used to store temporary files in the second machine room, so that cross-machine-room traffic is avoided and the cross-machine-room traffic problem in the computing node task flow scenario is solved.
Fig. 3 is a schematic structural diagram of a distributed storage copy placement device based on copy redundancy, which is provided in the present application and may be used to implement the method described in the foregoing embodiments. As shown in fig. 3, the apparatus is applied to a distributed storage system, and the apparatus includes:
a node deployment module 310, configured to deploy a plurality of metadata nodes into a first machine room in the distributed storage system, and determine a master metadata node;
a copy setting module 320, configured to set a copy placement policy for a file to be deployed, where the copy placement policy includes the mainAz, nativeWrite, azPolicy, disableRedundantDelete and disableBlockRecovery fields; the mainAz field is used for determining the main machine room, the nativeWrite field is used for determining whether a copy is written into a second machine room, the azPolicy field is used for determining the copy distribution, the disableRedundantDelete field is used for temporarily disabling the redundant copy deletion mechanism under a certain directory, and the disableBlockRecovery field is used for temporarily disabling the lost copy recovery mechanism under a certain directory;
The copy placement module 330 is configured to respond to a write request initiated by a client, where the master metadata node completes placement of the file to be deployed across machine rooms according to a copy placement policy of the file to be deployed;
the copy modification module 340 is configured to obtain access data of a file to be deployed, and modify a copy placement policy of the file to be deployed based on the access data of the file to be deployed.
For a detailed description of the above distributed storage copy cross-machine-room placement device based on copy redundancy, refer to the description of the related method steps in the above embodiments; repeated details are omitted. The device embodiments described above are merely illustrative: a "module" illustrated as a separate component may or may not be physically separate, and may be a combination of software and/or hardware that implements the intended function. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
Fig. 4 is a schematic structural diagram of an electronic device provided in the present application. As shown in Fig. 4, the electronic device 400 includes a memory 410 and a processor 420 connected through a bus 430; the memory 410 stores a computer program, and when the processor 420 reads and runs the computer program, the electronic device 400 can execute all or part of the flow of the method in the above embodiments, so as to implement the cross-machine-room placement of distributed storage copies based on copy redundancy.
The embodiment of the application also provides a readable storage medium, wherein the readable storage medium stores a computer program which, when run on a processor, executes the steps of the distributed storage copy cross-machine-room placement method based on copy redundancy.
It should be understood that the electronic device may be an electronic device with a logic computing function, such as a personal computer, a tablet computer or a smart phone; the readable storage medium may be a ROM (Read-Only Memory), a RAM (Random Access Memory), a magnetic disk, an optical disk, or the like.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and they should not fall within the scope of the present invention.

Claims (9)

1. A method for placing a distributed storage copy across a machine room based on copy redundancy, the method being applied to a distributed storage system, the method comprising:
deploying a plurality of metadata nodes into a first machine room in the distributed storage system, and determining a main metadata node;
setting a copy placement strategy for a file to be deployed, wherein the copy placement strategy comprises the mainAz, nativeWrite, azPolicy, disableRedundantDelete and disableBlockRecovery fields; the mainAz field is used for determining the main machine room, the nativeWrite field is used for determining whether a copy is written into a second machine room, the azPolicy field is used for determining copy distribution, the disableRedundantDelete field is used for temporarily disabling a redundant copy deletion mechanism under a certain directory, and the disableBlockRecovery field is used for temporarily disabling a lost copy recovery mechanism under the certain directory;
responding to a writing request initiated by a client, and completing the placement of the files to be deployed across machine rooms by the main metadata node according to the copy placement strategy of the files to be deployed;
acquiring access data of a file to be deployed, and modifying a copy placement strategy of the file to be deployed based on the access data of the file to be deployed;
The main metadata node completes the placement of the files to be deployed across machine rooms according to the copy placement strategy of the files to be deployed, and the method comprises the following steps:
obtaining a copy placement strategy and the current deployment copy number of a file to be deployed, and recording the copy placement strategy and the current deployment copy number in the main metadata node;
determining the initial copy number and the copy distribution position of the file to be deployed based on the azPolicy field, wherein the azPolicy field consists of machine room identifiers, the same machine room identifiers are adjacent, and the initial copy number is the number of the machine room identifiers;
determining a main machine room based on the mainAz field, wherein the main machine room is used for deploying redundant copies of the files to be deployed;
if the current deployment copy number is less than or equal to the initial copy number, sequentially deploying the copies of the file to be deployed based on the azPolicy field;
if the current deployment copy number is greater than the initial copy number, calculating the redundant copy number, sequentially deploying the initial copy number of copies of the file to be deployed based on the azPolicy field, and deploying the redundant copy number of copies of the file to be deployed in the main machine room.
2. The method of claim 1, wherein the modifying the copy placement policy of the to-be-deployed file comprises adding a copy, the modifying the copy placement policy of the to-be-deployed file based on access data of the to-be-deployed file comprising:
Based on the access data of the files to be deployed, the client initiates a copy adding request, wherein the copy adding request comprises the copy number to be added;
responding to a copy adding request initiated by a client, and determining the copy number to be added by the main metadata node;
and modifying the azPolicy field of the current copy placement strategy according to the order from left to right based on the number of copies to be added and the number of the current deployment copies, so as to obtain a modified copy placement strategy.
3. The method of claim 1, wherein the modifying the copy placement policy of the to-be-deployed file comprises deleting a copy, the modifying the copy placement policy of the to-be-deployed file based on access data of the to-be-deployed file comprising:
based on the access data of the files to be deployed, a client initiates a copy deletion request, wherein the copy deletion request contains the number of copies to be deleted;
responding to a request of deleting copies initiated by a client, and determining the copy number to be deleted by the main metadata node;
and modifying the azPolicy field of the current copy placement strategy according to the order from right to left based on the copy number to be deleted and the current deployment copy number, so as to obtain a modified copy placement strategy.
4. The method according to claim 1, wherein the method further comprises: optimizing the cross-machine-room traffic generated in the computing node task flow scenario; the optimizing of the cross-machine-room traffic generated in the computing node task flow scenario specifically comprises the following steps:
responding to a calculation task request initiated by a client, and determining an original file matched with the calculation task request by the main metadata node; the calculation task request comprises reading an original file, writing an intermediate result, reading the intermediate result and writing a final result;
obtaining a copy placement strategy of the original file;
and modifying the copy placement strategy of the original file, so that an intermediate result generated after the client reads the original file is stored in the second machine room, and the generation of flow across the machine rooms is avoided.
5. The method of claim 4, wherein modifying the copy placement policy of the original file to store the intermediate result generated after the client reads the original file in the second machine room comprises:
adding a temporary file placement strategy, wherein the temporary file placement strategy comprises the mainAz, nativeWrite, azPolicy, disableRedundantDelete and disableBlockRecovery fields, and the nativeWrite field is set to true;
The temporary file placement strategy is used for storing an intermediate result generated after the client reads the original file.
6. The method of claim 5, wherein the first machine room is any one of a plurality of machine rooms of a distributed storage system, and the second machine room is a machine room where a client that initiates a write request is located, or a machine room where a client that initiates a compute task request is located.
7. A distributed storage copy placement apparatus across a machine room based on copy redundancy, the apparatus being applied to a distributed storage system, the apparatus comprising:
the node deployment module is used for deploying a plurality of metadata nodes into a first machine room in the distributed storage system and determining a main metadata node;
the copy setting module is used for setting a copy placement strategy for a file to be deployed, wherein the copy placement strategy comprises the mainAz, nativeWrite, azPolicy, disableRedundantDelete and disableBlockRecovery fields; the mainAz field is used for determining the main machine room, the nativeWrite field is used for determining whether a copy is written into a second machine room, the azPolicy field is used for determining copy distribution, the disableRedundantDelete field is used for temporarily disabling a redundant copy deletion mechanism under a certain directory, and the disableBlockRecovery field is used for temporarily disabling a lost copy recovery mechanism under the certain directory;
The copy placement module is used for responding to a write-in request initiated by the client, and the master metadata node completes the placement of the files to be deployed across the machine room according to the copy placement strategy of the files to be deployed;
the copy modification module is used for acquiring access data of the file to be deployed and modifying a copy placement strategy of the file to be deployed based on the access data of the file to be deployed;
the main metadata node completes the placement of the files to be deployed across machine rooms according to the copy placement strategy of the files to be deployed, and the method comprises the following steps:
obtaining a copy placement strategy and the current deployment copy number of a file to be deployed, and recording the copy placement strategy and the current deployment copy number in the main metadata node;
determining the initial copy number and the copy distribution position of the file to be deployed based on the azPolicy field, wherein the azPolicy field consists of machine room identifiers, the same machine room identifiers are adjacent, and the initial copy number is the number of the machine room identifiers;
determining a main machine room based on the mainAz field, wherein the main machine room is used for deploying redundant copies of the files to be deployed;
if the current deployment copy number is less than or equal to the initial copy number, sequentially deploying the copies of the file to be deployed based on the azPolicy field;
if the current deployment copy number is greater than the initial copy number, calculating the redundant copy number, sequentially deploying the initial copy number of copies of the file to be deployed based on the azPolicy field, and deploying the redundant copy number of copies of the file to be deployed in the main machine room.
8. An electronic device comprising a memory storing a computer program and a processor executing the distributed storage copy cross-machine room placement method based on copy redundancy of any one of claims 1 to 6 when the computer program is run.
9. A readable storage medium, characterized in that a computer program is stored in the readable storage medium, which computer program, when run on a processor, performs the distributed storage copy across machine room placement method based on copy redundancy of any one of claims 1 to 6.
CN202310225524.XA 2023-03-10 2023-03-10 Distributed storage copy cross-machine room placement method and device based on copy redundancy Active CN115955488B (en)


Publications (2)

Publication Number Publication Date
CN115955488A (en) 2023-04-11
CN115955488B (en) 2023-05-23





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant