CN102946323A - Realizing method for location awareness of compute node cabinet in HDFS (Hadoop Distributed File System) and realizing system thereof - Google Patents

Info

Publication number
CN102946323A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012104110497A
Other languages
Chinese (zh)
Inventor
马庆怀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Beijing Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-10-24
Filing date
2012-10-24
Publication date
2013-02-27

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method for implementing rack location awareness of compute nodes in HDFS (Hadoop Distributed File System) and to a system implementing it. The method comprises the following steps: A. starting the Hadoop distributed file system; B. checking the configuration option in the standard configuration file of the Hadoop distributed file system; C. linking the configuration option to the detection script file; D. obtaining the IP addresses of the compute nodes in the Hadoop compute cluster; E. judging whether a compute node belongs to the Hadoop compute cluster; F. judging whether the IP address has corresponding rack information; G. returning the corresponding rack information to the Hadoop cluster system; H. returning the default rack information of the Hadoop cluster system; and I. reporting that the Hadoop cluster system is in an abnormal state. The method overcomes two problems: the interconnection between rack switches becoming a bottleneck for data lookup and operations between nodes, and all replicas of the same data block possibly being stored in one rack, which makes the data safety of the system hard to guarantee when that rack loses power.

Description

Method for implementing rack location awareness of compute nodes in HDFS and system implementing it
Technical field
The present invention relates to the field of high-performance computing and clusters, and specifically to a method for implementing rack location awareness of compute nodes in HDFS.
Background
HDFS (Hadoop Distributed File System) clusters are generally large and are usually deployed across several or even dozens of racks. A rack typically connects through a layer-2 aggregation switch, so the bandwidth for data exchanged between switches is generally lower than the bandwidth for data exchanged within a switch, and network traffic between nodes in the same rack is usually more efficient than traffic between nodes in different racks. At the same time, the management node tries to place the replicas of a block in different racks to improve the fault tolerance of the system. Both techniques presuppose that the HDFS system knows which rack a node belongs to, that is, its rack ID. In other words, the system must be rack-aware.
At present there is no clear solution for giving the HDFS file system rack awareness.
If HDFS has no rack awareness, it runs into the following two problems:
1. The interconnection between rack switches becomes a bottleneck for data lookup and operations between nodes.
2. All replicas of the same data block may be in the same rack, so the safety of the system's data is hard to guarantee when that rack loses power.
Summary of the invention
To address the deficiencies of the prior art, the invention provides a method for implementing rack location awareness of compute nodes in HDFS. The method overcomes two problems: the interconnection between rack switches becoming a bottleneck for data lookup and operations between nodes, and all replicas of the same data block possibly being in the same rack, which makes the safety of the system's data hard to guarantee when that rack loses power.
The object of the invention is achieved by the following technical solution:
A method for implementing rack location awareness of compute nodes in HDFS, improved in that the method comprises the following steps:
A. Start the Hadoop distributed file system;
B. Check the configuration option in the standard configuration file of the Hadoop distributed file system;
C. Link the configuration option to the detection script file;
D. Obtain the IP address of a compute node in the Hadoop compute cluster;
E. Judge whether the compute node belongs to this Hadoop compute cluster;
F. Judge whether the IP address has corresponding rack information;
G. Return the corresponding rack information to the Hadoop cluster system;
H. Return the default rack information of the Hadoop cluster system;
I. The Hadoop cluster system is abnormal.
In step B, the standard configuration file is hadoop-default.xml.
In step D, the Hadoop compute cluster comprises at least one compute node and one management node. Each time a compute node is detected connecting to the management node, the IP address of that compute node is obtained and sent to the detection script file.
In step E, after the validity of the IP address has been verified, the IP address is compared with the information in the configuration option to judge whether the compute node belongs to this Hadoop compute cluster.
If the compute node is judged to belong to this Hadoop compute cluster, step F is carried out; otherwise step I is carried out.
In step F, if the IP address is judged to have corresponding rack information, step G is carried out; otherwise step H is carried out.
A mapping relationship exists between the compute nodes and the racks.
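Steps B and C can be illustrated with a minimal Python sketch, assuming the configuration file uses the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children. The option name `topology.script.file.name` comes from the detailed description; the script path in the sample and the function name `find_topology_script` are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample configuration. Only the option name is taken from the
# description; the script path is an invented placeholder.
SAMPLE_CONFIG = """<configuration>
  <property>
    <name>topology.script.file.name</name>
    <value>/etc/hadoop/rack_topology.py</value>
  </property>
</configuration>
"""


def find_topology_script(config_xml, option="topology.script.file.name"):
    """Step B: look up the option; return its value, or None if unset."""
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == option:
            value = (prop.findtext("value") or "").strip()
            return value or None
    return None


# Step C: the returned path identifies the detection script to link to.
script_path = find_topology_script(SAMPLE_CONFIG)
```

If the option is missing or empty the function returns `None`, which a caller could treat as "no detection script configured".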
Another object of the invention is to provide a system implementing rack location awareness of compute nodes in HDFS, improved in that the system comprises the following modules:
Start module: used to start the Hadoop distributed file system;
Check module: used to check the configuration option in the standard configuration file of the Hadoop distributed file system;
IP address acquisition module: used to obtain the IP addresses of the compute nodes in the Hadoop compute cluster;
Compute node judgment module: used to judge whether a compute node belongs to this Hadoop compute cluster;
IP address judgment module: used to judge whether the IP address has corresponding rack information.
Compared with the prior art, the invention achieves the following beneficial effects:
Once the method and system provided by the invention have given the HDFS file system rack awareness, there are two benefits:
1. Data replicas are placed in the same rack as far as possible, which keeps data lookup and operations fast and optimizes system performance.
2. The replicas of the same data block are never all placed in the same rack, so the system's data remains available when one rack loses power, which improves system security.
Description of drawings
Fig. 1 is a flow chart of the method provided by the invention for implementing rack location awareness of compute nodes in HDFS.
Detailed description
The specific embodiments of the invention are described in further detail below with reference to the accompanying drawing.
The flow of the method provided by the invention for implementing rack location awareness of compute nodes in HDFS is shown in Fig. 1 and comprises the following steps:
A. Start the Hadoop distributed file system;
B. Check the topology.script.file.name configuration option in hadoop-default.xml, the standard configuration file of the Hadoop distributed file system;
C. Link the configuration option to the detection script file;
D. Obtain the IP address of a compute node in the Hadoop compute cluster: the Hadoop compute cluster comprises a plurality of compute nodes and one management node. Each time a compute node is detected connecting to the management node, the IP address of that compute node is obtained and sent to the detection script file.
E. After the validity of the IP address has been verified, compare the IP address with the information in the configuration option and judge whether the compute node belongs to this Hadoop compute cluster: if it does, carry out step F; otherwise carry out step I.
F. Judge whether the IP address has corresponding rack information: if it does, carry out step G; otherwise carry out step H.
G. Return the corresponding rack information to the Hadoop cluster system;
H. Return the default rack information of the Hadoop cluster system;
I. The Hadoop cluster system is abnormal.
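The decision flow of steps D to I can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function name `resolve_rack`, the exception `ClusterAbnormalError`, the sample addresses, and the label `/default-rack` for the default rack are all hypothetical, and treating an invalid IP address as the abnormal case of step I is an assumption, since the text does not say what happens when validation fails.

```python
import ipaddress

DEFAULT_RACK = "/default-rack"  # assumed label for the default rack (step H)


class ClusterAbnormalError(Exception):
    """Step I: the node cannot be placed in this cluster."""


def resolve_rack(ip, cluster_ips, rack_by_ip, default_rack=DEFAULT_RACK):
    """Return the rack for a compute node IP, following steps E to I."""
    # Step E, first half: verify that the address is well formed.
    # Assumption: an invalid address is treated as the abnormal case.
    try:
        ipaddress.ip_address(ip)
    except ValueError:
        raise ClusterAbnormalError(f"invalid IP address: {ip!r}") from None
    # Step E, second half: does the node belong to this cluster?
    if ip not in cluster_ips:
        raise ClusterAbnormalError(f"{ip} is not a node of this cluster")
    # Step F: is there rack information for this address?
    if ip in rack_by_ip:
        return rack_by_ip[ip]  # step G: the corresponding rack information
    return default_rack  # step H: the default rack information


# Hypothetical cluster data for illustration only.
CLUSTER_IPS = {"10.0.1.11", "10.0.1.12", "10.0.2.21"}
RACK_BY_IP = {"10.0.1.11": "/dc1/rack1", "10.0.2.21": "/dc1/rack2"}
```

With this data, `resolve_rack("10.0.1.11", CLUSTER_IPS, RACK_BY_IP)` takes the step G branch, `10.0.1.12` falls through to the default rack of step H, and an address outside `CLUSTER_IPS` raises the step I error.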
Example
The following is an example of the contents of the topology configuration file that maps compute nodes to rack information:
Datanode1 /dc1/rack1
Datanode2 /dc1/rack1
Datanode3 /dc1/rack2
Here, Datanode denotes a compute node in the Hadoop system; dc is short for data center; rack denotes the rack information.
Each line of this file is one record: it states which rack of which data center a datanode belongs to.
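A file in this format can be read with a few lines of Python. The sketch below assumes each record is a node name and a rack path separated by whitespace, as in the example above; the function names `parse_topology` and `split_rack_path` are hypothetical, and skipping malformed lines is a choice made here for illustration.

```python
def parse_topology(text):
    """Map each node name to its rack path, one record per line."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        node, rack_path = parts
        mapping[node] = rack_path
    return mapping


def split_rack_path(rack_path):
    """Split a path such as "/dc1/rack1" into (data center, rack)."""
    datacenter, rack = rack_path.strip("/").split("/")
    return datacenter, rack


# The three records from the example topology file.
TOPOLOGY_TEXT = """Datanode1 /dc1/rack1
Datanode2 /dc1/rack1
Datanode3 /dc1/rack2
"""

topology = parse_topology(TOPOLOGY_TEXT)
```

Here `topology["Datanode3"]` is `/dc1/rack2`, and `split_rack_path` separates that path into the data center `dc1` and the rack `rack2`.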
In the method provided by the invention, the script file is invoked through the Java interface of the Hadoop distributed system and passes the rack information of the Datanode compute nodes to the Hadoop cluster system. The cluster thereby becomes aware of node locations, which optimizes system performance and improves system security.
Usually, data is backed up to keep it safe, so that no data is lost when a machine fails. In a Hadoop cluster system the most common practice is to keep two backups of the data. Ideally, one backup is placed in the same rack as the original data and the other is placed in a different rack. If a machine fails, the first choice is of course to look for its backup within the same rack, because the transfer is fast and the data does not have to pass through the switch (which avoids the switch "bottleneck" problem). The whole rack may also be damaged, in which case the backup data can be fetched from another rack. Deciding whether nodes are in the same rack requires identifying the locations of the compute nodes, and the method of the invention handles this well.
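The lookup preference described above (same rack first, another rack only as a fallback) can be sketched in Python as follows. This is an illustrative sketch, not Hadoop's own replica selection logic: the function name `pick_backup` and the node-to-rack mapping are hypothetical, and the caller is assumed to already know which replica holders are still alive.

```python
def pick_backup(failed_node, live_replicas, rack_of):
    """Choose where to fetch a backup after failed_node goes down.

    Prefers a live replica in the failed node's own rack, which avoids the
    inter-rack switch; falls back to any other live replica otherwise.
    """
    failed_rack = rack_of[failed_node]
    candidates = [n for n in live_replicas if n != failed_node]
    for node in candidates:
        if rack_of[node] == failed_rack:
            return node  # same rack: fast, no inter-rack traffic
    # Whole rack unavailable (or no same-rack copy): use another rack.
    return candidates[0] if candidates else None


# Hypothetical node-to-rack mapping for illustration.
RACK_OF = {
    "Datanode1": "/dc1/rack1",
    "Datanode2": "/dc1/rack1",
    "Datanode3": "/dc1/rack2",
}
```

If Datanode1 fails while Datanode2 and Datanode3 are alive, the sketch picks Datanode2 from the same rack. If all of rack1 is down and only Datanode3 is alive, it falls back to Datanode3 in rack2.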
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the invention and not to limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or equivalently replaced, and that any modification or equivalent replacement which does not depart from the spirit and scope of the invention shall be covered by the claims of the invention.

Claims (8)

1. A method for implementing rack location awareness of compute nodes in HDFS, characterized in that the method comprises the following steps:
A. starting the Hadoop distributed file system;
B. checking the configuration option in the standard configuration file of the Hadoop distributed file system;
C. linking the configuration option to the detection script file;
D. obtaining the IP address of a compute node in the Hadoop compute cluster;
E. judging whether the compute node belongs to this Hadoop compute cluster;
F. judging whether the IP address has corresponding rack information;
G. returning the corresponding rack information to the Hadoop cluster system;
H. returning the default rack information of the Hadoop cluster system;
I. the Hadoop cluster system is abnormal.
2. The method for implementing rack location awareness of compute nodes in HDFS according to claim 1, characterized in that, in step B, the standard configuration file is hadoop-default.xml.
3. The method for implementing rack location awareness of compute nodes in HDFS according to claim 1, characterized in that, in step D, the Hadoop compute cluster comprises at least one compute node and one management node, and each time a compute node is detected connecting to the management node, the IP address of that compute node is obtained and sent to the detection script file.
4. The method for implementing rack location awareness of compute nodes in HDFS according to claim 1, characterized in that, in step E, after the validity of the IP address has been verified, the IP address is compared with the information in the configuration option to judge whether the compute node belongs to this Hadoop compute cluster.
5. The method for implementing rack location awareness of compute nodes in HDFS according to claim 4, characterized in that, if the compute node is judged to belong to this Hadoop compute cluster, step F is carried out; otherwise step I is carried out.
6. The method for implementing rack location awareness of compute nodes in HDFS according to claim 1, characterized in that, in step F, if the IP address is judged to have corresponding rack information, step G is carried out; otherwise step H is carried out.
7. The method for implementing rack location awareness of compute nodes in HDFS according to any one of claims 1 to 6, characterized in that a mapping relationship exists between the compute nodes and the racks.
8. A system implementing rack location awareness of compute nodes in HDFS, characterized in that the system comprises the following modules:
Start module: used to start the Hadoop distributed file system;
Check module: used to check the configuration option in the standard configuration file of the Hadoop distributed file system;
IP address acquisition module: used to obtain the IP addresses of the compute nodes in the Hadoop compute cluster;
Compute node judgment module: used to judge whether a compute node belongs to this Hadoop compute cluster;
IP address judgment module: used to judge whether the IP address has corresponding rack information.
CN2012104110497A 2012-10-24 2012-10-24 Realizing method for location awareness of compute node cabinet in HDFS (Hadoop Distributed File System) and realizing system thereof Pending CN102946323A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012104110497A CN102946323A (en) 2012-10-24 2012-10-24 Realizing method for location awareness of compute node cabinet in HDFS (Hadoop Distributed File System) and realizing system thereof

Publications (1)

Publication Number Publication Date
CN102946323A true CN102946323A (en) 2013-02-27

Family

ID=47729232

Country Status (1)

Country Link
CN (1) CN102946323A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101901275A (en) * 2010-08-23 2010-12-01 华中科技大学 Distributed storage system and method thereof
CN102196049A (en) * 2011-05-31 2011-09-21 北京大学 Method suitable for secure migration of data in storage cloud
US20120236761A1 (en) * 2011-03-15 2012-09-20 Futurewei Technologies, Inc. Systems and Methods for Automatic Rack Detection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TOM WHITE: "《Hadoop: The Definitive Guide, Third Edition》", 7 May 2012, O’REILLY MEDIA *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104615606A (en) * 2013-11-05 2015-05-13 阿里巴巴集团控股有限公司 Hadoop distributed file system and management method thereof
CN104615606B (en) * 2013-11-05 2018-04-06 阿里巴巴集团控股有限公司 A kind of Hadoop distributed file systems and its management method
CN103561033A (en) * 2013-11-08 2014-02-05 西安电子科技大学宁波信息技术研究院 Device and method for user to have remote access to HDFS cluster
CN103561033B (en) * 2013-11-08 2016-11-02 西安电子科技大学宁波信息技术研究院 User remotely accesses the device and method of HDFS cluster
CN105592178A (en) * 2015-09-17 2016-05-18 杭州华三通信技术有限公司 Method and device for determining position of data node
CN105592178B (en) * 2015-09-17 2018-12-25 新华三技术有限公司 A kind of back end method for determining position and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20130227

RJ01 Rejection of invention patent application after publication