CN107463582A

CN107463582A - Method and device for distributed deployment of Hadoop clusters

Info

Publication number: CN107463582A
Application number: CN201610395969.2A
Authority: CN
Inventors: 高林林
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2016-06-03
Filing date: 2016-06-03
Publication date: 2017-12-12
Anticipated expiration: 2036-06-03
Also published as: CN107463582B; WO2017206667A1

Abstract

The invention provides a method and device for distributed deployment of Hadoop clusters. The method includes: receiving template information for deploying a Hadoop cluster, where the template information indicates task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster needs to complete; collecting, according to the host information, parameter information of one or more hosts of the Hadoop cluster, where each host is used to deploy one or more components, and each component is deployed by an agent and performs a corresponding task; and deploying tasks to the one or more components according to the task information and the parameter information. The invention solves the problem in the related art that manually deploying a Hadoop cluster is operationally complex and takes a long time.

Description

Method and device for distributed deployment of Hadoop clusters

Technical field

The present invention relates to the field of communications, and in particular to a method and device for distributed deployment of Hadoop clusters.

Background

Hadoop, in the related art, is a distributed system architecture developed by the Apache Foundation. "Hadoop" is not an abbreviation but a made-up name, reportedly related to the name of a toy belonging to the project creator's child, and it has no actual meaning. Hadoop is a software platform and open-source software framework for developing and running applications that process large-scale data. It performs distributed computation over massive data on clusters made up of large numbers of computers. Users can develop distributed programs without understanding the low-level details of distribution, and make full use of the cluster's capacity for high-speed computation and storage.

In the related art, distributed deployment of a Hadoop cluster requires administrators to understand the Hadoop ecosystem and the hardware resources of every host in the cluster. This places high demands on the administrators who deploy Hadoop clusters, and errors are easy to make. Configuring a Hadoop cluster manually involves many tedious steps and is inefficient; in large-scale Hadoop cluster environments in particular, elastic management such as dynamic scale-out and scale-in is difficult.

However, current systems for automated Hadoop deployment have the following problems:

Before a Hadoop cluster is deployed, its network topology is designed according to the software and hardware information of the cluster environment and the components to be deployed. This approach places high demands on cluster administrators, who must be familiar with both the environment's software and hardware information and the Hadoop ecosystem. Without administrator intervention, the automated deployment system assigns nodes such as Master and Slave arbitrarily and cannot allocate them sensibly using the cluster's hardware and system load information;

Hadoop cluster component version packages are loaded from a single source, which, among other drawbacks, makes the Hadoop cluster deployment time uncontrollable.

That is: 1. Hadoop cluster deployment places high demands on operations staff, who must be familiar with the Hadoop ecosystem, understand the resource information of each node in the cluster, and design the Hadoop cluster network topology; 2. Hadoop cluster component nodes are assigned arbitrarily; 3. Hadoop cluster deployment takes a long time.

No effective solution to the above problems in the related art has yet been found.

Summary of the invention

Embodiments of the present invention provide a method and device for distributed deployment of Hadoop clusters, to at least solve the problem in the related art that manually deploying a Hadoop cluster is operationally complex and takes a long time.

According to one embodiment of the present invention, a method for distributed deployment of Hadoop clusters is provided, including: receiving template information for deploying a Hadoop cluster, where the template information indicates task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster needs to complete; collecting, according to the host information, parameter information of one or more hosts of the Hadoop cluster, where each host is used to deploy one or more components, and each component is deployed by an agent and performs a corresponding task; and deploying tasks to the one or more components according to the task information and the parameter information.

Optionally, the parameter information includes at least one of: host operating system information, host network information, host CPU information, host memory information, host CPU utilization, host memory utilization, host disk I/O utilization, host network latency, host average I/O wait time, host disk information, and process information of the components on the host.

Optionally, deploying tasks to one or more components in the Hadoop cluster according to the task information and the parameter information includes: generating a deployment task list according to the task information and the parameter information, where the deployment task list includes the task information, the parameter information required to perform each task, and the priority of each task; and selecting the highest-priority task from the deployment task list and issuing it to the corresponding component.

Optionally, the priority is related to the attributes of the task and/or the parameter information for performing the task.

Optionally, after tasks are deployed to the one or more components according to the template information and the parameter information, the method further includes: monitoring the task execution progress and/or log information of the one or more components.

Optionally, the template information includes at least one of: the number of Hadoop cluster systems, information on the Hadoop cluster components to be deployed, the number of Hadoop Distributed File System (HDFS) replicas, the client connection count and timeout of each Hadoop cluster component, OC NCV ambda, host user name and password, log disk information, data storage disk information, and metadata storage disk information.

Optionally, after the template information for deploying the Hadoop cluster is received, the method further includes: parsing the template information and verifying its validity.

According to another embodiment of the present invention, a device for distributed deployment of Hadoop clusters is provided, including: a receiving module, configured to receive template information for deploying a Hadoop cluster, where the template information indicates task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster needs to complete; an acquisition module, configured to collect, according to the host information, parameter information of one or more hosts of the Hadoop cluster, where each host includes one or more components, and each component is deployed by an agent and performs a corresponding task; and a deployment module, configured to deploy tasks to the one or more components according to the task information and the parameter information.

Optionally, the deployment module further includes: a generation unit, configured to generate a deployment task list according to the task information and the parameter information, where the deployment task list includes the task information, the parameter information required to perform each task, and the priority of each task; and a selection unit, configured to select the highest-priority task from the deployment task list and issue it to the corresponding component.

Optionally, the device further includes: a monitoring module, configured to monitor the task execution progress and/or log information of the one or more components after the deployment module deploys tasks to the one or more components according to the template information and the parameter information.

According to yet another embodiment of the present invention, a storage medium is also provided. The storage medium is configured to store program code for performing the following steps:

receiving template information for deploying a Hadoop cluster, where the template information indicates task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster needs to complete;

collecting, according to the host information, parameter information of one or more hosts of the Hadoop cluster, where each host includes one or more components, and each component performs a corresponding task;

deploying tasks to the one or more components according to the task information and the parameter information.

With the present invention, template information for deploying a Hadoop cluster is received, where the template information indicates task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster needs to complete; parameter information of one or more hosts of the Hadoop cluster is collected according to the host information, where each host is used to deploy one or more components, and each component is deployed by an agent and performs a corresponding task; and tasks are deployed to the one or more components according to the task information and the parameter information. Because the task information and host information are received, and the load of the hosts and components is obtained from the collected parameter information, tasks can be deployed sensibly to each host and component of the Hadoop cluster. This solves the problem in the related art that manually deploying a Hadoop cluster is operationally complex and takes a long time.

Brief description of the drawings

The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:

Fig. 1 is the overall structural framework diagram of distributed deployment of Hadoop clusters according to an embodiment of the present invention;

Fig. 2 is a flowchart of the method for distributed deployment of Hadoop clusters according to an embodiment of the present invention;

Fig. 3 is a structural block diagram of the device for distributed deployment of Hadoop clusters according to an embodiment of the present invention;

Fig. 4 is the first optional structural block diagram of the device for distributed deployment of Hadoop clusters according to an embodiment of the present invention;

Fig. 5 is the second optional structural block diagram of the device for distributed deployment of Hadoop clusters according to an embodiment of the present invention;

Fig. 6 is the structural framework diagram of the agent in the distributed Hadoop cluster deployment system of this embodiment;

Fig. 7 shows the agent deployment flow in the initial state of this embodiment;

Fig. 8 is a flowchart of the Hadoop cluster deployment method of this embodiment;

Fig. 9 is a sequence diagram of the Hadoop cluster deployment method of this embodiment.

Detailed description

The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments can be combined with one another.

It should be noted that the terms "first", "second", and so on in the description, the claims, and the above drawings are used to distinguish similar objects, not to describe a specific order or sequence.

Embodiment 1

The embodiments of this application can run in the network architecture shown in Fig. 1, which is the overall structural framework diagram of distributed deployment of Hadoop clusters according to an embodiment of the present invention. As shown in Fig. 1, the network architecture includes a management system for deploying Hadoop clusters and a Hadoop cluster. The management system includes the functional modules and an executing agent node; the Hadoop cluster includes multiple agent nodes that execute tasks in a distributed manner; and the deployment system and the Hadoop cluster are communicatively connected.

This embodiment provides a method for distributed deployment of Hadoop clusters that runs on the above management system. Fig. 2 is a flowchart of the method according to an embodiment of the present invention. As shown in Fig. 2, the flow includes the following steps:

Step S202: receive template information for deploying a Hadoop cluster, where the template information indicates task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster needs to complete;

Step S204: collect, according to the host information, parameter information of one or more hosts of the Hadoop cluster, where each host is used to deploy one or more components, and each component is deployed by an agent and performs a corresponding task. Optionally, the deployment tasks are performed by the agents.

Step S206: deploy tasks to the one or more components according to the task information and the parameter information.

Through the above steps, the task information and host information are received, and the load of the hosts and components is obtained from the collected parameter information. Tasks can therefore be deployed sensibly to each host and component of the Hadoop cluster, which solves the problem in the related art that manually deploying a Hadoop cluster is operationally complex and takes a long time.

Optionally, the above steps may be executed by a control terminal or client of the Hadoop cluster, among others, but are not limited to these.

Optionally, the parameter information can be, but is not limited to: host operating system information, host network information, host CPU information (such as core count and clock frequency), host memory information, host CPU utilization, host memory utilization, host disk I/O utilization, host network latency, host average I/O wait time, host disk information, and process information of the components on the host.

Optionally, the template information can be, but is not limited to: the number of Hadoop cluster systems, information on the Hadoop cluster components to be deployed, the number of Hadoop Distributed File System (HDFS) replicas, the client connection count and timeout of each Hadoop cluster component, OC NCV ambda, host user name and password, log disk information, data storage disk information, and metadata storage disk information.

In an optional implementation of this embodiment, deploying tasks to one or more components in the Hadoop cluster according to the task information and the parameter information includes:

S11: generate a deployment task list according to the task information and the parameter information, where the deployment task list includes the task information, the parameter information required to perform each task, and the priority of each task;

S12: select the highest-priority task from the deployment task list and issue it to the corresponding component. Optionally, the priority is related to the attributes of the task and/or the parameter information for performing the task.
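Steps S11 and S12 can be sketched with a priority queue. This is a minimal illustration rather than the patented implementation: the names `DeployTask` and `DeployTaskList`, the integer priority scale, and the first-in-first-out tie-break are all assumptions introduced here.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DeployTask:
    """One entry of the deployment task list (field names are hypothetical)."""
    component: str   # component the task targets, e.g. "HDFS NameNode"
    host: str        # host whose agent should run the task
    priority: int    # larger value means more urgent (assumed convention)


class DeployTaskList:
    """Deployment task list that always yields the highest-priority task first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, DeployTask]] = []
        self._seq = itertools.count()  # tie-breaker: equal priorities keep insertion order

    def add(self, task: DeployTask) -> None:
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-task.priority, next(self._seq), task))

    def pop_highest(self) -> Optional[DeployTask]:
        """Return and remove the highest-priority task, or None when the list is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

A dispatcher loop would call `pop_highest()` repeatedly and hand each task to the agent on `task.host`.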

Optionally, after tasks are deployed to the one or more components according to the template information and the parameter information, the method further includes:

monitoring the task execution progress and/or log information of the one or more components.

Optionally, after the template information for deploying the Hadoop cluster is received, the method further includes: parsing the template information and verifying its validity. The subsequent steps are performed only if the template information is valid. A valid deployment template includes at least, but is not limited to, the following: the number of Hadoop cluster nodes, information on the Hadoop cluster components to be deployed, the number of HDFS replicas, the client connection count and timeout of each Hadoop cluster component, OC NCV ambda, user name and password, log storage disk, data storage disk, and metadata storage disk.
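The parse-then-verify step can be sketched as follows. The patent does not specify a template format, so JSON is assumed here purely for illustration, and every key name in `REQUIRED_FIELDS` is hypothetical; the sketch covers only a subset of the listed template items.

```python
import json
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical key names: the source lists the required items but not how they are spelled.
REQUIRED_FIELDS = {
    "node_count": int,          # number of Hadoop cluster nodes
    "components": list,         # Hadoop cluster components to deploy
    "hdfs_replicas": int,       # number of HDFS replicas
    "client_connections": int,  # client connection count per component
    "client_timeout_s": int,    # client timeout
    "username": str,
    "password": str,
    "log_disk": str,
    "data_disk": str,
    "metadata_disk": str,
}


def validate_template(template: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the template is valid."""
    errors: List[str] = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in template:
            errors.append(f"missing field: {key}")
        elif not isinstance(template[key], expected):
            errors.append(f"{key} must be of type {expected.__name__}")
    if errors:
        return errors
    # Assumed sanity checks, not taken from the source.
    if template["node_count"] < 1:
        errors.append("node_count must be at least 1")
    if template["hdfs_replicas"] < 1:
        errors.append("hdfs_replicas must be at least 1")
    if not template["components"]:
        errors.append("components must not be empty")
    return errors


def parse_and_validate(text: str) -> Tuple[Optional[Dict[str, Any]], List[str]]:
    """Parse the template text, then validate it. Returns (template, errors)."""
    try:
        template = json.loads(text)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc.msg}"]
    if not isinstance(template, dict):
        return None, ["template must be a JSON object"]
    errors = validate_template(template)
    return (None, errors) if errors else (template, [])
```

Deployment would proceed only when the returned error list is empty, matching the rule that later steps run only for a valid template.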

From the description of the above implementations, those skilled in the art can clearly understand that the method of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, computer, server, network device, or the like) to perform the methods of the embodiments of the present invention.

Embodiment 2

This embodiment also provides a device for distributed deployment of Hadoop clusters. The device implements the above embodiments and preferred implementations; what has already been explained is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

Fig. 3 is a structural block diagram of the device for distributed deployment of Hadoop clusters according to an embodiment of the present invention. As shown in Fig. 3, the device includes:

a receiving module 30, configured to receive template information for deploying a Hadoop cluster, where the template information indicates task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster needs to complete;

an acquisition module 32, configured to collect, according to the host information, parameter information of one or more hosts of the Hadoop cluster, where each host includes one or more components, and each component is deployed by an agent and performs a corresponding task;

a deployment module 34, configured to deploy tasks to the one or more components according to the task information and the parameter information.

Fig. 4 is the first optional structural block diagram of the device for distributed deployment of Hadoop clusters according to an embodiment of the present invention. As shown in Fig. 4, in addition to all the modules shown in Fig. 3, the deployment module 34 of the device further includes:

a generation unit 40, configured to generate a deployment task list according to the task information and the parameter information, where the deployment task list includes the task information, the parameter information required to perform each task, and the priority of each task;

a selection unit 42, configured to select the highest-priority task from the deployment task list and issue it to the corresponding component.

Fig. 5 is the second optional structural block diagram of the device for distributed deployment of Hadoop clusters according to an embodiment of the present invention. As shown in Fig. 5, in addition to all the modules shown in Fig. 3, the device further includes: a monitoring module 50, configured to monitor the task execution progress and/or log information of the one or more components after the deployment module deploys tasks to the one or more components according to the template information and the parameter information.

It should be noted that each of the above modules can be implemented by software or hardware. For the latter, this can be done in the following ways, but is not limited to them: the modules are all located in the same processor; or the modules are located in different processors in any combination.

Embodiment 3

This embodiment is an optional embodiment of the present invention and is used to explain and illustrate the application in specific detail:

This embodiment provides a method and system for distributed deployment of Hadoop clusters. It overcomes drawbacks such as the high demands placed on the administrators who deploy Hadoop clusters, the arbitrary assignment of Hadoop cluster component nodes, and the single loading source for installation packages. The invention makes full use of the hardware resources and load of each host in the cluster to achieve one-click distributed deployment of Hadoop clusters.

The distributed Hadoop cluster deployment system of this embodiment includes the following components, with the architecture shown in Fig. 1:

Template parser: the deployment template includes, but is not limited to: OC NCV ambda, user name, password, Hadoop component information, node count information, and mounted disk information. The template parser parses the template information entered by the user and checks its validity.

Monitor: responsible for receiving the Hadoop component deployment task execution status sent by the agents and for log processing.

Collector: responsible for receiving the host information sent by the agents (including but not limited to: operating system information, CPU information, memory information, network information, CPU utilization, memory utilization, disk I/O utilization, and network latency) and persisting it.

Task generator: generates the Hadoop component deployment task list according to the host information gathered by the collector and the deployment template information.

Task dispatcher: selects high-priority deployment tasks and issues them to agents, according to the host information gathered by the collector, the host load, and the deployment task list.

Agent: an agent includes components such as a collector, a deployer, a parameter configurator, and a monitor. The collector periodically collects host information and sends it to the system's collector; the deployer receives and executes the tasks issued by the task dispatcher; the parameter configurator configures the configuration files of each Hadoop component; the monitor monitors deployment task execution and collects logs. Fig. 6 is the structural framework diagram of the agent in the distributed Hadoop cluster deployment system of this embodiment.

Fig. 7 shows the agent deployment flow in the initial state of this embodiment. As shown in Fig. 7, the distributed Hadoop cluster deployment method of this embodiment includes the following:

Initialize the deployment system

When the system starts, the monitor, collector, and agent of the distributed Hadoop cluster deployment system are initialized, ready to receive the deployment template submitted by the user.

Deploy agents

The agent deployment task is generated by the task generator and scheduled for execution by the task dispatcher. After an agent has been deployed, its collector periodically collects node resource information and reports it back to the management system.

User submits the Hadoop cluster deployment template

The user fills in the information for the Hadoop cluster to be deployed, as required by the deployment template, and submits the template.

Parse the Hadoop cluster deployment template

The monitor of the distributed Hadoop cluster deployment system receives the deployment template submitted by the user; the parser parses the Hadoop cluster deployment template and verifies its validity.

The topology generator generates the Hadoop cluster network topology according to the deployment template submitted by the user and the resource information.
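As a rough illustration of how a topology generator might turn resource information into role assignments, the sketch below ranks hosts and picks Master, Slave, and ZooKeeper nodes. The source gives no concrete rules, so the ranking key, the `zookeeper_count` thresholds, and all names (`HostInfo`, `build_topology`) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HostInfo:
    """Resource snapshot of one host (field names are hypothetical)."""
    name: str
    cpu_cores: int
    memory_gb: int
    load: float  # current load in [0, 1]; lower is better


def zookeeper_count(node_count: int) -> int:
    """Assumed rule: an odd-sized ensemble of at most 5, never larger than the cluster."""
    if node_count >= 5:
        return 5
    if node_count >= 3:
        return 3
    return 1


def build_topology(hosts: List[HostInfo], master_count: int = 1) -> Dict[str, List[str]]:
    """Assign roles: the best-resourced, least-loaded hosts become masters."""
    if not hosts:
        raise ValueError("at least one host is required")
    # Rank by memory, then cores (both descending), then by load (ascending).
    ranked = sorted(hosts, key=lambda h: (-h.memory_gb, -h.cpu_cores, h.load))
    masters = ranked[:master_count]
    slaves = ranked[master_count:] or masters  # a single-host cluster plays both roles
    ensemble = ranked[:zookeeper_count(len(hosts))]
    return {
        "master": [h.name for h in masters],
        "slave": [h.name for h in slaves],
        "zookeeper": [h.name for h in ensemble],
    }
```

A real generator would also take the deployment template into account, for example the list of components to deploy and the replica count.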

Generate Hadoop cluster component deployment tasks

The task generator generates component deployment tasks according to the structure of the Hadoop cluster network topology.

Task dispatcher performs deployment tasks

The task dispatcher takes the pending deployment tasks from the task list, together with the resource information of each node, and generates the pending task sequence. It then takes out high-priority deployment tasks in turn and issues them to the corresponding agents.
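One way to build such a task sequence is to order pending tasks by priority and, among equal priorities, prefer the task whose target host is least loaded. The sketch below assumes this rule; the weights in `load_score`, the 100 ms latency cap, and the names `HostLoad`, `Task`, and `build_sequence` are illustrative choices, not values from the source.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HostLoad:
    """Load indicators of one host, each utilization normalised to [0, 1]."""
    avg_load: float
    mem_util: float
    disk_io_util: float
    net_delay_ms: float


@dataclass
class Task:
    name: str
    host: str
    priority: int       # larger value means more urgent (assumed convention)
    done: bool = False  # tasks already executed are skipped


def load_score(load: HostLoad, max_delay_ms: float = 100.0) -> float:
    """Combine the indicators into one score; lower means a less busy host."""
    delay = min(load.net_delay_ms / max_delay_ms, 1.0)
    # Hypothetical weights chosen only for illustration.
    return 0.4 * load.avg_load + 0.3 * load.mem_util + 0.2 * load.disk_io_util + 0.1 * delay


def build_sequence(tasks: List[Task], loads: Dict[str, HostLoad]) -> List[Task]:
    """Return pending tasks, highest priority first, least-loaded host first on ties."""
    pending = [t for t in tasks if not t.done]
    return sorted(pending, key=lambda t: (-t.priority, load_score(loads[t.host])))
```

Re-running `build_sequence` after each progress report lets the order follow the latest load data.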

Perform deployment tasks

After a host's agent receives a deployment task, its deployer executes the task. The agent's monitor reports the execution progress to the deployment system's monitor in real time, and that monitor notifies the task dispatcher to continue scheduling tasks. The step "Task dispatcher performs deployment tasks" is repeated until all tasks that need deploying have been executed.

This embodiment assigns the nodes of Hadoop cluster components sensibly, according to the characteristics of each Hadoop cluster component and the cluster's resources. During deployment, deployment tasks are allocated dynamically according to the collected host load, achieving one-click distributed deployment of Hadoop clusters. The invention effectively addresses the drawbacks of deploying large-scale Hadoop clusters: complexity, long deployment time, and heavy pressure on the deployment system.

Fig. 8 is a flowchart of the Hadoop cluster deployment method of this embodiment, and Fig. 9 is its sequence diagram. With reference to Fig. 8 and Fig. 9, this embodiment includes:

System initialization: when the distributed Hadoop cluster deployment system starts, it must be initialized, which includes initializing the monitor, the collector, agent A1, and so on.

Agent deployment: first, agent A1 performs the task of deploying agent A2; once A2 is deployed, it is initialized and started. Agents A1 and A2 then perform the tasks of deploying agents A3 and A4, and so on, until agents have been deployed on all hosts in the cluster (see Fig. 7).
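The cascading agent rollout described above can be simulated in a few lines. This is a sketch under the idealised assumption that every running agent deploys a fixed number of new agents per wave; the function name `bootstrap_waves` and the `fan_out` parameter are introduced here for illustration.

```python
from typing import List


def bootstrap_waves(hosts: List[str], fan_out: int = 1) -> List[List[str]]:
    """Simulate the rollout: hosts[0] already runs the first agent.

    In each wave, every agent that is already running deploys `fan_out`
    new agents. Returns the hosts that receive an agent in each wave.
    """
    if fan_out < 1:
        raise ValueError("fan_out must be at least 1")
    ready = hosts[:1]      # agents that are already running
    remaining = hosts[1:]  # hosts still waiting for an agent
    waves: List[List[str]] = []
    while remaining:
        capacity = len(ready) * fan_out
        wave, remaining = remaining[:capacity], remaining[capacity:]
        waves.append(wave)
        ready = ready + wave
    return waves
```

With `fan_out=1` and eight hosts A1 to A8 the waves are A2, then A3 and A4, then A5 to A8, so the number of running agents doubles with each wave.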

101. User submits the deployment template: after the distributed Hadoop cluster deployment system has been initialized, the user can submit a conforming deployment template to the system. A valid deployment template includes at least, but is not limited to, the following: the number of Hadoop cluster nodes, information on the Hadoop cluster components to be deployed, the number of HDFS replicas, the client connection count and timeout of each Hadoop cluster component, OC NCV ambda, user name and password, log storage disk, data storage disk, and metadata storage disk.

102. On receiving the deployment template information, the template parser first verifies the template's validity. If the template does not meet the agreed requirements, deployment ends. If the template is valid, it is parsed, and the topology generator generates the Hadoop cluster networking topology.

103. The topology generator generates the Hadoop cluster networking topology (see S1) according to node resources, the arrangement principles for each Hadoop cluster component, and the deployment template information. The arrangement principles include, but are not limited to, the following: 1. assign the Master and Slave nodes of Hadoop components according to hardware resources and host load; 2. calculate and assign the number of ZooKeeper nodes according to the number of nodes in the cluster; 3. calculate and assign the number of JournalNode nodes according to the number of HDFS nodes. A Hadoop component deployment task includes, but is not limited to, the following information: component name (e.g. HDFS), node name (e.g. NameNode), OC NCV ambda, task priority, and so on.

104. The topology generated by the topology generator is stored.

105. The deployment task generator generates deployment tasks according to the Hadoop cluster networking topology.

106. The deployment task list generated by the deployment task generator is stored.

107. The task dispatcher scans the deployment task list, takes out the deployment tasks that have not yet been executed, calculates the load of each host in the cluster from the node resource information (mainly average load, memory utilization, disk I/O utilization, and network latency), and generates a deployment task sequence ordered by priority (see S4).

108. The task dispatcher selects high-priority deployment tasks in turn and issues them to the agents of the corresponding hosts. When Hadoop component deployment is performed for the first time, agent A1 deploys one Hadoop cluster component deployment task to agent A2; agent A1's monitor monitors the task's execution and reports it to the deployment system's monitor (see S10). After the monitor receives the execution result, the task dispatcher regenerates the task sequence from the task list and resource information (see S5), selects high-priority tasks T3 and T4, and has agents A1 and A2 deploy tasks to agents A3 and A4, and so on (see S11 to S14). Ideally, at the t-th moment (t > 0), $2^{t-1}$ agents across the cluster are executing Hadoop component deployment tasks. Each agent can also start multiple threads and send Hadoop component deployment tasks to several (e.g. 2) agents; in that case, ideally, at the t-th moment (t > 0), $3^{t-1}$ agents across the Hadoop cluster are executing Hadoop component deployment tasks.

109. Proxy server A1, which is co-located with the management system for distributed deployment of the Hadoop cluster.

110. The proxy servers deployed on each host node in the Hadoop cluster.
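The fan-out described in item 108 can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the implementation of the invention: the names `DeployTask` and `simulate_rollout` are hypothetical, a lower `priority` number is assumed to mean a more urgent task, and each round is assumed to take one time step.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class DeployTask:
    # Hypothetical task record: only the priority takes part in ordering.
    priority: int
    host: str = field(compare=False)


def simulate_rollout(tasks, fanout=1):
    """Return one list of (deployer, target_host) pairs per round.

    Assumption: every proxy server that is already deployed can push
    `fanout` deployment tasks per round, starting from the seed "A1".
    """
    heap = list(tasks)
    heapq.heapify(heap)          # smallest priority value is served first
    agents = ["A1"]              # proxy servers able to deploy others
    rounds = []
    while heap:
        batch = []
        for agent in list(agents):
            for _ in range(fanout):
                if not heap:
                    break
                task = heapq.heappop(heap)
                batch.append((agent, task.host))
        # Hosts deployed in this round can act as deployers in the next one.
        agents.extend(host for _, host in batch)
        rounds.append(batch)
    return rounds


if __name__ == "__main__":
    pending = [DeployTask(priority=i, host=f"H{i}") for i in range(7)]
    for t, batch in enumerate(simulate_rollout(pending, fanout=1), start=1):
        deployers = {d for d, _ in batch}
        print(f"round {t}: {len(deployers)} proxy servers deploying")
```

With 7 pending tasks and `fanout=1`, the rounds use 1, 2, and 4 deploying proxy servers, which matches 2^(t-1); with `fanout=2` and enough pending tasks the counts are 1, 3, and 9, which matches 3^(t-1).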

Configuration generation: parameter configuration tasks generate the configuration of each component of the Hadoop cluster. The scheduler needs to collect the deployment information of every component in the whole Hadoop cluster (such as the host names of the nodes where the Master and Slave are located, the log storage disks, the data storage disks, and the metadata storage disks) and issue it, together with the parameter configuration task, to the parameter configurator in the proxy server component of each host. Once all parameter configuration tasks in the cluster have been executed, the deployment of every component of the Hadoop cluster is complete.
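As an illustration of what a parameter configurator might do with the collected deployment information, the following sketch renders a few HDFS properties into the Hadoop XML configuration format. The function names and the keys of the `info` dictionary are hypothetical; the text does not say which properties are generated, so the three standard HDFS property names used here are only an assumed example.

```python
import xml.etree.ElementTree as ET


def build_hdfs_site(info):
    """Map collected deployment information to HDFS properties.

    Assumption: `info` carries the replica count and the disk lists
    gathered by the scheduler; the key names are illustrative only.
    """
    return {
        "dfs.replication": info["hdfs_replicas"],
        "dfs.namenode.name.dir": ",".join(info["metadata_disks"]),
        "dfs.datanode.data.dir": ",".join(info["data_disks"]),
    }


def render_hadoop_xml(props):
    """Serialize a property dictionary as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    collected = {
        "hdfs_replicas": 3,
        "metadata_disks": ["/meta1"],
        "data_disks": ["/data1", "/data2"],
    }
    print(render_hadoop_xml(build_hdfs_site(collected)))
```

Keeping the mapping (`build_hdfs_site`) separate from the serialization (`render_hadoop_xml`) means the same renderer could be reused for other components whose configuration uses the same XML layout.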

201. The collector in the proxy server component periodically collects the hardware resource and running state information of its own host and reports it to the collector in the deployment system, which stores it as node resources. The hardware resource and running state information includes, but is not limited to, the following: operating system information, host name, CPU information, memory information, disk information, process information, CPU utilization, memory utilization, disk I/O utilization, network information, average I/O operation waiting time, and so on.

202. The monitoring store, which holds the resource information of each node (including host information and Hadoop component information) collected by the monitor and the collector.
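A host-side collector of the kind described in item 201 could gather a subset of these values with the Python standard library alone. The sketch below is an assumption-laden illustration: the function name `collect_host_info` and the field names of the report are hypothetical, only a few of the listed metrics are covered, and the reporting transport to the deployment system is left out.

```python
import json
import os
import platform
import shutil
import socket
import time


def collect_host_info(disk_path="/"):
    """Collect a small snapshot of hardware and running state information."""
    usage = shutil.disk_usage(disk_path)
    info = {
        "timestamp": time.time(),
        "hostname": socket.gethostname(),
        "os": platform.platform(),
        "cpu_count": os.cpu_count(),
        "disk_total_bytes": usage.total,
        "disk_used_ratio": usage.used / usage.total,
    }
    # The load average is only available on Unix-like systems.
    if hasattr(os, "getloadavg"):
        try:
            info["load_avg_1m"] = os.getloadavg()[0]
        except OSError:
            pass
    return info


if __name__ == "__main__":
    # A real collector would run this on a timer and send the report
    # to the deployment system; here the snapshot is only printed.
    print(json.dumps(collect_host_info(), indent=2))
```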

Embodiment 4

An embodiment of the present invention further provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program code for performing the following steps:

S1: receive template information for deploying a Hadoop cluster, where the template information indicates the task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster is required to complete;

S2: collect, according to the host information, parameter information of one or more hosts of the Hadoop cluster, where each host is used to deploy one or more components, and the components are deployed by a proxy server and are used to perform the corresponding tasks;

S3: deploy tasks to the one or more components according to the task information and the parameter information.
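Steps S1 to S3 can be pictured as a three-stage pipeline. The following is a minimal sketch under stated assumptions rather than the program code of the embodiment: the JSON template format, the field names (`tasks`, `hosts`, `component`, `priority`, `load`), the rule that a lower `priority` number is more urgent, and the rule that the least loaded host receives the most urgent task are all hypothetical choices made for illustration.

```python
import json

REQUIRED_FIELDS = ("tasks", "hosts")  # assumed minimal template schema


def receive_template(text):
    """S1: parse the template and check that the assumed fields exist."""
    template = json.loads(text)
    missing = [k for k in REQUIRED_FIELDS if k not in template]
    if missing:
        raise ValueError(f"template is missing fields: {missing}")
    return template


def collect_parameters(hosts, probe):
    """S2: gather parameter information per host via a caller-supplied probe."""
    return {host: probe(host) for host in hosts}


def deploy_tasks(template, params):
    """S3: pair tasks (most urgent first) with hosts (least loaded first)."""
    hosts_by_load = sorted(params, key=lambda h: params[h]["load"])
    tasks = sorted(template["tasks"], key=lambda t: t["priority"])
    return [
        (task["component"], hosts_by_load[i % len(hosts_by_load)])
        for i, task in enumerate(tasks)
    ]


if __name__ == "__main__":
    text = json.dumps({
        "hosts": ["h1", "h2"],
        "tasks": [
            {"component": "HDFS", "priority": 2},
            {"component": "ZOOKEEPER", "priority": 1},
        ],
    })
    loads = {"h1": {"load": 0.9}, "h2": {"load": 0.1}}
    template = receive_template(text)
    params = collect_parameters(template["hosts"], loads.__getitem__)
    print(deploy_tasks(template, params))
```

Passing the probe in as a callable keeps the sketch free of any real network access; an actual system would obtain the parameter information from the proxy servers.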

Optionally, in this embodiment, the storage medium may include, but is not limited to, a USB flash disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.

Optionally, in this embodiment, the processor, according to the program code stored in the storage medium, receives template information for deploying a Hadoop cluster, where the template information indicates the task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster is required to complete;

Optionally, in this embodiment, the processor, according to the program code stored in the storage medium, collects parameter information of one or more hosts of the Hadoop cluster according to the host information, where each host is used to deploy one or more components, and the components are deployed by a proxy server and are used to perform the corresponding tasks;

Optionally, in this embodiment, the processor, according to the program code stored in the storage medium, deploys tasks to the one or more components according to the task information and the parameter information.

Optionally, for specific examples in this embodiment, refer to the examples described in the above embodiments and optional implementations; they are not repeated here.

Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described can be performed in an order different from the one given here, or they can each be made into individual integrated circuit modules, or multiple modules or steps among them can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A method for distributed deployment of a Hadoop cluster, characterized by comprising:

receiving template information for deploying a Hadoop cluster, where the template information indicates the task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster is required to complete;

collecting, according to the host information, parameter information of one or more hosts of the Hadoop cluster, where each host is used to deploy one or more components, and the components are deployed by a proxy server and are used to perform the corresponding tasks;

deploying tasks to the one or more components according to the task information and the parameter information.
2. The method according to claim 1, characterized in that the parameter information includes at least one of the following: host operating system information, host network information, host CPU information, host memory information, host CPU utilization, host memory utilization, host disk I/O utilization, host network latency, host average I/O operation waiting time, host disk information, and process information of the components on the host.
3. The method according to claim 1, characterized in that deploying tasks to the one or more components in the Hadoop cluster according to the task information and the parameter information comprises:

generating a deployment task list according to the task information and the parameter information, where the deployment task list includes the task information, the parameter information required to perform each task, and the priority of each task;

selecting the task with the highest priority from the deployment task list and issuing it to the corresponding component.
4. The method according to claim 3, characterized in that the priority is related to the attributes of the task and/or the parameter information for performing the task.
5. The method according to claim 1, characterized in that, after tasks are deployed to the one or more components according to the template information and the parameter information, the method further comprises:

monitoring the task execution progress and/or log information of the one or more components.
6. The method according to claim 1, characterized in that the template information includes at least one of the following: the number of Hadoop cluster systems, information on the components of the Hadoop cluster to be deployed, the number of replicas of the Hadoop distributed file system HDFS, the client connection count and timeout of each Hadoop cluster component, OC NCV ambda, host user name and password, log storage disk information, data storage disk information, and metadata disk information.
7. The method according to claim 1, characterized in that, after the template information for deploying the Hadoop cluster is received, the method further comprises:

parsing the template information and verifying the validity of the template information.
8. A device for distributed deployment of a Hadoop cluster, characterized by comprising:

a receiving module, configured to receive template information for deploying a Hadoop cluster, where the template information indicates the task information and host information of the Hadoop cluster, and the task information describes the tasks that the Hadoop cluster is required to complete;

an acquisition module, configured to collect, according to the host information, parameter information of one or more hosts of the Hadoop cluster, where each host includes one or more components, and the components are deployed by a proxy server and are used to perform the corresponding tasks;

a deployment module, configured to deploy tasks to the one or more components according to the task information and the parameter information.
9. The device according to claim 8, characterized in that the deployment module further comprises:

a generation unit, configured to generate a deployment task list according to the task information and the parameter information, where the deployment task list includes the task information, the parameter information required to perform each task, and the priority of each task;

a selecting unit, configured to select the task with the highest priority from the deployment task list and issue it to the corresponding component.
10. The device according to claim 9, characterized in that the device further comprises:

a monitoring module, configured to monitor the task execution progress and/or log information of the one or more components after the deployment module deploys tasks to the one or more components according to the template information and the parameter information.