CN106933622A - Model-driven Hadoop deployment method in a cloud environment - Google Patents
Model-driven Hadoop deployment method in a cloud environment
- Publication number
- CN106933622A (application CN201710094086.2A)
- Authority
- CN
- China
- Prior art keywords
- hadoop
- model
- service
- deployment
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0893—Assignment of logical groups to network elements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/22—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks comprising specially adapted graphical user interfaces [GUI]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/4557—Distribution of virtual machine instances; Migration and load balancing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45595—Network integration; Enabling network access in virtual machine instances
Abstract
The invention discloses a model-driven Hadoop deployment method in a cloud environment, comprising: providing a Hadoop demand model and a Hadoop deployment model; implementing model conversion between the Hadoop demand model and the Hadoop deployment model according to preset conversion rules; and using a synchronization engine to monitor information changes in the Hadoop demand model and the Hadoop deployment model, and synchronizing the information when information in the Hadoop demand model and/or the Hadoop deployment model changes. The invention has the following advantage: the diverse software and hardware resources in a cloud environment can be managed and deployed.
Description
Technical field
The present invention relates to the field of software engineering, and in particular to a model-driven Hadoop deployment method in a cloud environment.
Background
Today, large volumes of data are generated every day, and 90% of the world's data was produced in the past two years. Massive data processing technology is now widely applied in every field of social production, which means that the big data era has truly arrived.
Hadoop is an open-source software framework for the distributed processing of big data that can process massive data in a reliable, efficient, and scalable way. With the rapid development of the Hadoop ecosystem and the successive completion of many of its sub-projects, it is widely used to process and store big data in a variety of scenarios, and it has become one of the most important software tools for big data processing. As Hadoop is deployed in the cloud more and more widely, administrators need to deploy and configure it in different ways according to specific requirements, which poses two challenges for Hadoop deployment:
(1) Diversity of hardware resources: a Hadoop cluster may be deployed on different types of infrastructure, including physical servers, virtual machines, and Docker containers. This heterogeneity makes the management of cluster nodes difficult and complex.
(2) Diversity of software resources: the Hadoop ecosystem includes many different types of computing and storage frameworks, for example HDFS, MapReduce, HBase, Yarn, and Spark. Each type of framework has its own deployment and configuration method, and dependency or constraint relationships also exist between some frameworks.
Some management tools already help users deploy Hadoop clusters, such as Cloudera Manager and Apache Ambari. In addition, the open-source container engine Docker manages the life cycle of application components (packaging, distribution, deployment, and running), achieves "package once, run anywhere" at the application component level, and reduces the difficulty of deploying and operating Hadoop. Although these tools and technologies provide solutions for deploying and managing Hadoop clusters, they focus mostly on environment configuration and parameter settings and usually offer one fixed deployment mode. They do not take into account the diverse infrastructure and scaling concerns of cloud platforms, and they cannot meet a user's specific Hadoop deployment requirements according to service type, node resources, and scenario characteristics.
Summary of the invention
The present invention aims to solve at least one of the above technical problems.
To this end, an object of the invention is to propose a model-driven Hadoop deployment method in a cloud environment that can manage and deploy the diverse software and hardware resources in a cloud environment.
To achieve the above object, an embodiment of the invention discloses a model-driven Hadoop deployment method in a cloud environment, comprising the following steps:
S1: providing a Hadoop demand model and a Hadoop deployment model, wherein the Hadoop demand model is used to generate a corresponding management view according to system requirements, and the Hadoop deployment model is used to describe the node configuration information, running status, and software deployment of the management view;
S2: implementing model conversion between the Hadoop demand model and the Hadoop deployment model according to preset conversion rules, wherein the preset conversion rules include a node conversion model and a cluster service conversion model, the node conversion model is used to implement model conversion between the nodes of the Hadoop demand model and the nodes of the Hadoop deployment model, and the cluster service conversion model is used to implement model conversion between the cluster services of the Hadoop demand model and the cluster services of the Hadoop deployment model;
S3: using a synchronization engine to monitor information changes in the Hadoop demand model and the Hadoop deployment model, and synchronizing the information when information in the Hadoop demand model and/or the Hadoop deployment model changes.
Further, the Hadoop demand model includes: a cluster node module provided with infrastructure resources, the infrastructure resources including a node configuration list, a node list, and a container image list together with their corresponding resources and attributes; and a cluster service module provided with a service list, the service list including multiple services and the attributes of each service.
Further, the Hadoop deployment model includes: a cluster node unit that stores a virtual machine configuration list, a virtual machine list, and a virtual machine image list; and a cluster service unit used to provide cluster services.
Further, the node conversion model implements model conversion through element mapping relations between the nodes of the Hadoop demand model and the nodes of the Hadoop deployment model. The element mapping relations include helper tags and mapper tags: a helper tag describes the mapping of elements between classes, and a mapper tag describes the mapping of attributes between classes.
Further, the cluster service conversion model converts cluster services automatically by means of a constraint model and a preset conversion algorithm. The constraint model defines the association relations between multiple model elements, and the preset conversion algorithm generates a service deployment scheme from the Hadoop demand model and the constraint model.
Further, the preset deployment algorithm comprises the following steps: obtaining the set of services that need to be deployed from the service units under the service list in the Hadoop demand model; supplementing and sorting the services in the set according to the dependencies between service units in the constraint model, to obtain the ordered set of services that actually need to be deployed; reading each service from the ordered set in turn, in reverse order, and calculating its deployment scheme; and deploying the services in turn according to the node set of each service deployment unit.
Further, the Hadoop deployment model is constructed using the SM@RT tool.
According to the model-driven Hadoop deployment method in a cloud environment of the embodiments of the invention, runtime architecture models are introduced into the Hadoop deployment process, and a user's specific Hadoop deployment requirements are met through three steps: providing the models, model conversion, and model synchronization.
Additional aspects and advantages of the invention will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and easy to understand from the following description of the embodiments taken together with the drawings, in which:
Fig. 1 is a flow chart of the model-driven Hadoop deployment method in a cloud environment of an embodiment of the invention;
Fig. 2 is a schematic diagram of the Hadoop demand meta-model of one embodiment of the invention;
Fig. 3 is a schematic diagram of the Hadoop deployment meta-model of one embodiment of the invention;
Fig. 4 is a schematic diagram of the mapping relations between model elements of one embodiment of the invention;
Fig. 5 is a schematic diagram of the meta-model of the constraint model of one embodiment of the invention;
Fig. 6 is a schematic diagram of the constraint model of one embodiment of the invention;
Fig. 7 is a diagram illustrating how parameters change during Hadoop cluster service deployment in one embodiment of the invention;
Fig. 8 is a schematic diagram of the bidirectional synchronization between the Hadoop deployment model and the runtime system of one embodiment of the invention;
Fig. 9 is a schematic diagram of the Hadoop demand model in a specific embodiment of the invention;
Fig. 10 is a schematic diagram of the Hadoop deployment model in a specific embodiment of the invention.
Detailed description
Embodiments of the invention are described in detail below. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the invention, and are not to be construed as limiting it.
These and other aspects of the embodiments of the invention will become clear with reference to the following description and drawings. In the description and drawings, some particular implementations of the embodiments are specifically disclosed to indicate some of the ways in which the principles of the embodiments may be implemented, but it should be understood that the scope of the embodiments is not limited thereby. On the contrary, the embodiments of the invention include all changes, modifications, and equivalents falling within the spirit and scope of the appended claims.
The invention is described below with reference to the drawings.
Fig. 1 is a flow chart of the model-driven Hadoop deployment method in a cloud environment of an embodiment of the invention. As shown in Fig. 1, the method comprises the following steps:
S1: Provide a Hadoop demand model and a Hadoop deployment model. The Hadoop demand model is used to generate a corresponding management view according to system requirements, and the Hadoop deployment model is used to describe the node configuration information, running status, and software deployment of the management view.
In an embodiment of the invention, the Hadoop demand model includes:
A cluster node module provided with infrastructure resources, which include a node configuration list, a node list, and a container image list together with their corresponding resources and attributes; and a cluster service module provided with a service list, which includes multiple services and the attributes of each service.
Specifically, during the deployment of a Hadoop cluster, the Hadoop demand model gives the administrator a unified management view of node resources and cluster services. As shown in Fig. 2, the demand meta-model consists of two parts: cluster nodes and cluster services.
In the cluster node part, Infrastructure represents the infrastructure resources and contains one NodeTypes element, one Nodes element, and one Images element. The NodeTypes element is the node configuration list and represents the set of available configuration files; each NodeType element represents one node configuration and has attributes such as id, name, ram, disk, and cpus, which represent the identifier, name, memory, disk, and number of CPUs respectively. The Nodes element is the node list and represents the set of nodes; each Node element represents one node and has attributes such as id, name, nodeType, imgeId, and Status, which represent the node's identifier, name, node type, container image, and state respectively. The Images element is the container image list and represents the set of available container image files; each Image element represents one container image and has attributes such as id, name, size, status, miniDisk, and miniRam, which represent the image's identifier, name, size, and status together with its disk and memory information.
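The cluster node part of the demand meta-model can be pictured as plain data classes. The following Python sketch is illustrative only: the attribute names id, name, ram, disk, cpus, and nodeType come from the description, while the class layout, the types, the units, the helper method `type_of`, and the sample values are assumptions, and only a subset of the attributes is shown.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeType:
    """One node configuration (attribute names from the description; types assumed)."""
    id: str
    name: str
    ram: int    # memory; the unit (MB) is an assumption
    disk: int   # disk size; the unit (GB) is an assumption
    cpus: int   # number of CPUs


@dataclass
class Node:
    """One node; only a subset of the described attributes is modelled here."""
    id: str
    name: str
    nodeType: str   # id of the NodeType this node uses (reference style assumed)
    status: str


@dataclass
class Infrastructure:
    """Container for the NodeTypes and Nodes lists."""
    node_types: List[NodeType] = field(default_factory=list)
    nodes: List[Node] = field(default_factory=list)

    def type_of(self, node: Node) -> NodeType:
        # Hypothetical helper: resolve a node's configuration by id.
        return next(t for t in self.node_types if t.id == node.nodeType)


# Sample values are invented for illustration.
infra = Infrastructure(
    node_types=[NodeType(id="t1", name="small", ram=2048, disk=20, cpus=2)],
    nodes=[Node(id="n1", name="master", nodeType="t1", status="running")],
)
print(infra.type_of(infra.nodes[0]).name)
```

Keeping the node configuration separate from the node itself mirrors the split between the NodeTypes and Nodes lists, so one configuration can be shared by many nodes.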
In the cluster service part, the Services element represents the list of services that Hadoop provides and contains elements such as HDFSService, MapReduceService, HBaseService, YarnService, and SparkService, which represent the HDFS, MapReduce, HBase, Yarn, and Spark services respectively. Take HDFSService, MapReduceService, and HBaseService as examples. HDFSService represents the HDFS service of the cluster; the service is provided by multiple nodes in the cluster, and each of these nodes deploys the corresponding HDFS software module, HDFSAgent. HBaseService represents the HBase service of the cluster; the service is provided by multiple nodes in the cluster, and each of these nodes deploys the corresponding HBase software module, HBaseAgent. MapReduceService represents the MapReduce service of the cluster; the service is provided by multiple nodes in the cluster, and each of these nodes deploys the corresponding MapReduce software module, MapReduceAgent. All of these services have the attributes id, name, version, and status, which represent the identifier, name, version, and current running state respectively. Their software deployment modules (the Agents) have the attributes health and nodeId, which represent the health status and the node on which the software deployment module is located.
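As a rough illustration of the service part of the demand model, a service can be modelled with its agents attached. In this sketch the attribute names (id, name, version, status, health, nodeId) follow the description; the class names `Agent` and `Service`, the method `unhealthy_nodes`, the health value "healthy", and all sample values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    """Software deployment module of a service on one node."""
    health: str   # health status; the allowed values are assumed
    nodeId: str   # node on which the module is located


@dataclass
class Service:
    """A cluster service such as HDFS, provided by agents on several nodes."""
    id: str
    name: str
    version: str
    status: str
    agents: List[Agent] = field(default_factory=list)

    def unhealthy_nodes(self) -> List[str]:
        # Hypothetical helper: nodes whose agent does not report "healthy".
        return [a.nodeId for a in self.agents if a.health != "healthy"]


# Sample values are invented for illustration.
hdfs = Service(
    id="s1", name="HDFS", version="2.7.3", status="running",
    agents=[Agent(health="healthy", nodeId="n1"),
            Agent(health="unhealthy", nodeId="n2")],
)
print(hdfs.unhealthy_nodes())
```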
In an embodiment of the invention, the Hadoop deployment model includes: a cluster node unit that stores a virtual machine configuration list, a virtual machine list, and a virtual machine image list; and a cluster service unit used to provide cluster services.
Specifically, during Hadoop cluster deployment, the deployment model gives the administrator a management view of the system deployment. It describes the cluster node configuration, the running status, and the software deployment units on the nodes, and it is synchronized bidirectionally with the runtime system. As shown in Fig. 3, the deployment meta-model also includes two parts: cluster nodes and cluster services.
In the cluster node part, taking CloudStack as an example, Project represents a project and contains one VMTypes element, one VMs element, and one Images element. The VMTypes element is the virtual machine configuration list and represents the set of available configuration files; each VMType element represents one virtual machine configuration and has attributes such as id, name, description, guestCpus, memoryMb, and imageSpaceGb, which represent the identifier, name, virtual machine description, number of CPUs, memory, and image space size respectively. The VMs element is the virtual machine list and represents the set of virtual machines; each VM element represents one virtual machine and has attributes such as id, name, imageId, vmtypeId, cpuutiliz, and memoryutiliz, which represent the identifier, name, virtual machine image, virtual machine type, CPU usage, and memory usage respectively. The Images element is the virtual machine image list and represents the set of available virtual machine image files; each Image element represents one virtual machine image and has attributes such as id, name, vsize, and description, which represent the identifier, name, image size, and image description respectively.
In the cluster service part, the Services element represents the list of services that Hadoop provides and contains elements such as HDFSService, MapReduceService, HBaseService, YarnService, and SparkService, which represent the HDFS, MapReduce, HBase, Yarn, and Spark services respectively. Each service contains three main kinds of model element: service units, role units, and deployment units. A service unit represents a specific service that Hadoop provides; a role unit represents one of the different roles that a specific Hadoop service includes; a deployment unit represents a running software module that belongs to a role. Take HDFSService, MapReduceService, and HBaseService as examples.
The HDFS service unit contains three types of role unit: NameNode, SecondaryNameNode, and DataNode. NameNode represents the management center of HDFS, which manages the file system namespace, the cluster configuration information, and the replication of storage blocks; it has exactly one deployment unit, DU_NN. SecondaryNameNode represents the backup node of the NameNode, which backs up the metadata saved by the NameNode; it has exactly one deployment unit, DU_SNN. DataNode represents a working node of HDFS, which schedules the storage and retrieval of data; it can have multiple deployment units DU_DN.
The MapReduce service unit contains two types of role unit: JobTracker and TaskTracker. JobTracker represents the central service node of MapReduce, which schedules each subtask of a Job to run on a TaskTracker; it has exactly one deployment unit, DU_JT. TaskTracker represents a sub-service node, which executes the tasks assigned by the JobTracker; it can have multiple deployment units DU_TT.
The HBase service unit contains two types of role unit: HMaster and HRegionServer. HMaster represents the management and scheduling center of HBase, which allocates and manages HRegionServers; it has exactly one deployment unit, DU_HM. HRegionServer represents the HBase service running on each working node, which maintains the Regions allocated by HMaster and handles I/O requests; it can have multiple deployment units DU_RS.
The deployment units of all these roles have attributes such as vmId and health, which represent the virtual machine on which the software module is located and the running health status of the deployment unit.
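The cardinality rules for roles (exactly one deployment unit for NameNode, SecondaryNameNode, JobTracker, and HMaster; possibly several for DataNode, TaskTracker, and HRegionServer) lend themselves to a simple check. The sketch below is one possible encoding, not the patent's implementation: the table `ROLES`, the function `cardinality_errors`, the placement format, and the virtual machine ids are all hypothetical.

```python
# service -> role -> (deployment unit name, may have multiple deployment units)
ROLES = {
    "HDFSService": {
        "NameNode": ("DU_NN", False),
        "SecondaryNameNode": ("DU_SNN", False),
        "DataNode": ("DU_DN", True),
    },
    "MapReduceService": {
        "JobTracker": ("DU_JT", False),
        "TaskTracker": ("DU_TT", True),
    },
    "HBaseService": {
        "HMaster": ("DU_HM", False),
        "HRegionServer": ("DU_RS", True),
    },
}


def cardinality_errors(service, placement):
    """Check the 'exactly one deployment unit' rule for single-instance roles.

    placement maps a role name to the list of vmIds hosting its deployment units.
    Roles that may have multiple deployment units are not constrained here.
    """
    errors = []
    for role, (du, multiple) in ROLES[service].items():
        count = len(placement.get(role, []))
        if not multiple and count != 1:
            errors.append(f"{role}: expected exactly one {du}, found {count}")
    return errors


# A valid HDFS placement on hypothetical virtual machines vm1..vm3.
print(cardinality_errors("HDFSService", {
    "NameNode": ["vm1"],
    "SecondaryNameNode": ["vm2"],
    "DataNode": ["vm1", "vm3"],
}))
```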
S2: Implement model conversion between the Hadoop demand model and the Hadoop deployment model according to preset conversion rules. The preset conversion rules include a node conversion model and a cluster service conversion model. The node conversion model is used to implement model conversion between the nodes of the Hadoop demand model and the nodes of the Hadoop deployment model, and the cluster service conversion model is used to implement model conversion between the cluster services of the Hadoop demand model and the cluster services of the Hadoop deployment model.
In one embodiment of the invention, the node conversion model implements model conversion through element mapping relations between the nodes of the Hadoop demand model and the nodes of the Hadoop deployment model. The element mapping relations include helper tags and mapper tags: a helper tag describes the mapping of elements between classes, and a mapper tag describes the mapping of attributes between classes.
Specifically, the cluster node part of the deployment model differs considerably between application scenarios. For example, in CloudStack the main management units are VM, Flavor, and Image, which represent the virtual machine, the configuration file, and the image; in Docker the main management units are Container, Repository, and Image, which represent the container, the repository, and the image. It is therefore necessary to establish element mapping relations between the node parts of the demand model and the deployment model in order to implement model conversion.
Embodiments of the invention define a set of description rules for mapping relations and a conversion method for model operations, so that the demand model is converted into the deployment model automatically according to the given element mapping relations between the models. The element mapping relations between the models are described in an XML file, and the keywords of the description rules are defined as follows:
(1) helper describes the mapping of elements between classes. A helper tag has two attributes, value and key: value is the element in the demand model that is to be mapped, and key is the corresponding element in the deployment model.
(2) mapper describes the mapping of attributes between classes. A mapper tag also has the two attributes key and value: key is the name of the element attribute in the deployment model that is the target of the mapping, and value is the name of the corresponding element attribute in the demand model. The elements to which these attributes belong are defined by the enclosing helper tag.
Based on these keywords, the mapping relations between the elements of the demand model and the deployment model proposed by the invention can be described according to the mapping rules. As shown in Fig. 4, the NodeType element in the demand model is mapped to the VMType element in the deployment model. The mapping is described with one helper tag whose key and value attributes are the names of the corresponding elements in the deployment model and the demand model, that is, VMType and NodeType. Within it, the id of NodeType corresponds to the id of VMType, name corresponds to name, ram corresponds to memoryMb, disk corresponds to imageSpaceGb, and vcpus corresponds to guestCpus.
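The NodeType to VMType mapping described above could be written as an XML fragment and applied with a few lines of Python. This is a minimal sketch under stated assumptions: the tag names helper and mapper and their key and value attributes come from the description, but the root element `mappings`, the nesting of mapper inside helper, the functions `load_mappings` and `convert`, and the sample attribute values are hypothetical, and no unit conversion is attempted.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping file; only the helper/mapper tags and their
# key/value attributes are taken from the description.
MAPPING_XML = """
<mappings>
  <helper key="VMType" value="NodeType">
    <mapper key="id" value="id"/>
    <mapper key="name" value="name"/>
    <mapper key="memoryMb" value="ram"/>
    <mapper key="imageSpaceGb" value="disk"/>
    <mapper key="guestCpus" value="vcpus"/>
  </helper>
</mappings>
"""


def load_mappings(xml_text):
    """Return {demand element: (deployment element, {demand attr: deployment attr})}."""
    table = {}
    for helper in ET.fromstring(xml_text).findall("helper"):
        attrs = {m.get("value"): m.get("key") for m in helper.findall("mapper")}
        table[helper.get("value")] = (helper.get("key"), attrs)
    return table


def convert(element_name, attributes, table):
    """Rename a demand-model element and its mapped attributes; unmapped attributes are dropped."""
    target, attr_map = table[element_name]
    return target, {attr_map[k]: v for k, v in attributes.items() if k in attr_map}


table = load_mappings(MAPPING_XML)
print(convert("NodeType", {"id": "t1", "ram": 2048, "vcpus": 2}, table))
```

Because the mapping lives in data rather than code, supporting a different infrastructure (for example Docker instead of CloudStack) would only require a different mapping file under this scheme.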
The system can be managed through model operations, of which there are five basic types: Get, Set, List, Add, and Remove. Any model operation that acts on a demand model element is converted into a model operation that acts on a deployment model element. As shown in Table 1, the invention defines conversion rules for model operations, which make the conversion automatic. For example, if element A in the demand model is mapped to element B in the deployment model, then List, Add, and Remove operations on A are mapped to the same operations on the corresponding element B, and Get or Set operations on the attributes of A are mapped to the same operations on the corresponding attributes.
Table 1. Mapping rules for model operation conversion
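The rule that element-level operations (List, Add, Remove) carry over to the mapped element, while attribute-level operations (Get, Set) carry over to the mapped attribute, can be sketched as follows. The `Operation` type, the `MAPPING` table layout, and the function `convert_operation` are assumptions made for illustration; only the five operation names and the NodeType to VMType attribute pairs come from the description.

```python
from typing import NamedTuple, Optional


class Operation(NamedTuple):
    kind: str                        # Get, Set, List, Add, or Remove
    element: str                     # model element the operation acts on
    attribute: Optional[str] = None  # only used by Get and Set
    value: object = None             # payload for Set or Add


# demand element -> (deployment element, {demand attribute: deployment attribute})
MAPPING = {
    "NodeType": ("VMType", {
        "id": "id", "name": "name", "ram": "memoryMb",
        "disk": "imageSpaceGb", "vcpus": "guestCpus",
    }),
}


def convert_operation(op, mapping):
    """Translate an operation on a demand-model element into the deployment model."""
    target, attr_map = mapping[op.element]
    if op.kind in ("List", "Add", "Remove"):
        # Element-level operations become the same operation on the mapped element.
        return op._replace(element=target)
    if op.kind in ("Get", "Set"):
        # Attribute-level operations become the same operation on the mapped attribute.
        return op._replace(element=target, attribute=attr_map[op.attribute])
    raise ValueError(f"unknown operation type: {op.kind}")


print(convert_operation(Operation("Set", "NodeType", "ram", 4096), MAPPING))
```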
In one embodiment of the invention, the cluster service conversion model converts cluster services automatically by means of a constraint model and a preset conversion algorithm. The constraint model defines the association relations between multiple model elements, and the preset conversion algorithm generates a service deployment scheme from the Hadoop demand model and the constraint model.
Specifically, the Hadoop ecosystem contains many different types of computing and storage frameworks, such as HDFS, MapReduce, HBase, Yarn, and Spark. Each of these frameworks has its own deployment and configuration method, and dependency or constraint relationships may exist between different frameworks. For example, deploying the MapReduce service depends on the HDFS service. The model conversion method therefore differs considerably between Hadoop cluster services. The invention describes the deployment rules of cluster services with a constraint model and converts cluster services automatically with one general algorithm.
A constraint model describes the deployment rules of one kind of Hadoop service. As shown in Fig. 5, the meta-model of the constraint model contains several main model elements, such as service units, role units, and deployment units, and describes the association relations between them. Service represents a service unit and has the attributes name and minNodeNum, which represent the name and the minimum number of deployment nodes. Role represents a role unit and has attributes such as name, excluName, resPriority, and deOrder, which represent the name, the exclusive attribute, the resource priority, and the deployment order. DU represents a deployment unit and has attributes such as health, which represents the health status of the service. Dependency_S represents a dependency between service units and has a name attribute that gives the name of the service unit relied on. Similarly, Dependency_DU represents a dependency between deployment units, and its name attribute gives the name of the deployment unit relied on. When a dependency exists between service units, a dependency does not necessarily exist between their deployment units; but when a dependency exists between deployment units, a dependency must exist between their service units.
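The last rule (a dependency between deployment units implies a dependency between their service units) is easy to check mechanically. The sketch below is one possible encoding: the class `ServiceUnit`, its field layout, the function `missing_service_dependencies`, and the minNodeNum values are assumptions, and dependencies between deployment units of the same service are treated as always allowed. The sample dependencies are those the description gives for HDFS and MapReduce.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ServiceUnit:
    name: str
    minNodeNum: int
    # Dependency_S: names of the service units this service relies on.
    dependency_s: List[str] = field(default_factory=list)
    # Dependency_DU: deployment unit name -> names of deployment units it relies on.
    dependency_du: Dict[str, List[str]] = field(default_factory=dict)


def missing_service_dependencies(services: List[ServiceUnit]) -> List[Tuple[str, str]]:
    """Return (DU, relied-on DU) pairs whose owning services lack a Dependency_S."""
    owner = {du: s.name for s in services for du in s.dependency_du}
    problems = []
    for s in services:
        for du, relied_on in s.dependency_du.items():
            for target in relied_on:
                if owner[target] != s.name and owner[target] not in s.dependency_s:
                    problems.append((du, target))
    return problems


# minNodeNum values are invented for illustration.
hdfs = ServiceUnit("HDFS", 1, [], {"DU_NN": [], "DU_SNN": [], "DU_DN": []})
mapreduce = ServiceUnit("MapReduce", 1, ["HDFS"],
                        {"DU_JT": ["DU_NN"], "DU_TT": ["DU_DN"]})
print(missing_service_dependencies([hdfs, mapreduce]))
```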
As shown in Fig. 6, the HDFS, MapReduce, and HBase services are taken as examples. The HDFS service unit includes three kinds of role units: NameNode, SecondaryNameNode, and DataNode. The role unit NameNode has one and only one deployment unit, DU_NN; the role unit SecondaryNameNode has one and only one deployment unit, DU_SNN; and NameNode and SecondaryNameNode cannot be deployed on the same node. The role unit DataNode can have multiple deployment units DU_DN, and DataNode can be deployed on the same node as NameNode or SecondaryNameNode. In addition, the HDFS service unit does not depend on any other service unit, and the deployment units DU_NN, DU_SNN, and DU_DN do not depend on any other deployment unit. The MapReduce service unit includes two kinds of role units: JobTracker and TaskTracker. The role unit JobTracker has one and only one deployment unit, DU_JT; the role unit TaskTracker can have multiple deployment units DU_TT; and JobTracker and TaskTracker cannot be deployed on the same node. In addition, the MapReduce service unit depends on the HDFS service unit: deployment unit DU_JT depends on deployment unit DU_NN, and deployment unit DU_TT depends on deployment unit DU_DN. The HBase service unit includes two kinds of role units: HMaster and HRegionServer. The role unit HMaster has one and only one deployment unit, DU_HM; the role unit HRegionServer can have multiple deployment units DU_RS; and HMaster and HRegionServer cannot be deployed on the same node. In addition, the HBase service unit depends on the HDFS service unit: deployment unit DU_HM depends on deployment unit DU_NN, and deployment unit DU_RS depends on deployment unit DU_DN.
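The constraint model described above can be written down as plain data. The following is a minimal sketch, not the patent's own representation: the class names `Role` and `Service`, the field names, and the helper `dependency_violations` are illustrative assumptions. The helper checks the stated rule that a dependency between deployment units implies a dependency between the corresponding service units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str              # role unit, e.g. "NameNode"
    unit: str              # its deployment unit, e.g. "DU_NN"
    single: bool           # True: one and only one deployment unit
    unit_deps: tuple = ()  # deployment units relied upon (Dependency_DU)


@dataclass(frozen=True)
class Service:
    roles: tuple
    service_deps: tuple = ()  # service units relied upon (Dependency_S)
    exclusive: tuple = ()     # role pairs that cannot share a node


CONSTRAINTS = {
    "HDFSService": Service(
        roles=(Role("NameNode", "DU_NN", True),
               Role("SecondaryNameNode", "DU_SNN", True),
               Role("DataNode", "DU_DN", False)),
        exclusive=(("NameNode", "SecondaryNameNode"),),
    ),
    "MapReduceService": Service(
        roles=(Role("JobTracker", "DU_JT", True, ("DU_NN",)),
               Role("TaskTracker", "DU_TT", False, ("DU_DN",))),
        service_deps=("HDFSService",),
        exclusive=(("JobTracker", "TaskTracker"),),
    ),
    "HBaseService": Service(
        roles=(Role("HMaster", "DU_HM", True, ("DU_NN",)),
               Role("HRegionServer", "DU_RS", False, ("DU_DN",))),
        service_deps=("HDFSService",),
        exclusive=(("HMaster", "HRegionServer"),),
    ),
}


def dependency_violations(constraints):
    """List (unit, dependency) pairs whose services lack a matching Dependency_S."""
    owner = {role.unit: name
             for name, svc in constraints.items() for role in svc.roles}
    violations = []
    for name, svc in constraints.items():
        for role in svc.roles:
            for dep in role.unit_deps:
                if owner[dep] != name and owner[dep] not in svc.service_deps:
                    violations.append((role.unit, dep))
    return violations
```

For the three services above, `dependency_violations(CONSTRAINTS)` returns an empty list, since every deployment-unit dependency points into HDFS and both dependent services declare a dependency on the HDFS service unit.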
In one embodiment of the invention, the preset deployment algorithm comprises the following steps: obtaining the set of services to be deployed from the service units under the service list in the Hadoop demand model; supplementing and sorting the services in the service set according to the dependencies between service units in the constraint model, to obtain the ordered set of services that actually need to be deployed; reading each service from the ordered set in reverse order and calculating its deployment scheme; and deploying the services in turn according to the node sets of the service deployment units.
Specifically, embodiments of the invention propose a general algorithm that automatically generates a service deployment scheme from a given demand model and constraint model. The basic steps of the algorithm are as follows:
1. According to the service units under the service list services in the demand model, obtain the set of services to be deployed, services{S1,S2,S3,…,Si}.
2. According to the dependencies between service units in the constraint model, supplement and sort the services in the service set services, obtaining the ordered set of services that actually need to be deployed, servicesOrder{S1,S2,S3,…,Sj}. If a service in the set depends on another service that does not appear in the set, that service must be added. In the ordered set servicesOrder, service S1 depends on no other service, service S2 may depend on service S1, service S3 may depend on services S1 and S2, and so on.
3. According to the ordered set servicesOrder, read each service in reverse order and calculate its deployment scheme. First, obtain the deployment node list of the service from the information in its software deployment module (i.e. Agent) in the demand model. Second, obtain from the constraint model the role types the service contains and their deployment constraints. Then, calculate the nodes on which the deployment units of each role are deployed, according to the node information and the role deployment constraints. Finally, according to the dependencies of the role's deployment units in the constraint model, record the deployment node information of the deployment units they depend on. This yields the node set of the deployment units of each service in servicesOrder, i.e. servicesOrderDeploy{S1{DU1{},DU2{},...},S2{DU1{},DU2{},...},...,Sn{DU1{},DU2{},...}}. For example, the deployment-unit node set of a deployment scheme containing the HDFS and MapReduce services is servicesOrderDeploy{HDFSService{DU_NN{1},DU_SNN{2},DU_DN{1,2,3}},MapReduceService{DU_JT{1},DU_TT{2,3}}}.
4. According to the node set servicesOrderDeploy of the service deployment units, deploy the services in turn.
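Steps 1 and 2 amount to closing the requested set under dependencies and ordering it so that every service follows the services it depends on. The sketch below is one way to do this with a depth-first traversal; the function name `supplement_and_sort` and the `depends_on` mapping are illustrative assumptions, not names from the patent.

```python
def supplement_and_sort(requested, depends_on):
    """Return the requested services plus any missing dependencies,
    ordered so that each service comes after everything it depends on."""
    ordered = []       # plays the role of servicesOrder
    visiting = set()   # services on the current traversal path

    def visit(service):
        if service in ordered:
            return
        if service in visiting:
            raise ValueError(f"cyclic dependency involving {service}")
        visiting.add(service)
        for dependency in depends_on.get(service, ()):
            visit(dependency)  # supplements services that were not requested
        visiting.discard(service)
        ordered.append(service)

    for service in requested:
        visit(service)
    return ordered


# Service-level dependencies taken from the constraint model described earlier.
DEPENDS_ON = {
    "HDFSService": [],
    "MapReduceService": ["HDFSService"],
    "HBaseService": ["HDFSService"],
}

services_order = supplement_and_sort(
    ["MapReduceService", "HBaseService"], DEPENDS_ON)
print(services_order)
# ['HDFSService', 'MapReduceService', 'HBaseService']

# Step 3 then walks the ordered set in reverse.
for service in reversed(services_order):
    print("compute deployment scheme for", service)
```

Reading the list in reverse means a dependent service is planned before the service it relies on, so the nodes it requires of its dependencies are already recorded when the dependency itself is planned.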
S3: A synchronization engine monitors changes to the information in the Hadoop demand model and the Hadoop deployment model, and synchronizes the information when the information in the Hadoop demand model and/or the Hadoop deployment model changes.
In one embodiment of the invention, the Hadoop deployment model is constructed using the SM@RT tool.
Specifically, a runtime model represents the overall architecture of the system as a set of manageable units, and explicitly describes runtime information hidden inside the system, such as its structure, state, and configuration, as a standard, structured view oriented to the manager's perspective.
The invention constructs the Hadoop deployment model using the SM@RT tool. The inputs are the meta-model and the access model of the system, where the meta-model describes the information of the managed system and the access model describes how the management interfaces are invoked. Based on these inputs, the SM@RT tool can automatically generate the runtime software architecture of the target system, and bidirectional consistency between that architecture and the system is also guaranteed.
As shown in Fig. 7, the invention provides a set of operations for deploying Hadoop cluster services. It outlines the operations on the different types of model elements and classifies the model operations; for each operation, it specifies the operation name, the required parameters, and the change the operation produces.
SM@RT supports bidirectional synchronization between the Hadoop deployment model and the runtime system. As shown in Fig. 8, when the synchronization engine detects that a role unit of some service has gained a deployment unit in the Hadoop deployment model, a virtual machine is automatically added in the runtime system to deploy it; when an administrator deletes this virtual machine in the runtime system, the synchronization engine likewise detects the change automatically and calls the management interface to terminate the corresponding deployment unit.
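The two synchronization cases above can be illustrated by comparing each side against the last synchronized state. This is only a sketch of the idea and does not reflect SM@RT's actual interfaces: the function `plan_sync`, the action tuples, and the representation of models as sets of deployment-unit identifiers are all assumptions made for illustration.

```python
def plan_sync(last_synced, model_units, runtime_units):
    """Derive synchronization actions from the last synchronized state.

    Each argument is a set of deployment-unit identifiers:
    last_synced   - units present on both sides after the previous sync
    model_units   - units currently in the deployment model
    runtime_units - units currently backed by a virtual machine
    """
    actions = []
    # A unit added to the model since the last sync needs a virtual machine.
    for unit in sorted(model_units - last_synced - runtime_units):
        actions.append(("runtime", "create_vm", unit))
    # A virtual machine deleted in the runtime system since the last sync
    # means the corresponding deployment unit must be terminated.
    for unit in sorted((last_synced - runtime_units) & model_units):
        actions.append(("model", "remove_unit", unit))
    return actions


# A deployment unit DU_DN_3 is added to the model:
print(plan_sync({"DU_DN_1", "DU_DN_2"},
                {"DU_DN_1", "DU_DN_2", "DU_DN_3"},
                {"DU_DN_1", "DU_DN_2"}))
# [('runtime', 'create_vm', 'DU_DN_3')]

# An administrator deletes the virtual machine behind DU_DN_2:
print(plan_sync({"DU_DN_1", "DU_DN_2"},
                {"DU_DN_1", "DU_DN_2"},
                {"DU_DN_1"}))
# [('model', 'remove_unit', 'DU_DN_2')]
```

Keeping the last synchronized state is what lets the sketch tell the two directions apart: a plain difference between model and runtime cannot distinguish a unit newly added to the model from a virtual machine newly deleted at runtime.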
To further understand the invention, it is described in detail through the following example.
To verify the feasibility and validity of the proposed method, an example of automatically deploying and configuring Hadoop was implemented, providing a solution for user-customized Hadoop services that includes deploying the MapReduce service and the HBase service.
In the experiment, a Hadoop cluster is deployed on CloudStack, with the MapReduce and HBase services deployed on 5 virtual machines in CloudStack. The MapReduce service is deployed on the three virtual machines Host-1, Host-2, and Host-3; the HBase service is deployed on the three virtual machines Host-1, Host-4, and Host-5. In addition, the virtual machines are assigned different resources such as CPU, memory, and storage; the specific configuration and deployment details are shown in Table 2.
Table 2: Node deployment
The user only needs to define the demand model of the Hadoop cluster; the proposed method then automatically converts the demand model into the deployment model and finally completes the cluster deployment.
As shown in Fig. 9, the demand model includes a cluster node part and a cluster service part.
In the cluster node part, NodeTypes represents the list of node types and includes three node types: "CPU: 4, Memory: 8, Storage: 400", "CPU: 2, Memory: 8, Storage: 400", and "CPU: 2, Memory: 4, Storage: 800". Nodes represents the node list and includes 5 nodes: the node type of the Node with id 1 is the NodeType with id nt1, the node type of the Nodes with ids 2 and 3 is the NodeType with id nt2, and the node type of the Nodes with ids 4 and 5 is the NodeType with id nt3.
In the cluster service part, the MapReduce and HBase services are each deployed on their corresponding 3 nodes.
The conversion from the demand model to the deployment model is divided into two parts: the model conversion of the cluster nodes and the model conversion of the cluster services.
In the cluster node part, the NodeType elements and Node elements in the demand model are mapped to the VMType elements and VM elements in the deployment model, respectively. The deployment model therefore also contains three virtual machine types (i.e. VMType) and 5 virtual machines (i.e. VM), as shown in Fig. 10.
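The node conversion is a one-to-one element mapping, which the following sketch illustrates on the experiment's data. The dictionary layout, the field names `type` and `vmType`, and the function `convert_nodes` are assumptions for illustration; the pairing of nt1, nt2, and nt3 with the three resource configurations follows the order in which the text lists them, which is also an assumption.

```python
# Cluster node part of the demand model, as described in the text.
DEMAND_NODES = {
    "NodeTypes": [
        {"id": "nt1", "CPU": 4, "Memory": 8, "Storage": 400},
        {"id": "nt2", "CPU": 2, "Memory": 8, "Storage": 400},
        {"id": "nt3", "CPU": 2, "Memory": 4, "Storage": 800},
    ],
    "Nodes": [
        {"id": 1, "type": "nt1"},
        {"id": 2, "type": "nt2"},
        {"id": 3, "type": "nt2"},
        {"id": 4, "type": "nt3"},
        {"id": 5, "type": "nt3"},
    ],
}


def convert_nodes(demand):
    """Map NodeType -> VMType and Node -> VM, copying attributes unchanged."""
    vm_types = [dict(node_type) for node_type in demand["NodeTypes"]]
    vms = [{"id": node["id"], "vmType": node["type"]}
           for node in demand["Nodes"]]
    return {"VMTypes": vm_types, "VMs": vms}


deployment_nodes = convert_nodes(DEMAND_NODES)
print(len(deployment_nodes["VMTypes"]), len(deployment_nodes["VMs"]))
# 3 5
```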
In the cluster service part, according to the constraint model and general algorithm proposed by the invention, the model conversion process is as follows:
1. According to the service units under the service list services in the demand model, obtain the set of services to be deployed, services{MapReduceService,HBaseService}.
2. From the dependencies in the constraint model, MapReduceService and HBaseService do not depend on each other, but both depend on HDFSService. Since HDFSService does not appear in the service set, it must be added to the service set services and the set must be sorted for deployment, yielding the ordered set of services that actually need to be deployed, servicesOrder{HDFSService,MapReduceService,HBaseService}.
3. According to the ordered set servicesOrder, read HBaseService, MapReduceService, and HDFSService in reverse order and calculate the deployment scheme of each service. First, from the information in the software deployment module (i.e. Agent) of HBaseService in the demand model, obtain the deployment node list of the HBase service: nodes 1, 4, and 5. Second, the constraint model shows that HBaseService includes the two role units HMaster and HRegionServer; HMaster has one and only one deployment unit DU_HM, HRegionServer can have multiple deployment units DU_RS, and HMaster and HRegionServer cannot be deployed on the same node; in addition, deployment unit DU_HM depends on deployment unit DU_NN of the HDFS service, and deployment unit DU_RS depends on deployment unit DU_DN of the HDFS service. Then, calculation from the node information and the role deployment constraints shows that HMaster and NameNode both have the highest resource priority and deployment order, so these two role units are deployed on the resource-rich node 1, while nodes 4 and 5 host the role units HRegionServer and DataNode. Similarly, when MapReduceService is deployed, the demand model and constraint model yield that the role units JobTracker and NameNode are deployed on node 1, while nodes 2 and 3 host the role units TaskTracker and DataNode. When HDFSService is deployed, the constraint model shows that HDFSService does not depend on other services and that NameNode and SecondaryNameNode cannot be deployed on the same node; combined with the deployment node information of HBaseService and MapReduceService, NameNode is deployed on node 1, SecondaryNameNode on node 2, and DataNode on nodes 1 to 5. Finally, according to the dependencies of the deployment units in the constraint model, record the deployment node information of the deployment units they depend on, obtaining the node set of the deployment units of each service in servicesOrder, i.e. servicesOrderDeploy{HDFSService{DU_NN{1},DU_SNN{2},DU_DN{1,2,3,4,5}},MapReduceService{DU_JT{1},DU_TT{2,3}},HBaseService{DU_HM{1},DU_RS{4,5}}}.
4. According to the node set servicesOrderDeploy of the service deployment units, deploy the services in turn.
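The placement in step 3 can be reproduced with a small sketch. The text does not spell out the exact placement rule, so the rule used here is an assumption chosen to match the stated result: each single-instance role goes to the node with the most CPU and memory in the service's node list (ties broken by lowest node id), the multi-instance role takes the remaining nodes, SecondaryNameNode takes the richest node other than NameNode's, and DataNode is placed on every node used by a dependent service. The HDFS case is hard-coded for brevity, and all names in the code are illustrative.

```python
# (CPU, Memory) per node, from the demand model of the experiment.
RESOURCES = {1: (4, 8), 2: (2, 8), 3: (2, 8), 4: (2, 4), 5: (2, 4)}
# Deployment node lists from each service's Agent entry in the demand model.
AGENT_NODES = {"MapReduceService": [1, 2, 3], "HBaseService": [1, 4, 5]}
# (single-instance unit, multi-instance unit) per dependent service.
UNITS = {"MapReduceService": ("DU_JT", "DU_TT"),
         "HBaseService": ("DU_HM", "DU_RS")}
SERVICES_ORDER = ["HDFSService", "MapReduceService", "HBaseService"]


def richest(nodes):
    """Assumed rule: most CPU, then most memory, then lowest node id."""
    return min(nodes, key=lambda n: (-RESOURCES[n][0], -RESOURCES[n][1], n))


def plan(order):
    deploy = {}
    used = set()           # every node used by a dependent service
    name_node_at = set()   # nodes recorded for DU_NN via DU_JT / DU_HM
    for service in reversed(order):
        if service == "HDFSService":
            name_node = richest(name_node_at)
            secondary = richest(used - {name_node})  # must differ from NameNode
            deploy[service] = {"DU_NN": [name_node],
                               "DU_SNN": [secondary],
                               "DU_DN": sorted(used)}
            continue
        master_unit, worker_unit = UNITS[service]
        nodes = AGENT_NODES[service]
        master = richest(nodes)
        workers = sorted(n for n in nodes if n != master)
        used.update(nodes)
        name_node_at.add(master)  # master's unit depends on DU_NN
        deploy[service] = {master_unit: [master], worker_unit: workers}
    return {service: deploy[service] for service in order}


services_order_deploy = plan(SERVICES_ORDER)
for service, units in services_order_deploy.items():
    print(service, units)
```

Under these assumptions the sketch yields DU_NN on node 1, DU_SNN on node 2, DU_DN on nodes 1 to 5, DU_JT on node 1, DU_TT on nodes 2 and 3, DU_HM on node 1, and DU_RS on nodes 4 and 5, which matches the servicesOrderDeploy set given in the text.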
As shown in Fig. 10, the concrete Hadoop deployment model is obtained through the model conversion of the cluster nodes and the model conversion of the cluster services.
The model-driven Hadoop deployment method in a cloud environment of the invention addresses the difficulty that diversified hardware and software resources and the on-demand nature of services in a cloud environment bring to deploying Hadoop clusters. It proposes a model-driven method for deploying Hadoop cluster services: by providing a conversion method from the Hadoop demand model to the deployment model and realizing bidirectional synchronization between the deployment model and the runtime system, it offers a fast, scalable way to deploy Hadoop cluster services, and the feasibility and validity of the invention have been demonstrated in a practical scenario.
In addition, the other components and functions of the model-driven Hadoop deployment method in a cloud environment according to embodiments of the invention are known to those skilled in the art and, to reduce redundancy, are not described again.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the invention have been shown and described, those skilled in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the invention; the scope of the invention is defined by the claims and their equivalents.
Claims (7)
1. A model-driven Hadoop deployment method in a cloud environment, characterized by comprising the following steps:
S1: providing a Hadoop demand model and a Hadoop deployment model, wherein the Hadoop demand model is used to generate a corresponding management view according to system requirements, and the Hadoop deployment model is used to describe the node information, running status, and software deployment of the management view;
S2: performing model conversion between the Hadoop demand model and the Hadoop deployment model according to a preset conversion rule, wherein the preset conversion rule includes a node conversion model and a cluster service conversion model, the node conversion model is used to perform model conversion between the nodes of the Hadoop demand model and the nodes of the Hadoop deployment model, and the cluster service conversion model is used to perform model conversion between the cluster services of the Hadoop demand model and the cluster services of the Hadoop deployment model;
S3: monitoring, with a synchronization engine, changes to the information in the Hadoop demand model and the Hadoop deployment model, and synchronizing the information when the information in the Hadoop demand model and/or the Hadoop deployment model changes.
2. The model-driven Hadoop deployment method in a cloud environment according to claim 1, characterized in that the Hadoop demand model includes:
a cluster node module provided with infrastructure resources, the infrastructure resources including the corresponding resources and attributes in a node configuration list, a node list, and an image list;
a cluster service module provided with a service list, the service list including multiple services and the attributes of each service.
3. The model-driven Hadoop deployment method in a cloud environment according to claim 2, characterized in that the Hadoop deployment model includes:
a cluster node unit storing a virtual machine configuration list, a virtual machine list, and a virtual machine image list;
a cluster service unit used to provide cluster services.
4. The model-driven Hadoop deployment method in a cloud environment according to claim 3, characterized in that the node conversion model performs model conversion through element mapping relations between the nodes of the Hadoop demand model and the nodes of the Hadoop deployment model, the element mapping relations including a helper tag and a mapper tag, wherein the helper tag is used to describe the mapping relation of elements between classes, and the mapper tag is used to describe the mapping relation of attributes between classes.
5. The model-driven Hadoop deployment method in a cloud environment according to claim 3, characterized in that the cluster service conversion model performs automatic conversion of cluster services through a constraint model and a preset conversion algorithm, the constraint model is used to define the association relations between multiple model elements, and the preset conversion algorithm generates a service deployment scheme according to the Hadoop demand model and the constraint model.
6. The model-driven Hadoop deployment method in a cloud environment according to claim 5, characterized in that the preset deployment algorithm comprises the following steps:
obtaining the set of services to be deployed according to the service units under the service list in the Hadoop demand model;
supplementing and sorting the services in the service set according to the dependencies between service units in the constraint model, to obtain the ordered set of services that actually need to be deployed;
reading each service from the ordered set of services in reverse order and calculating the deployment scheme of the service;
deploying the services in turn according to the node sets of the service deployment units.
7. The model-driven Hadoop deployment method in a cloud environment according to claim 1, characterized in that the Hadoop deployment model is constructed using the SM@RT tool.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710094086.2A CN106933622A (en) | 2017-02-21 | 2017-02-21 | The Hadoop dispositions methods of model-driven in cloud environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106933622A true CN106933622A (en) | 2017-07-07 |
Family
ID=59423391
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710094086.2A Pending CN106933622A (en) | 2017-02-21 | 2017-02-21 | The Hadoop dispositions methods of model-driven in cloud environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106933622A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170707 |