CN110647379A - Method for automatic scaling deployment of Hadoop clusters and Plugin deployment based on OpenStack cloud - Google Patents


Info

Publication number
CN110647379A
CN110647379A (application CN201810682329.9A)
Authority
CN
China
Prior art keywords
cluster
deployment
node
hadoop
automatic scaling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810682329.9A
Other languages
Chinese (zh)
Other versions
CN110647379B (en)
Inventor
吕智慧
吴杰
强浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University filed Critical Fudan University
Priority to CN201810682329.9A priority Critical patent/CN110647379B/en
Publication of CN110647379A publication Critical patent/CN110647379A/en
Application granted granted Critical
Publication of CN110647379B publication Critical patent/CN110647379B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45562 Creating, deleting, cloning virtual machine instances
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G06F2009/45575 Starting, stopping, suspending or resuming virtual machine instances
    • G06F2009/45583 Memory management, e.g. access or allocation
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of cloud computing, and specifically relates to a method for automatic scaling deployment of Hadoop clusters and Plugin deployment based on OpenStack cloud. The method comprises, in the Hadoop cluster deployment stage, an automatic cluster scaling strategy based on resource utilization and a node replacement mechanism based on task success rate. The invention enables OpenStack to provide better support for the cluster: the cluster scale can be adjusted according to the service load in different time periods, and the service processing speed is guaranteed.

Description

Method for automatic scaling deployment of Hadoop clusters and Plugin deployment based on OpenStack cloud
Technical Field
The invention belongs to the technical field of cloud computing, and relates to a method for automatic scaling deployment of Hadoop clusters and Plugin deployment based on OpenStack cloud.
Background
The prior art discloses that Sahara can be integrated with third-party management tools (such as Apache Ambari and the Cloudera management console) via a plugin mechanism. The core part of Sahara is responsible for interaction with users and provisions OpenStack resources (such as virtual machines, servers and security groups) through the Heat component; the Plugin is responsible for installing and configuring Hadoop clusters in pre-allocated virtual machines, and can also serve as a tool for cluster deployment management and monitoring. Sahara provides a unified mechanism for a Plugin to work in pre-allocated virtual machines: on the one hand, a Plugin has to inherit the sahara.plugins.provisioning.ProvisioningPluginBase class and implement all necessary methods/interfaces; on the other hand, the virtual machine object provided by Sahara has a remote property, which can be used to interact with the virtual machine, and the virtual machine is operated by remotely calling commands on the instance (the available commands can be found in sahara.clients.remote.InstanceInteropHelper).
Based on this state of the prior art, the inventors propose a method for automatic scaling deployment of Hadoop clusters and Plugin deployment based on OpenStack cloud, which supplements and optimizes the automatic scaling mechanism for Hadoop cluster deployment; the automatic scaling mechanism adjusts the cluster scale to delete redundant nodes, replace problem nodes and deploy new nodes.
Disclosure of Invention
The invention aims to provide a method for automatic scaling deployment of Hadoop clusters and Plugin deployment based on OpenStack cloud which, building on the current state of the prior art, supplements and optimizes the automatic scaling mechanism for Hadoop cluster deployment; the automatic scaling mechanism adjusts the cluster scale to delete redundant nodes, replace problem nodes and deploy new nodes.
The purpose of the invention is realized by the following technical scheme:
the invention provides an OpenStack-cloud-based method for automatic scaling deployment of Hadoop clusters: based on the Sahara module in the OpenStack cloud, the Sahara module is integrated with third-party management tools through the Plugin mechanism; with reference to the requirements and in combination with the automatic scaling deployment method, an appropriate number of virtual machines is allocated for the required Hadoop cluster, and the Hadoop cluster is installed and configured in the pre-allocated virtual machines.
Specifically, the OpenStack-cloud-based method for automatic scaling deployment of Hadoop clusters completes the automatic scaling deployment of a Hadoop cluster in the cloud environment according to predictions and real-time conditions; it specifically comprises the following:
(1) utilization-based automatic scaling strategy
The invention introduces E_C, E_R and E_D to denote the user's expected values for the utilization of the cluster CPU, RAM and hard disk respectively, and u_C, u_R and u_D to denote the actual utilization of the three resources. Since users attach different degrees of importance to different resources, λ_C, λ_R and λ_D are introduced into the system as the weights of the three terms, so that the following is obtained:

η_i = |u_i − E_i|, i ∈ {C, R, D} (Definition 1)

φ = λ_C·η_C + λ_R·η_R + λ_D·η_D (Definition 2)

Wherein Definition 1 represents the difference between the actual utilization and the expected utilization under each index and reflects the gap on each index specifically; Definition 2 represents the overall difference of the three items from expectation. Both values lie within [0, 1): the closer to 0, the closer to the expected value and the more the utilization of the cluster matches the expectation;
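As a minimal sketch, the two definitions can be written out in Python (the function and variable names are my own, and Definition 1 is taken as the absolute difference between actual and expected utilization, consistent with the description above):

```python
# Hedged sketch of Definitions 1 and 2: per-resource utilization gap (eta)
# and its weighted aggregate (phi). Names are illustrative, not from the patent.

def eta(actual, expected):
    """Definition 1: gap between actual and expected utilization, in [0, 1)."""
    return abs(actual - expected)

def phi(actual, expected, weights):
    """Definition 2: weighted sum of the per-resource gaps (CPU, RAM, disk)."""
    return sum(w * eta(a, e)
               for a, e, w in zip(actual, expected, weights))

# Example: CPU/RAM/disk actual vs. expected utilization, CPU weighted highest.
deviation = phi(actual=(0.70, 0.50, 0.30),
                expected=(0.60, 0.50, 0.40),
                weights=(0.5, 0.3, 0.2))
# The closer the result is to 0, the closer the cluster is to the expectation.
```
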
based on the above, the invention provides an automatic scaling strategy based on utilization rate;
(2) Rapid auto-scaling deployment strategy based on task success rate
The invention introduces a variable θ in this strategy. θ is a percentage in [0, 1] and represents the proportion of tasks that can run successfully on a single node. In the ideal case every node has θ = 1: all tasks execute successfully and the results are output smoothly. Since task-execution failures on a node are unavoidable, a node cannot be replaced immediately whenever it makes a mistake; that would be unreasonable and would also increase the system overhead. Therefore θ is compared with a pre-estimated threshold θ_0: if θ approaches 0, the task success rate of the node is too low; in this case the continued use of the node reduces the operating efficiency of the cluster, so a node replacement strategy is started;
based on the above, the invention provides an automatic telescopic rapid deployment strategy based on the task success rate.
In the invention, the automatic Hadoop cluster telescopic deployment based on the OpenStack cloud comprises the following two processes:
1. automatic scaling deployment mechanism
(1) Utilization-based automatic scaling strategy
The final goal of the automatic scaling strategy algorithm based on cloud-platform resource utilization is: by combining the cloud-platform resources with the requirements of the current application scenario, to use the computing resources of the cloud platform more reasonably and to optimize the automatic scaling mechanism for Hadoop cluster deployment, so that the cluster runs more efficiently;
The usage of the three indexes CPU, memory and hard disk generally plays an important role in implementing the utilization-based automatic scaling function, and the invention mainly measures utilization by these three indexes;
introduction of the invention
Figure BDA0001710865990000031
Respectively representing the expected values of the utilization rates of the cluster CPU, the RAM and the hard Disk by the user, and the utilization rates of the three are respectively the expected values in practical situation
Figure BDA0001710865990000034
Since users have different degrees of importance for different resources, lambda is introduced into the systemC、λR、λDThe three terms are respectively used as the weights of the three terms, so that the following data are obtained,
Figure BDA0001710865990000032
φ=λC·ηCR·ηRD·ηD(definition 2)
Defining 1 to represent a difference value between an actual utilization rate and an expected utilization rate under each index, wherein the index specifically reflects the difference between each index, and the platform adjusts the cluster resource configuration condition according to the difference value; in the actual situation,
Figure BDA0001710865990000033
is a possible interval range, so the calculation result of η is a value interval, and since the degree of coincidence with the user's expectation needs to be calculated, a region is the minimum value included in the sub-interval range of the [0,1) interval; definition 2 represents the difference between the three items and the expectation, the range of the value is also within [0,1), the closer to 0, the closer to the expectation value, the more consistent the utilization rate of the cluster is;
In the algorithm, the user's expected values for the three utilizations of the cluster CPU, memory (RAM) and hard disk are obtained first and compared with the actual utilization on the platform. If an actual utilization is smaller than the minimum of the corresponding expected value, the DataNode and NameNode services are stopped and the virtual machine is shut down; if the actual utilization is larger than the minimum and smaller than the maximum of the corresponding expected value, a virtual machine is started, the Hadoop cluster is deployed on it, and the Hadoop cluster is started;
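This decision rule can be sketched as follows (an illustrative sketch only: the action names are my own, and a real implementation would call OpenStack/Sahara APIs to stop or start virtual machines and Hadoop services rather than return strings):

```python
# Hedged sketch of the utilization-based scaling decision described above.
def scaling_action(actual, expected_range):
    """Return the action for one resource, given its actual utilization
    and the user's expected (min, max) utilization range."""
    lo, hi = expected_range
    if actual < lo:
        # below the expected minimum: stop DataNode/NameNode, shut the VM down
        return "scale-in"
    if actual < hi:
        # between the expected minimum and maximum: start a VM, deploy and
        # start Hadoop on it
        return "scale-out"
    # at or above the expected maximum: the text prescribes no action here
    return "no-op"
```

Note that the rule follows the text literally: utilization below the expected minimum triggers scale-in, while utilization inside the expected range triggers deployment of additional nodes.
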
(2) Rapid auto-scaling deployment strategy based on task success rate
In the design of Hadoop it is assumed that any node may encounter various problems that cause distributed tasks to fail; replacing nodes with a high failure rate avoids failures caused by physical factors of the node itself;
Meanwhile, before a node that replaces another to provide computing service formally serves the cluster, point-to-point data-block copying is needed to ensure that the data is stored correctly. This strategy adjusts the state of the whole cluster and avoids the situation where the failure of a single node increases the load on the other nodes or even produces a cluster-wide effect;
To implement this strategy, the invention first introduces a variable θ. θ is a percentage in [0, 1] and represents the proportion of tasks that can run successfully on a single node. In the ideal case every node has θ = 1: all tasks execute successfully and the results are output smoothly. Since task-execution failures on a node are unavoidable, a node cannot be replaced immediately whenever it makes a mistake; that would be unreasonable and would also increase the system overhead. Therefore θ is compared with a pre-estimated threshold θ_0: if θ approaches 0, the task success rate of the node is too low; continuing to use the node would reduce the operating efficiency of the cluster, so a node replacement strategy is started;
the process steps of the algorithm of the present invention are described as follows:
step1 selecting an appropriate one
Figure BDA0001710865990000046
Value as the minimum standard for measuring task success rate
step2 calculating the success rate of node task in a certain time periodValue of
step3 mixingValue and
Figure BDA0001710865990000049
comparing the values, if so, continuing to start step2 by the next node; if the ratio is less than the preset value, the next step is carried out
step4 applying for a new node
step5, deploying Hadoop application on the newly applied node and copying the data on the original node
step6 starting service on new node and suspending service of original node
step7 step2 starting service on new node, terminate original node, and enter next node
Through the replacement of nodes, an optimized task-execution effect for the whole cluster is finally achieved; in the replacement process, failures caused by the physical state of the virtual machine are avoided to the greatest extent.
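The replacement procedure can be sketched as a single loop (an illustrative sketch only: apply_for_node, deploy_and_copy and the other operations are hypothetical stand-ins for the OpenStack/Sahara calls the text describes):

```python
def replace_low_success_nodes(nodes, success_rate, theta_0, cluster_ops):
    """Hedged sketch of the task-success-rate replacement strategy.

    nodes:        iterable of node identifiers.
    success_rate: callable node -> theta, the measured success rate in [0, 1].
    theta_0:      pre-estimated minimum acceptable success rate.
    cluster_ops:  object with the hypothetical operations apply_for_node(),
                  deploy_and_copy(new, old), start_service(node),
                  suspend_service(node), terminate(node).
    Returns the list of (old_node, new_node) replacements performed.
    """
    replaced = []
    for node in nodes:                               # loop over nodes
        theta = success_rate(node)                   # measure over a period
        if theta >= theta_0:                         # healthy: next node
            continue
        new_node = cluster_ops.apply_for_node()      # apply for a new node
        cluster_ops.deploy_and_copy(new_node, node)  # deploy Hadoop, copy data
        cluster_ops.start_service(new_node)          # bring new node online
        cluster_ops.suspend_service(node)            # suspend old node service
        cluster_ops.terminate(node)                  # terminate old node
        replaced.append((node, new_node))
    return replaced
```
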
2. Cluster automatic deployment Plugin implementation
(1) Cluster Plugin realization interface
The cluster exists as an independent plugin, in the form of an independent directory under the sahara/sahara/plugins directory; its main directory structure is shown in FIG. 1;
wherein:
1. As shown in fig. 2, the v2_7_1 directory holds the content specific to version 2.7.1 of the sandbox plugin, and hadoop2 holds the general content.
2. The outermost versionfactory.py is the core responsible for implementing all necessary interfaces; the interfaces that specifically need to be implemented are shown in table 1:
3. As shown in fig. 3, the functional implementation is mainly divided into two parts, namely configuration and startup of the cluster. config_helper.py is the core module for configuration; in it, the paths of the relevant configuration files are set and the environment variables are configured;
4. As shown in fig. 5, versionhandler.py specifically completes the configuration and startup of the sandbox according to the current plugin version;
5. run_script.py/starting_script.py specifically implement the startup of the cluster; the remote() method is used to connect via ssh to the started virtual machine and execute the corresponding Linux commands, thereby specifically controlling the startup of processes, nodes, the cluster, and so on.
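A bare-bones illustration of the plugin shape described above (the stub base class stands in for the real sahara.plugins.provisioning.ProvisioningPluginBase so the sketch is self-contained; the method names and commands are plausible examples, not the full interface of Table 1):

```python
# Hedged sketch of a Sahara plugin skeleton. In a real deployment the base
# class would be imported from Sahara; a stub is defined here so the sketch
# stands alone.
class ProvisioningPluginBase:            # stand-in for the Sahara base class
    pass

class SandboxPlugin(ProvisioningPluginBase):
    def get_versions(self):
        return ["2.7.1"]                 # versions this plugin can deploy

    def configure_cluster(self, cluster):
        # write config files and environment variables on each instance
        for instance in cluster.instances:
            with instance.remote() as r:  # 'remote' property from the text
                r.execute_command("echo configure")  # placeholder command

    def start_cluster(self, cluster):
        # ssh into the started VMs and run the start scripts remotely
        for instance in cluster.instances:
            with instance.remote() as r:
                r.execute_command("bash start_script.sh")  # illustrative
```
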
(2) Cluster image packaging and build-tool implementation
1. OpenStack virtual machine image
In the embodiment of the invention, the CentOS operating system is taken as an example to briefly introduce the process and principle of building an OpenStack virtual machine image;
2. Cluster image
In the invention, diskimage-builder is used to build the cluster image.
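The image-building step might look roughly like the following shell sketch (dry-run form: the commands are only composed and printed, and the qemu-img/disk-image-create invocations, element names and image size are illustrative assumptions, not taken from the patent):

```shell
# Hedged sketch: composing the commands to create a 10G qcow2 base image and
# to build a cluster image with diskimage-builder. Dry-run only (echoed).
IMAGE="centos-hadoop.qcow2"
CMD_CREATE="qemu-img create -f qcow2 $IMAGE 10G"
CMD_BUILD="disk-image-create centos7 hadoop -o ${IMAGE%.qcow2}"
echo "$CMD_CREATE"
echo "$CMD_BUILD"
```
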
The invention carried out automatic Hadoop cluster scaling-deployment experiments.
One group of deployments is selected as a representative for the experiments and tested several times; on this basis, index data before and after optimization are compared to analyze the deployment service. Table 2 shows the cluster configurations.
TABLE 2 Cluster configurations

Cluster    vCPU     RAM     Disk    Nodes
Cluster 1  4 cores  10GB    5GB     8
Cluster 2  2 cores  100GB   100GB   16
Cluster 3  1 core   5GB     80GB    8
Cluster 4  1 core   5GB     80GB    16
Cluster 5  1 core   5GB     80GB    24
Cluster 6  1 core   5GB     80GB    48
Test results show that deployment after optimization is considerably faster than before. When the number of cluster nodes to be deployed is small, the optimization effect is not obvious; as the number of nodes grows, the deployment time before optimization increases markedly, while the optimized deployment time grows only moderately with cluster scale. The six cluster deployment times are relatively close and stable within the range of 10 to 20 minutes, showing that the optimized cluster deployment time is clearly improved and more stable. Compared with the deployment service before optimization, the optimized service shows its effect even for small-scale cluster deployment, and the success rate is significantly improved, indicating that the deployment-service optimization is successful. The automatic Hadoop cluster scaling-deployment strategy provided by the invention optimizes automatic Hadoop cluster deployment, making the deployment service more stable and efficient.
Drawings
FIG. 1 shows a directory structure diagram of a cluster as a stand-alone plugin, existing in the form of a stand-alone directory under the sahara/sahara/plugins directory.
Fig. 2 shows that the v2_7_1 directory holds the content specific to version 2.7.1 of the sandbox plugin, hadoop2 holds the general content, and versionfactory.py sits at the outermost level.
Fig. 3 shows that the functional implementation is mainly divided into two parts, configuration and startup; config_helper.py is the core configuration module, in which the paths of the relevant configuration files are set and the environment variables are configured, and it is also responsible for generating, from the user configuration, the corresponding configuration files for the sandbox to be started.
Fig. 4 is a partial view of the work done by config_helper.py when configuring the spark environment variables, for a small portion of the system shown in fig. 3.
Fig. 5 shows versionhandler.py, in which the sandbox is configured and started according to the current plugin version; the start_cluster method describes the whole flow of sandbox startup.
Fig. 6 shows the creation of a qcow2-format virtual machine image of size 10G.
Fig. 7 is a schematic diagram of a cluster.
FIG. 8 shows the web page of the cluster-creation service, on which the requirements for cluster creation are submitted, including selection of the Hadoop version, node configuration as listed in Table 1, and selection of the image; after the specification is completed, the cluster enters the deployment phase.
Fig. 9 shows that the speed of deployment after optimization is considerably improved compared with that before optimization, the deployment time of cluster deployment after optimization is obviously improved, and the required time is more stable.
Fig. 10 shows that, compared to the deployment service before optimization, even in the process of cluster deployment with a smaller scale, the optimized deployment service still exhibits its optimization effect, and in terms of success rate, the optimized effect is significantly improved, which represents the success of the deployment service optimization of the present invention.
Detailed Description
The technical solution of the present invention is specifically described below with reference to the accompanying drawings and examples.
The invention aims to provide an OpenStack-cloud-based method for automatic scaling deployment of Hadoop clusters. As shown in fig. 1, the invention is based on the Sahara module in the OpenStack cloud, integrates with third-party management tools through the Plugin mechanism, allocates an appropriate number of virtual machines for the required Hadoop cluster with reference to the requirements and in combination with the automatic scaling deployment method, and installs and configures the Hadoop cluster in the pre-allocated virtual machines.
In the invention, automatic Hadoop cluster scaling deployment is carried out based on the OpenStack cloud, specifically comprising the following two processes:
1. automatic scaling deployment mechanism
(1) Utilization-based automatic scaling strategy
The final realization target of the automatic scaling strategy algorithm based on the cloud platform resource utilization rate is as follows: by combining the cloud platform resources and the current application scene requirements, the computing resources of the cloud platform are more reasonably utilized, and the cluster deployment automatic stretching mechanism related to the Hadoop cluster is optimized, so that the operation efficiency of the cluster reaches a better result.
When allocating the resources occupied by a virtual machine, three aspects are generally of concern: CPU, memory and hard disk; the usage of these three indexes is therefore inevitably an important aspect in implementing the utilization-based automatic scaling function. In this embodiment, utilization is mainly measured by these three indexes.
The invention introduces E_C, E_R and E_D to denote the user's expected values for the utilization of the cluster CPU, RAM and hard disk respectively, and u_C, u_R and u_D to denote the actual utilization of the three resources. Since users attach different degrees of importance to different resources, λ_C, λ_R and λ_D are introduced into the system as the weights of the three terms. The following can thus be obtained:

η_i = |u_i − E_i|, i ∈ {C, R, D} (Definition 1)

φ = λ_C·η_C + λ_R·η_R + λ_D·η_D (Definition 2)

Definition 1 represents the difference between the actual utilization and the expected utilization under each index; the platform adjusts the cluster resource configuration according to this difference. In practice the expected utilization E_i is generally a range of possible values rather than a single number, so the calculation of η_i also yields a value interval; since the degree of agreement with the user's expectation must be computed, η_i is taken as the minimum value of that subinterval of the [0, 1) interval. Definition 2 represents the overall difference of the three items from expectation; its value also lies within [0, 1), and the closer it is to 0, the closer the cluster's utilization is to the expected value;
the algorithm firstly obtains expected values of three utilization rates of a user for a cluster CPU, an internal memory RAM and a hard Disk, compares the actual utilization rates of the three utilization rates in a platform, and closes a Datanone and a Namenode service and a virtual machine if the actual utilization rate is smaller than the minimum value of the corresponding expected values; if the actual utilization value is greater than the minimum value of the corresponding expected value and less than the maximum value of the corresponding expected value, starting the virtual machine, deploying the Hadoop cluster, starting,
(2) Rapid auto-scaling deployment strategy based on task success rate
In the design of Hadoop it is assumed that any node may encounter various problems that cause distributed tasks to fail, and a failed task is run again. If the probability of node failure increases, then although Hadoop has its own handling standards for distributed tasks, the running time of the whole cluster increases, and the higher a node's task failure rate, the larger the increase. Even though such a node keeps performing task computation and its resource utilization may look reasonable, it actually affects the operation of the whole job. There are many reasons for task failure; in order to affect the other nodes of the cluster as little as possible, the invention replaces nodes with a high failure rate, thereby avoiding failures caused by physical factors of the node itself;
meanwhile, on the principle that mobile computing is higher in economic benefit than mobile data, Hadoop can allocate computing tasks to nodes with data blocks needed by computing as much as possible, the nodes finish storage of needed data in a security mode before, if the data are deleted directly, certain influence is inevitably caused to the whole cluster, the cluster possibly enters the security mode again and needs to wait, therefore, before the nodes replacing the nodes for providing computing service serve the cluster formally, point-to-point data block copying needs to be carried out, data can be stored correctly, the realization of the strategy can better adjust the state of the whole cluster, and the conditions of weight increment of loads of other nodes caused by failure of a single node and even scale effect are avoided;
in order to implement the above strategy, the present invention first introduces a variable
Figure BDA0001710865990000091
Figure BDA0001710865990000092
Is one between [0,1]Percentage of (d), representing the proportion of a task that can successfully run on a single node; under the optimal condition, each node
Figure BDA0001710865990000093
All tasks can be successfully executed, the result can be smoothly output, and because the execution failure of the node tasks is inevitable, the node cannot be immediately replaced as long as a mistake is made, which is unreasonable and can also cause the increase of the system overhead; in the invention, the
Figure BDA0001710865990000094
With a pre-estimated value
Figure BDA0001710865990000095
If the value approaches to 0, the task success rate of the node is too small, and at the moment, the running efficiency of the cluster is reduced by the continuous use of the node, so that the node replacement strategy is started;
the process steps of the algorithm are described as follows:
step1 select an appropriate threshold value p0 as the minimum standard for measuring the task success rate
step2 calculate the task success rate p of a node over a certain time period
step3 compare p with p0; if p ≥ p0, move on to the next node and return to step2; if p < p0, proceed to the next step
step4 apply for a new node
step5 deploy the Hadoop application on the newly applied node and copy the data from the original node
step6 start the service on the new node and suspend the service of the original node
step7 terminate the original node, then move on to the next node and return to step2
Through the replacement of the nodes, the optimized task execution effect of the whole cluster is finally realized, and in the replacement process, the failure caused by the physical aspect of the virtual machine can be avoided to the greatest extent.
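The seven steps above can be sketched as follows; this is a minimal illustration, in which the helper callables (provision_node, deploy_and_copy, switch_service, terminate_node) and the success-rate bookkeeping are assumptions for the sketch, not interfaces defined in the patent:

```python
# Sketch of the node-replacement strategy (steps 1-7 above).
# All helper callables are hypothetical stand-ins for the cluster's
# provisioning, deployment and service-control operations.

def replace_low_success_nodes(nodes, p0, task_stats,
                              provision_node, deploy_and_copy,
                              switch_service, terminate_node):
    """For each node, compare its task success rate p with the threshold
    p0 (step 3); nodes below the threshold are replaced (steps 4-7)."""
    replaced = []
    for node in nodes:
        succeeded, total = task_stats[node]          # step 2: stats over a time period
        p = succeeded / total if total else 1.0      # success rate in [0, 1]
        if p >= p0:                                  # step 3: healthy, next node
            continue
        new_node = provision_node()                  # step 4: apply for a new node
        deploy_and_copy(src=node, dst=new_node)      # step 5: deploy Hadoop + copy data
        switch_service(old=node, new=new_node)       # step 6: start new, suspend old
        terminate_node(node)                         # step 7: terminate original node
        replaced.append((node, new_node))
    return replaced
```

Point-to-point copying before the switch (step 5) is what keeps the cluster out of a second safe-mode wait, as argued above.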
3. Cluster automatic deployment Plugin implementation
(3) Cluster Plugin interface implementation
The cluster exists as an independent plug-in, in the form of a separate directory under the sahara/sahara/plugins directory; its main directory structure is shown in figure 1,
wherein:
as shown in fig. 2, the v2_7_1 directory is specifically made content of the sandbox plugin version 2.7.1, the hadoop2 is general content, and the outermost version factor.
py is the core responsible for implementing all necessary interfaces, and the interfaces specifically needed to be implemented are shown in table 1:
TABLE 1
[Table 1 is reproduced as images in the original document; among the listed interfaces are methods taking (cluster, instances) arguments.]
Fig. 3 shows that, in terms of functional implementation, the cluster is mainly divided into two parts, configuration and startup; config_helper.py is the core module of the configuration, in which the paths of the relevant configuration files are set and the environment variables are configured. In addition, config_helper.py is also responsible for generating the corresponding configuration files, according to the user configuration, for the sandbox to be started; fig. 4 shows a small part of the work done by config_helper.py when configuring the spark environment variables;
versionhandler.py completes the configuration and startup of the sandbox according to the current plugin version, wherein the whole flow of sandbox startup is written in the start_cluster method;
run_script.py / starting_script.py implement the starting of the cluster, wherein the run_script.remote() method is used to connect remotely via ssh to a started virtual machine and execute the corresponding linux commands, thereby specifically controlling the starting of processes, nodes, clusters and the like.
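The remote startup described above (connecting to a started virtual machine over ssh and executing linux commands) can be sketched as below; the command builder, the default user and the key handling are illustrative assumptions, not the patent's run_script.remote() implementation:

```python
import subprocess

def build_ssh_command(host, remote_cmd, user="hadoop", key_path=None):
    """Compose the ssh invocation used to run a command on a started VM.
    The user name and key path are illustrative defaults."""
    cmd = ["ssh", "-o", "StrictHostKeyChecking=no"]
    if key_path:
        cmd += ["-i", key_path]
    cmd += [f"{user}@{host}", remote_cmd]
    return cmd

def run_remote(host, remote_cmd, **kw):
    """Execute the command on the remote VM and return its stdout."""
    result = subprocess.run(build_ssh_command(host, remote_cmd, **kw),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

In this style, starting an HDFS daemon on one node of the cluster would be e.g. `run_remote("10.0.0.5", "sbin/start-dfs.sh")` (address and command path are placeholders).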
(4) Cluster image packaging and creation tool implementation
OpenStack virtual machine image:
taking the CentOS operating system as an example, the process and principle of creating an OpenStack virtual machine image in this embodiment are as follows:
1) download the CentOS installation ISO image;
2) perform the installation with the virt-manager tool or the virt-install command; FIG. 6 is an example of an installation using the command line;
fig. 6 shows the creation of a virtual machine image in qcow2 format, 10G in size; if virt-manager is used instead, the installation can be carried out step by step through graphical prompts. Some additional configuration is needed during installation, such as changing the ethernet state, setting the host name, specifying the installation source and the storage device, partitioning the disk, setting the root password, and so on;
3) after step 2), log in to the newly installed virtual machine as the root user and perform the related configuration, such as installing the ACPI service, installing the cloud-init package, configuring support for partition resizing, disabling zeroconf routing, configuring console log output, and so on;
4) after the configuration is finished, shut down the virtual machine;
5) remove the MAC address information;
6) compress the image.
After the above steps are completed, an ordinary OpenStack virtual machine image has been created and can be uploaded to the OpenStack platform for use.
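Steps 2) and 6) above can be sketched as command builders; every flag value and default below (VM name, RAM, vCPUs, graphics backend) is an illustrative assumption rather than the exact invocation used in the embodiment:

```python
def virt_install_cmd(name, iso_path, disk_path, size_gb=10,
                     ram_mb=2048, vcpus=1):
    """Compose a virt-install invocation that creates a qcow2 disk of the
    given size and boots the installer from the CentOS ISO (step 2)."""
    return [
        "virt-install", "--name", name,
        "--ram", str(ram_mb), "--vcpus", str(vcpus),
        "--cdrom", iso_path,
        "--disk", f"path={disk_path},size={size_gb},format=qcow2",
        "--graphics", "vnc",
    ]

def compress_image_cmd(src, dst):
    """Compress the finished image with qemu-img convert -c (step 6)."""
    return ["qemu-img", "convert", "-c", "-O", "qcow2", src, dst]
```

Building the argument lists separately from executing them makes the image pipeline easy to log and dry-run before handing the commands to a shell.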
Cluster image:
for clusters of different types and versions, images of the corresponding type and version are required as support; when creating a cluster image, in addition to the basic OpenStack image creation steps, all related software packages in the image (such as Hadoop and Spark) must also be downloaded, installed and configured.
A Hadoop cluster automatic telescopic deployment experiment is carried out for the invention.
One group of deployments is selected as representative and tested several times; on this basis, index data before and after optimization are compared to analyze the deployment service.
TABLE 2 Cluster configuration

Cluster     vCPU     RAM      Disk     Node number
Cluster 1   4 core   10GB     5GB      8
Cluster 2   2 core   100GB    100GB    16
Cluster 3   1 core   5GB      80GB     8
Cluster 4   1 core   5GB      80GB     16
Cluster 5   1 core   5GB      80GB     24
Cluster 6   1 core   5GB      80GB     48
In the experiment, six different Hadoop clusters are deployed; as shown in fig. 6, six clusters of different scales and configurations are studied. There are 6 physical computing nodes in total, and Openstack can control the placement of the virtual machines to a certain extent according to resource usage, so that the virtual machines are evenly distributed across the nodes. In order to reduce uncertainty caused by other factors, in this embodiment the cluster deployment is performed directly on 6 identical machines; table 2 shows the specific node resource configurations of the clusters;
fig. 8 shows the web page of the cluster creation service, on which the relevant requirements for cluster creation are submitted, including selection of the Hadoop version, the node configurations shown in table 1, selection of images, and so on; after all the requirements have been specified, the cluster can enter the deployment stage. In this experiment, the results before and after optimization are compared in terms of deployment speed and success rate;
in the two groups of comparison tests, the six Hadoop clusters are each deployed several times; runs that were abnormal or failed are excluded, and the average value is taken as the deployment time of the cluster. Fig. 9 shows that deployment after optimization is considerably faster than before optimization. When the number of cluster nodes to be deployed is small, the optimization effect is not significant, but as the number of cluster nodes grows, the deployment time before optimization increases markedly, whereas for the optimized deployment service the deployment time grows only moderately with cluster scale: the 6 cluster deployment times are relatively close, remaining stable within the range of 10 to 20 minutes. The results show that the optimized cluster deployment time is clearly improved and the required time is more stable. As shown in fig. 10, compared with the deployment service before optimization, the optimized deployment service still exhibits its optimization effect even for smaller-scale cluster deployments. The results also show that as the cluster scale increases, the success rate of cluster deployment decreases to a certain extent due to various uncertainties; before optimization this decrease is obvious and the stability of the deployment service is poor, while after optimization, although the success rate also decreases, the decrease is small and tends to level off, remaining at a high level. The experimental results prove that, in terms of success rate, the effect after optimization is clearly improved and the optimization of the deployment service is successful.
The Hadoop cluster automatic telescopic deployment strategy can optimize automatic deployment of the Hadoop cluster, so that the deployment service is more stable and efficient.

Claims (2)

1. A Hadoop cluster automatic telescopic deployment method based on an OpenStack cloud, characterized by comprising: integrating the Sahara module in the OpenStack cloud with third-party management tools through a Plugin mechanism; with reference to the requirements and in combination with the automatic telescopic deployment method, allocating a suitable number of virtual machines for the required Hadoop cluster; and installing and configuring the Hadoop cluster in the pre-allocated virtual machines.
2. The method of claim 1, wherein the method performs automated scaling deployment of Hadoop clusters in a cloud environment based on prediction and real-time conditions, comprising:
(1) utilization-based automatic scaling strategy
introduce e_C, e_R and e_D, representing respectively the user's expected values for the three utilization rates of the cluster CPU, RAM and hard disk, and let l_C, l_R and l_D denote the corresponding actual utilization rates; according to the different degrees of attention users pay to different resources, λ_C, λ_R and λ_D are introduced into the resource model as the respective weights of the three items; the following are thus obtained:
η_X = |l_X − e_X|, for X ∈ {C, R, D}  (definition 1)
φ = λ_C·η_C + λ_R·η_R + λ_D·η_D  (definition 2)
wherein definition 1 represents, for each index, the difference between the actual utilization rate and the expected utilization rate; definition 2 represents the overall difference of the three items from expectation, with values in the range [0,1): the closer φ is to 0, the closer the cluster is to the expected values and the better its utilization meets expectations;
(2) automatic telescopic rapid deployment strategy based on task success rate
a variable p is introduced into the strategy: p is a percentage between [0,1], representing the proportion of tasks that can run successfully on a single node; when p = 1 for each node, all tasks can be executed successfully and the results output smoothly;
based on the fact that task execution failures on a node are unavoidable, a pre-estimated threshold value p0 is set for p; if p approaches 0, the task success rate of the node is too low, continued use of the node reduces the operating efficiency of the cluster, and the node replacement strategy is enabled.
CN201810682329.9A 2018-06-27 2018-06-27 Method for carrying out Hadoop cluster automatic telescopic deployment and Plugin deployment based on OpenStack cloud Active CN110647379B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810682329.9A CN110647379B (en) 2018-06-27 2018-06-27 Method for carrying out Hadoop cluster automatic telescopic deployment and Plugin deployment based on OpenStack cloud

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810682329.9A CN110647379B (en) 2018-06-27 2018-06-27 Method for carrying out Hadoop cluster automatic telescopic deployment and Plugin deployment based on OpenStack cloud

Publications (2)

Publication Number Publication Date
CN110647379A true CN110647379A (en) 2020-01-03
CN110647379B CN110647379B (en) 2023-10-17

Family

ID=68988861

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810682329.9A Active CN110647379B (en) 2018-06-27 2018-06-27 Method for carrying out Hadoop cluster automatic telescopic deployment and Plugin deployment based on OpenStack cloud

Country Status (1)

Country Link
CN (1) CN110647379B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104320460A (en) * 2014-10-24 2015-01-28 西安未来国际信息股份有限公司 Big data processing method
US20150063166A1 (en) * 2013-08-27 2015-03-05 Futurewei Technologies, Inc. System and Method for Mobile Network Function Virtualization
CN104734892A (en) * 2015-04-02 2015-06-24 江苏物联网研究发展中心 Automatic deployment system for big data processing system Hadoop on cloud platform OpenStack
CN106982137A (en) * 2017-03-08 2017-07-25 中国人民解放军国防科学技术大学 Hadoop cluster Automation arranging methods based on kylin cloud computing platform

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150063166A1 (en) * 2013-08-27 2015-03-05 Futurewei Technologies, Inc. System and Method for Mobile Network Function Virtualization
CN104320460A (en) * 2014-10-24 2015-01-28 西安未来国际信息股份有限公司 Big data processing method
CN104734892A (en) * 2015-04-02 2015-06-24 江苏物联网研究发展中心 Automatic deployment system for big data processing system Hadoop on cloud platform OpenStack
CN106982137A (en) * 2017-03-08 2017-07-25 中国人民解放军国防科学技术大学 Hadoop cluster Automation arranging methods based on kylin cloud computing platform

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANTONIO CORRADI ET AL.: "Elastic provisioning of virtual Hadoop clusters in OpenStack-based Clouds", 《IEEE ICC 2015 - WORKSHOP ON CLOUD COMPUTING SYSTEMS, NETWORKS, AND APPLICATIONS (CCSNA)》 *
张新朝: "基于云平台虚拟集群的设计与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
王炳旭: "基于IaaS云平台的Hadoop资源调度策略研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Also Published As

Publication number Publication date
CN110647379B (en) 2023-10-17

Similar Documents

Publication Publication Date Title
US11398948B2 (en) Generation and deployment of inherited network topology models
US11570148B2 (en) Method and apparatus for deploying security access control policy
CN107566165B (en) Method and system for discovering and deploying available resources of power cloud data center
CN110147411A (en) Method of data synchronization, device, computer equipment and storage medium
US20110107299A1 (en) Systems and methods for integrated package development and machine configuration management
WO2022105440A1 (en) Hybrid quantum-classical cloud platform and task execution method
US9354920B2 (en) Managing virtual appliances supporting multiple profiles
CN105553741A (en) Automatic deployment method for application system based on cloud computing
US11886905B2 (en) Host upgrade method and device
JP6783850B2 (en) Methods and systems for limiting data traffic
CN103064717B (en) A kind of apparatus and method of parallel installation of software for cluster system
CN111522622B (en) K8S quick starting method based on cloud platform
US11509531B2 (en) Configuration sharing and validation for nodes in a grid network
CN112099917B (en) Regulation and control system containerized application operation management method, system, equipment and medium
CN110008005B (en) Cloud platform-based power grid communication resource virtual machine migration system and method
CN103200036A (en) Automated configuration method of electrical power system cloud computing platform
CN105743677A (en) Resource configuration method and apparatus
CN112637265B (en) Equipment management method, device and storage medium
CN106406980B (en) A kind of dispositions method and device of virtual machine
CN111880738A (en) Method for automatically creating and mounting LVM (logical volume manager) volume in K8s environment
CN110297713A (en) Configuration management system and method of cloud host
CN112527450B (en) Super-fusion self-adaptive method, terminal and system based on different resources
CN110502242A (en) Code automatic generation method, device, computer equipment and storage medium
CN107493200B (en) Optical disc image file creating method, virtual machine deploying method and device
CN110647379A (en) Hadoop cluster automatic telescopic deployment and Plugin deployment method based on OpenStack cloud

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant