CN113312148B

CN113312148B - Big data service deployment method, device, equipment and medium

Info

Publication number: CN113312148B
Application number: CN202110662971.2A
Authority: CN
Inventors: 何伟; 易乐天; 陆平; 高律
Original assignee: Sangfor Technologies Co Ltd
Current assignee: Sangfor Technologies Co Ltd
Priority date: 2021-06-15
Filing date: 2021-06-15
Publication date: 2023-03-21
Anticipated expiration: 2041-06-15
Also published as: CN113312148A

Abstract

The application discloses a big data service deployment method, apparatus, device and medium. The method comprises the following steps: acquiring deployment information, wherein the deployment information comprises service information; configuring a cluster environment of a cluster, wherein the cluster environment comprises any one or any combination of log backup and periodic clearing, database backup and periodic clearing, time synchronization among the nodes in the cluster, password-free mutual trust among the nodes, kernel parameters of each node, and a firewall whitelist of the cluster; arranging, by Ambari, the big data service to be deployed according to the service information and the node information of the cluster to generate an Ambari blueprint; and registering the Ambari blueprint and deploying the big data service to be deployed on a first node and a second node in the cluster. In this way, big data services can be deployed automatically, the deployment is simple, efficient and not prone to error, and cluster operation and maintenance are made more convenient.

Description

Big data service deployment method, device, equipment and medium

Technical Field

The present application relates to the field of big data technologies, and in particular, to a big data service deployment method, apparatus, device, and medium.

Background

A big data service covers the data life-cycle activities of acquiring, transmitting, storing, processing (including computation, analysis, visualization and the like), exchanging and destroying massive, heterogeneous and rapidly changing data, and supports an organization or an individual by means of an underlying scalable big data platform and various upper-layer big data applications. Like a general data service, a big data service needs to be deployed before it can be used. At present, big data services are mainly deployed manually. Manual deployment depends on the capability of the deployment personnel, who are required to have the relevant technical background, and the whole deployment process is tedious, time-consuming, labor-intensive and error-prone.

Therefore, how to automatically deploy big data services is an important problem to be solved by those skilled in the art.

Disclosure of Invention

In view of this, an object of the present application is to provide a method, an apparatus, a device, and a medium for deploying big data services, which can perform automatic deployment of big data services, are simple and efficient in deployment, are not prone to error, and can perform automatic configuration of a cluster environment in a process of deploying big data services, thereby improving stability and reliability of a cluster, reducing time and workload required for configuring a cluster environment individually, and providing convenience for operation and maintenance of the cluster. The specific scheme is as follows:

in a first aspect, the present application discloses a big data service deployment method, applied to a big data service deployment tool developed in advance based on Ambari, where the big data service deployment tool is installed on a first node in a cluster for deploying big data services, and the method includes:

acquiring deployment information, wherein the deployment information comprises service information, and the service information comprises a service list of a big data service to be deployed and service operation configuration information;

configuring a cluster environment of the cluster, wherein the cluster environment comprises any one or any combination of log backup and periodic clearing, database backup and periodic clearing, time synchronization among the nodes in the cluster, password-free mutual trust among the nodes, kernel parameters of each node, and a firewall whitelist of the cluster;

arranging, by the Ambari, the big data service to be deployed according to the service information and the node information of the cluster to generate an Ambari blueprint;

registering the Ambari blueprint and deploying the big data service to be deployed on the first node and a second node in the cluster so as to run the big data service to be deployed in the cluster environment.

Optionally, before configuring the cluster environment of the cluster, the method further includes:

initializing and configuring nodes in the cluster by using node initialization information in the deployment information, wherein the node initialization information comprises an IP address, a domain name, a user name and an SSH login password of each node in the cluster;

clearing historical service data on the first node and the second node;

and installing a JAVA running environment and a Python2 running environment on the first node and the second node.

Optionally, after the registering the Ambari blueprint and deploying the big data service on the first node and a second node in the cluster, further comprising:

if the deployment of the big data service to be deployed fails, judging whether a retry instruction is acquired;

and if a retry instruction is acquired, re-executing the method starting from the step of clearing the historical service data on the first node and the second node.

Optionally, the arranging, by the Ambari, the big data service to be deployed according to the service information and the node information of the cluster to generate an Ambari blueprint includes:

installing an Ambari Server of the Ambari on the first node;

installing agents corresponding to the Ambari Server on the first node and the second node;

and arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari Server to generate an Ambari blueprint.

Optionally, the arranging, by the Ambari, the big data service to be deployed according to the service information and the node information of the cluster to generate an Ambari blueprint includes:

arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari to obtain an arrangement result, wherein the arrangement result represents a mapping relation between the component of the big data service to be deployed and each node in the cluster;

organizing the arrangement result into a preset Ambari blueprint format to obtain the Ambari blueprint.

Optionally, the process of arranging the big data service to be deployed according to the service information and the node information of the cluster by Ambari to obtain an arrangement result further includes:

acquiring arrangement information input through a user interface of the big data service deployment tool;

and adding the arrangement information into the arrangement result.

Optionally, the first node is a physical machine or a virtual machine, and the second node is a physical machine or a virtual machine.

In a second aspect, the present application discloses a big data service deployment apparatus, which is applied to a big data service deployment tool developed in advance based on Ambari, where the big data service deployment tool is installed on a first node in a cluster for deploying a big data service, and includes:

the information acquisition module is used for acquiring deployment information, wherein the deployment information comprises service information, and the service information comprises a service list of a big data service to be deployed and service operation configuration information;

a cluster environment configuration module, configured to configure a cluster environment of the cluster, where the cluster environment includes any one or any combination of log backup and periodic clearing, database backup and periodic clearing, time synchronization among the nodes in the cluster, password-free mutual trust among the nodes, kernel parameters of each node, and a firewall whitelist of the cluster;

the service arranging module is used for arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari to generate an Ambari blueprint;

and the service deployment module is used for registering the Ambari blueprint and deploying the big data service to be deployed on the first node and a second node in the cluster so as to operate the big data service to be deployed in the cluster environment.

In a third aspect, the present application discloses an electronic device, comprising:

a memory and a processor;

wherein the memory is used for storing a computer program;

the processor is configured to execute the computer program to implement the big data service deployment method disclosed above.

In a fourth aspect, the present application discloses a computer readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the big data service deployment method disclosed in the foregoing.

In the present application, the big data service deployment tool is installed on a first node in a cluster for deploying big data services. Deployment information is obtained first, wherein the deployment information comprises service information, and the service information comprises a service list of the big data service to be deployed and service operation configuration information. A cluster environment of the cluster is then configured, wherein the cluster environment comprises any one or any combination of log backup and periodic clearing, database backup and periodic clearing, time synchronization among the nodes in the cluster, password-free mutual trust among the nodes, kernel parameters of each node, and a firewall whitelist of the cluster. Ambari then arranges the big data service to be deployed according to the service information and the node information of the cluster to generate an Ambari blueprint. The Ambari blueprint is then registered, and the big data service to be deployed is deployed on the first node and a second node in the cluster, so as to run the big data service to be deployed in the cluster environment. Therefore, big data services can be deployed automatically by directly using the big data service deployment tool developed in advance based on Ambari: the tool automatically arranges the big data services to generate an Ambari blueprint, registers the Ambari blueprint, and deploys the big data services on each node. The whole deployment process is simple and efficient, is not prone to error, and saves the labor cost of manual deployment. In addition, the big data service deployment tool automatically configures the cluster environment during big data service deployment, which improves the stability and reliability of the cluster, reduces the time and workload required to configure the cluster environment separately, and provides great convenience for cluster operation and maintenance.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

FIG. 1 is a flow chart of a big data service deployment method disclosed herein;

FIG. 2 is a flow chart of a particular big data service deployment method disclosed herein;

FIG. 3 is a flow chart of a particular big data service deployment method disclosed herein;

FIG. 4 is a schematic structural diagram of a big data service deployment apparatus disclosed in the present application;

FIG. 5 is a schematic structural diagram of an electronic device disclosed in the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

At present, big data service deployment is mainly manual deployment, but the manual deployment depends on the deployment capability of deployment implementation personnel, the deployment implementation personnel are required to have related technical bases, and the whole deployment process is tedious, time-consuming, labor-consuming and easy to make mistakes. In view of this, the present application provides a big data service deployment method, which can perform automatic deployment of big data services, is simple and efficient in deployment, is not prone to error, and can perform automatic configuration of a cluster environment in a process of deploying the big data services, thereby improving stability and reliability of a cluster, reducing time and workload required for configuring the cluster environment independently, and providing convenience for cluster operation and maintenance.

Referring to fig. 1, an embodiment of the present application discloses a big data service deployment method, which is applied to a big data service deployment tool developed in advance based on Ambari, where the big data service deployment tool is installed on a first node in a cluster for deploying a big data service, and the method includes:

step S11: the method comprises the steps of obtaining deployment information, wherein the deployment information comprises service information, and the service information comprises a service list of the big data service to be deployed and service operation configuration information.

In a specific implementation process, a big data service deployment tool needs to be developed in advance based on Ambari, where Ambari is a big data deployment, operation, maintenance and monitoring tool open-sourced by Hortonworks. The big data service deployment tool may be a Web (World Wide Web) server, and the big data service deployment tool provides a Web UI (Web User Interface) for the user. After the big data service deployment tool is developed, when it is to be used to deploy big data services, it needs to be installed on a first node in a cluster for deploying big data services, where the first node may be any node in the cluster.

When the big data service deployment tool is used for deploying the big data service, deployment information needs to be acquired first, wherein the deployment information comprises service information and node initialization information, the service information comprises a service list of the big data service to be deployed and service operation configuration information, the service list lists information such as a service name and a version of the big data service to be deployed, and the service operation configuration information comprises a storage address of data generated in the operation process of the big data service to be deployed. The node initialization information includes an IP (Internet Protocol) address, a domain name, a user name, and an SSH (Secure Shell) login password of each node.
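As an illustration only, the deployment information described above could be modeled as a small set of records. The class and field names below (`ServiceEntry`, `NodeInit`, `data_dir`, `validate`, and so on) are hypothetical and are not taken from the application; the sketch merely mirrors the items the text lists.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServiceEntry:
    name: str      # service name from the service list, e.g. "HDFS"
    version: str   # version of that service


@dataclass
class ServiceInfo:
    service_list: List[ServiceEntry]
    data_dir: str  # storage address of data generated while the services run


@dataclass
class NodeInit:
    ip: str
    domain_name: str
    user_name: str
    ssh_password: str


@dataclass
class DeploymentInfo:
    service_info: ServiceInfo
    node_init: List[NodeInit] = field(default_factory=list)


def validate(info: DeploymentInfo) -> List[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if not info.service_info.service_list:
        problems.append("service list is empty")
    ips = [node.ip for node in info.node_init]
    if len(ips) != len(set(ips)):
        problems.append("duplicate node IP address")
    return problems
```

A Web UI form could populate such a structure and run a check like `validate` before any node is touched.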

Each node in the cluster may be a homogeneous node, and each node needs to install an operating system and configure a network as required, including IP address configuration.

In a specific implementation process, a user can log in through the Web UI on the big data service deployment tool, and input the deployment information after logging in, and accordingly, the big data service deployment tool can acquire the deployment information through the Web UI.

Step S12: configuring a cluster environment of the cluster, wherein the cluster environment comprises any one or any combination of log backup and periodic clearing, database backup and periodic clearing, time synchronization among the nodes in the cluster, password-free mutual trust among the nodes, kernel parameters of each node, and a firewall whitelist of the cluster.

In an actual implementation process, a cluster environment of the cluster needs to be configured. The cluster environment includes any one or any combination of the following six items: log backup and periodic clearing, database backup and periodic clearing, time synchronization among the nodes in the cluster, password-free mutual trust among the nodes, kernel parameters of each node, and a firewall whitelist of the cluster. These items are derived from practical experience with big data service clusters.

Log backup and periodic clearing means setting, for the cluster, the interval at which logs are backed up, the interval at which logs are cleared, the time range of the logs to be cleared, and the backup mode, where the backup mode is either incremental backup or full backup. For example, logs may be backed up every 3 days and cleared every 7 days, the logs to be cleared being those more than 1 month older than the current time, with incremental backup as the backup mode.

Database backup and periodic clearing means setting, for the cluster, the interval at which tables and other data in the database are backed up, the interval at which they are cleared, the time range of the database data to be cleared, and the backup mode, where the backup mode is either incremental backup or full backup. For example, the database may be backed up every 3 days and its data cleared every 7 days, the data to be cleared being that more than 1 month older than the current time, with incremental backup as the backup mode.
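The backup and clearing policy in the two examples above can be sketched as two small checks: whether a periodic task is due, and which items fall outside the retention range. This is a minimal sketch, not the tool's implementation; the function names are hypothetical, and "1 month" is approximated as 30 days, which is an assumption.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def is_due(last_run: datetime, now: datetime, interval_days: int) -> bool:
    """True when at least interval_days have passed since the last run."""
    return now - last_run >= timedelta(days=interval_days)


def items_to_clear(mtimes: Dict[str, datetime], now: datetime,
                   keep_days: int = 30) -> List[str]:
    """Names of logs (or database partitions) older than the retention range.

    keep_days=30 stands in for the "1 month" of the example (assumption).
    """
    cutoff = now - timedelta(days=keep_days)
    return sorted(name for name, mtime in mtimes.items() if mtime < cutoff)
```

With a 3-day backup interval and a 7-day clearing interval, a scheduler would call `is_due` with `interval_days=3` and `interval_days=7` respectively, and pass the result of `items_to_clear` to whatever performs the deletion.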

Time synchronization among the nodes in the cluster is needed because inconsistent time across the nodes affects the big data service. A clock source therefore needs to be selected, and all nodes need to be synchronized with the clock source so that the time of every node is consistent. The time of the first node may generally be used as the clock source.
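One way to picture the consistency requirement is a check that compares each node's clock reading with that of the chosen clock source. The sketch below is illustrative only: the function name, the use of epoch seconds, and the 1-second tolerance are all assumptions, and the actual synchronization mechanism is not specified in the text.

```python
from typing import Dict, List


def nodes_out_of_sync(clock_source: str, node_times: Dict[str, float],
                      tolerance_s: float = 1.0) -> List[str]:
    """Return nodes whose clock differs from the clock source by more than
    tolerance_s seconds. node_times maps node name to epoch seconds."""
    reference = node_times[clock_source]
    return sorted(
        node for node, t in node_times.items()
        if abs(t - reference) > tolerance_s
    )
```

Nodes returned by such a check would be the ones that still need to be synchronized with the clock source.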

Password-free mutual trust among the nodes means that when one node in the cluster needs to call another node, it can log in to the other node without using that node's user name and password: the two nodes communicate directly using a public/private key pair, which avoids the user name and password login process.
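With OpenSSH, key-based login of this kind is commonly set up by generating a key pair and copying the public key to each peer. The sketch below only builds the command lines and does not run them; the helper name, the key path, and the choice of `ssh-keygen` and `ssh-copy-id` are assumptions, since the text does not say which tooling is used.

```python
from typing import List


def passwordless_ssh_commands(user: str, peers: List[str],
                              key_path: str = "/root/.ssh/id_rsa") -> List[List[str]]:
    """Build (but do not execute) commands that create a key pair with an
    empty passphrase and install the public key on every peer node."""
    commands = [["ssh-keygen", "-t", "rsa", "-N", "", "-f", key_path]]
    for host in peers:
        commands.append(["ssh-copy-id", "-i", key_path + ".pub", f"{user}@{host}"])
    return commands
```

Running the same procedure on every node, with all other nodes as peers, would give mutual trust in both directions. `ssh-copy-id` still prompts for the peer's password once, which is where the SSH login password from the node initialization information would come in.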

The kernel parameters of each node, that is, the kernel parameters that each node needs to configure for operation, for example, the maximum number of processes that can be pulled up, the maximum number of files that can be opened, and the like.
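On Linux, limits such as these are typically expressed as `sysctl` settings and as entries in `limits.conf`. The following sketch renders both kinds of fragment as text. The helper names and the example values are assumptions; the text does not name specific parameters beyond the maximum number of processes and open files.

```python
from typing import Dict


def render_sysctl(params: Dict[str, int]) -> str:
    """Render kernel parameters as 'key = value' lines in sysctl.conf style."""
    return "\n".join(f"{key} = {value}" for key, value in sorted(params.items())) + "\n"


def render_limits(max_procs: int, max_files: int, domain: str = "*") -> str:
    """Render per-user limits as '<domain> <type> <item> <value>' lines
    in limits.conf style, setting soft and hard limits to the same value."""
    lines = []
    for item, value in (("nproc", max_procs), ("nofile", max_files)):
        for kind in ("soft", "hard"):
            lines.append(f"{domain} {kind} {item} {value}")
    return "\n".join(lines) + "\n"
```

For example, `render_sysctl({"fs.file-max": 1000000, "kernel.pid_max": 65536})` yields two lines that could be written to a file under `/etc/sysctl.d/` on each node.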

The firewall white list of the cluster, that is, when the cluster is provided with a firewall, each node in the cluster needs to be added to the firewall white list of the cluster to ensure that communication between each node in the cluster is not intercepted by the firewall, so that the nodes in the cluster can communicate with each other.
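Assuming the nodes run firewalld (an assumption; the text does not name a firewall), adding every node to a trusted zone is one way to realize such a whitelist. The sketch below validates each address and builds the `firewall-cmd` invocations without executing them; the helper name is hypothetical.

```python
import ipaddress
from typing import List


def whitelist_commands(node_ips: List[str]) -> List[List[str]]:
    """Build firewall-cmd invocations that add each node IP to the trusted
    zone, followed by a reload. Raises ValueError on a malformed address."""
    commands = []
    for ip in node_ips:
        ipaddress.ip_address(ip)  # validation only; raises ValueError if invalid
        commands.append(
            ["firewall-cmd", "--permanent", "--zone=trusted", f"--add-source={ip}"]
        )
    commands.append(["firewall-cmd", "--reload"])
    return commands
```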

In the prior art, only the big data service is deployed and the cluster environment is not configured, which makes operation and maintenance of the cluster difficult. By contrast, configuring the cluster environment of the cluster improves cluster stability and reliability. Moreover, because the big data service deployment tool configures the cluster environment automatically in the process of deploying the big data service, the time and workload required for separate configuration are reduced, and the configured cluster environment provides great convenience for cluster operation and maintenance.

Step S13: arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari to generate the Ambari blueprint.

After the deployment information is obtained, the big data service deployment tool can know which big data services need to be deployed, so that the Ambari is required to arrange the big data services to be deployed according to the service information and the node information of the cluster, and the Ambari blueprint is generated.

Arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari to generate an Ambari blueprint, which comprises the following steps: arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari to obtain an arrangement result, wherein the arrangement result represents a mapping relation between the component of the big data service to be deployed and each node in the cluster, namely the arrangement result represents a corresponding relation between the component of the big data service to be deployed and the node where the component is deployed; organizing the arrangement result into a preset Ambari blueprint format to obtain the Ambari blueprint.

That is, Ambari arranges the big data service to be deployed according to the service information and the node information of the cluster to obtain an arrangement result representing the mapping relationship between the components of the big data service to be deployed and the nodes, thereby determining which components need to be deployed on which nodes. The arrangement result is then organized into the preset Ambari blueprint format to obtain the Ambari blueprint, so that Ambari can recognize the arrangement result.
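The step of organizing an arrangement result into blueprint form can be sketched as follows. The field names (`Blueprints`, `host_groups`, `components`, `cardinality`, `fqdn`) follow the publicly documented Ambari blueprint and cluster-creation-template layout as generally understood and should be checked against the Ambari version in use. The function name, the one-host-group-per-node choice, and the input shape (node FQDN mapped to component names) are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def to_blueprint(arrangement: Dict[str, List[str]], blueprint_name: str,
                 stack_name: str, stack_version: str) -> Tuple[dict, dict]:
    """Turn {node FQDN: [component names]} into a blueprint document and a
    matching cluster creation template, using one host group per node."""
    host_groups = []
    template_groups = []
    for index, (fqdn, components) in enumerate(sorted(arrangement.items()), start=1):
        group = f"host_group_{index}"
        host_groups.append({
            "name": group,
            "cardinality": "1",
            "components": [{"name": component} for component in components],
        })
        template_groups.append({"name": group, "hosts": [{"fqdn": fqdn}]})
    blueprint = {
        "Blueprints": {"stack_name": stack_name, "stack_version": stack_version},
        "host_groups": host_groups,
        "configurations": [],  # service operation configuration would go here
    }
    cluster_template = {"blueprint": blueprint_name, "host_groups": template_groups}
    return blueprint, cluster_template
```

The blueprint describes what each host group runs, while the cluster creation template binds host groups to concrete nodes; both would be serialized to JSON before being submitted.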

When Ambari arranges the big data service to be deployed according to the service information and the node information of the cluster to obtain an arrangement result, the arrangement may take into account the affinity among the big data services to be deployed, the service operation configuration information, the resource size of each node, the load balance of each node, and the like.
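The text does not give the arrangement algorithm. As one possible illustration of placing components by node resources and load balance, the sketch below uses a simple greedy rule: place the largest component first, always on the node with the most free memory. The rule, the function name, and the use of memory in MB as the only resource are assumptions; affinity is not modeled here.

```python
from typing import Dict


def arrange(component_memory_mb: Dict[str, int],
            node_memory_mb: Dict[str, int]) -> Dict[str, str]:
    """Greedy placement: largest component first, onto the node with the most
    free memory (ties broken by node name). Returns {component: node}."""
    free = dict(node_memory_mb)
    placement = {}
    ordered = sorted(component_memory_mb.items(), key=lambda kv: (-kv[1], kv[0]))
    for component, need in ordered:
        node = max(sorted(free), key=lambda name: free[name])
        if free[node] < need:
            raise ValueError(f"no node has {need} MB free for {component}")
        placement[component] = node
        free[node] -= need
    return placement
```

The resulting mapping has the shape of the arrangement result described above, that is, a correspondence between each component and the node on which it is to be deployed.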

Specifically, an Ambari Server of Ambari needs to be installed on the first node; installing agents corresponding to the Ambari Server on the first node and the second node; and arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari Server to generate an Ambari blueprint.

That is, the big data service deployment tool needs to call Ambari to arrange the big data service to be deployed. The Ambari Server of Ambari therefore needs to be installed on the first node, and an Agent corresponding to the Ambari Server needs to be installed on the first node and the second node. The Ambari Server can then arrange the big data service to be deployed according to the service information and the node information of the cluster to generate the Ambari blueprint, and the Agents deploy the components of the big data service to be deployed on the respective nodes.

In the actual application process, arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari to obtain an arrangement result, the method further includes: acquiring arrangement information input through a user interface of the big data service deployment tool; and adding the arrangement information into the arrangement result.

That is, in the process of arranging the big data service to be deployed, in addition to automatic arrangement, user-defined arrangement may be performed by a user, that is, the user may input arrangement information through a user interface (that is, the web UI) of the big data service deployment tool, and the big data service deployment tool takes the arrangement information as a part of an arrangement result after acquiring the arrangement information through its own user interface.
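Combining the two sources of arrangement can be sketched as a merge of two mappings. The text says only that the user's arrangement information becomes part of the arrangement result; letting a user entry take precedence over the automatic placement of the same component is an assumption made here, as are the function name and the {component: node} shape.

```python
from typing import Dict, Iterable


def merge_arrangement(automatic: Dict[str, str], user_defined: Dict[str, str],
                      known_nodes: Iterable[str]) -> Dict[str, str]:
    """Merge user-defined placements into the automatic result.

    Both inputs map component name to node name. User entries win on
    conflict (assumption). Placements on unknown nodes are rejected.
    """
    unknown = sorted(set(user_defined.values()) - set(known_nodes))
    if unknown:
        raise ValueError(f"unknown nodes in user arrangement: {unknown}")
    merged = dict(automatic)
    merged.update(user_defined)
    return merged
```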

Therefore, the automation arrangement and the user-defined arrangement of the big data service deployment tool are combined, the flexibility in the big data service arrangement process can be improved, and the user experience of the big data service deployment tool is improved.

Step S14: registering the Ambari blueprint and deploying the big data service to be deployed on the first node and a second node in the cluster so as to run the big data service to be deployed in the cluster environment.

After obtaining the Ambari blueprint, registering the Ambari blueprint and deploying the big data service to be deployed on the first node and a second node in the cluster, so as to operate the big data service to be deployed in the cluster environment.

Specifically, the REST (Representational State Transfer) API (Application Programming Interface) of Ambari is called to register the Ambari blueprint and to deploy the big data service to be deployed on the first node and the second node in the cluster. After the big data service to be deployed has been deployed on the first node and the second node, it can run in the cluster environment.
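The two REST calls can be sketched with the Python standard library. The endpoint paths (`/api/v1/blueprints/<name>` to register a blueprint and `/api/v1/clusters/<name>` to create a cluster from it) and the `X-Requested-By` header follow Ambari's public REST conventions as generally documented and should be verified against the Ambari version in use. The helper names, host, and credentials are hypothetical, and the sketch only builds the requests without sending them.

```python
import base64
import json
from typing import List
from urllib.request import Request


def ambari_post(base_url: str, path: str, body: dict,
                user: str, password: str) -> Request:
    """Build a POST request to the Ambari REST API with basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        url=f"{base_url}/api/v1/{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "X-Requested-By": "ambari",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def deployment_requests(base_url: str, blueprint_name: str, blueprint: dict,
                        cluster_name: str, cluster_template: dict,
                        user: str, password: str) -> List[Request]:
    """First request registers the blueprint; second creates the cluster from it."""
    return [
        ambari_post(base_url, f"blueprints/{blueprint_name}", blueprint, user, password),
        ambari_post(base_url, f"clusters/{cluster_name}", cluster_template, user, password),
    ]
```

Each request could then be sent in order with `urllib.request.urlopen`, with the second sent only after the first succeeds.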

In practical applications, the first node may be a physical machine or a virtual machine, and similarly, the second node may be a physical machine or a virtual machine. That is, the big data service deployment tool may be installed on a virtual machine to deploy the big data service to be deployed on the cloud. The big data service deployment tool can also be directly installed on a physical machine so as to deploy the big data service to be deployed in the physical machine cluster. Therefore, the big data service deployment in the application can be a big data service deployment on a cloud or a big data service deployment of a physical machine cluster, the application range of the big data service deployment method in the application is enlarged, and the application scene of the big data service deployment method in the application is wider.

In the present application, the big data service deployment tool is installed on a first node in a cluster for deploying big data services. Deployment information is obtained first, wherein the deployment information comprises service information, and the service information comprises a service list of the big data service to be deployed and service operation configuration information. A cluster environment of the cluster is then configured, wherein the cluster environment comprises any one or any combination of log backup and periodic clearing, database backup and periodic clearing, time synchronization among the nodes in the cluster, password-free mutual trust among the nodes, kernel parameters of each node, and a firewall whitelist of the cluster. Ambari then arranges the big data service to be deployed according to the service information and the node information of the cluster to generate an Ambari blueprint. The Ambari blueprint is then registered, and the big data service to be deployed is deployed on the first node and a second node in the cluster, so as to run the big data service to be deployed in the cluster environment. Therefore, big data services can be deployed automatically by directly using the big data service deployment tool developed in advance based on Ambari: the tool automatically arranges the big data services to generate an Ambari blueprint, registers the Ambari blueprint, and deploys the big data services on each node. The whole deployment process is simple and efficient, is not prone to error, and saves the labor cost of manual deployment. In addition, the big data service deployment tool automatically configures the cluster environment during big data service deployment, which improves the stability and reliability of the cluster, reduces the time and workload required to configure the cluster environment separately, and provides great convenience for cluster operation and maintenance.

Referring to fig. 2, an embodiment of the present application discloses a specific big data service deployment method, which is applied to a big data service deployment tool developed in advance based on Ambari, where the big data service deployment tool is installed on a first node in a cluster for deploying big data services, and the method includes:

step S21: the method comprises the steps of obtaining deployment information, wherein the deployment information comprises service information, and the service information comprises a service list of the big data service to be deployed and service operation configuration information.

Step S22: and performing initialization configuration on the nodes in the cluster by using node initialization information in the deployment information, wherein the node initialization information comprises an IP address, a domain name, a user name and an SSH login password of each node in the cluster.

As described in the foregoing embodiment, the deployment information includes node initialization information, and the node initialization information includes an IP address, a domain name, a user name, and an SSH login password of each node in the cluster. Therefore, after the deployment information is obtained, the nodes in the cluster also need to be initialized and configured by using the node initialization information.

Each node in the cluster has a corresponding IP address, so that the IP address is recorded in the node initialization information, then a domain name, a user name and an SSH login password corresponding to each IP address are set, and after the big data service deployment tool acquires the node initialization information, the domain name, the user name and the SSH login password of the node corresponding to each IP address can be configured according to the IP address so as to configure the cluster.
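Part of such initialization is making each node's domain name resolvable from its IP address. As a minimal sketch, the helper below renders host-file style lines from (IP address, domain name) pairs; the helper name and the choice of a hosts file rather than DNS are assumptions not stated in the text.

```python
import ipaddress
from typing import List, Tuple


def hosts_entries(nodes: List[Tuple[str, str]]) -> str:
    """Render 'IP<TAB>domain' lines, one per node, in /etc/hosts style.
    Raises ValueError if an IP address is malformed."""
    lines = []
    for ip, domain_name in nodes:
        ipaddress.ip_address(ip)  # validation only
        lines.append(f"{ip}\t{domain_name}")
    return "\n".join(lines) + "\n"
```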

Step S23: and clearing the historical service data on the first node and the second node.

In practical application, the historical service data on the first node and the second node also needs to be cleared. Data generated by the previously deployed service may remain on the first node and the second node, so that the data generated by the previously deployed service needs to be cleared, so as not to affect a big data service to be deployed. The historical service data comprises data related to big data services and can also comprise data used in a process of configuring a cluster environment of a cluster.

Step S24: installing a JAVA runtime environment and a Python2 runtime environment on the first node and the second node.

It can be understood that a runtime environment needs to be installed on the first node and the second node, so that the big data service can be executed on the first node and the second node. Specifically, a JAVA runtime environment and a Python2 runtime environment need to be installed on the first node and the second node. Almost all big data services rely on the JAVA runtime environment, while Ambari relies on the Python2 runtime environment. Therefore, the JAVA runtime environment and the Python2 runtime environment need to be installed and runtime environment variables need to be configured on the first node and the second node.
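The configuration of runtime environment variables mentioned above might look like the following sketch, which only renders shell `export` lines and does not install anything. The function name and the installation paths passed to it are hypothetical; the application does not state where the runtime environments are installed.

```python
from typing import List


def runtime_env_lines(java_home: str, python2_bin: str) -> List[str]:
    """Render shell lines that expose the JAVA and Python2 runtimes on PATH.

    Both arguments are hypothetical installation locations chosen by the caller.
    """
    return [
        f"export JAVA_HOME={java_home}",
        f"export PATH=$JAVA_HOME/bin:{python2_bin}:$PATH",
    ]
```

The rendered lines could, for example, be appended to a profile script on each node by whatever remote-execution mechanism the deployment tool uses.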

Step S25: configuring a cluster environment of the cluster, wherein the cluster environment comprises any one or any combination of the following items: log backup and periodic cleaning, database backup and periodic cleaning, time synchronization between the nodes in the cluster, password-free mutual trust between the nodes, kernel parameters of the nodes, and a firewall whitelist of the cluster.

The specific implementation process of step S25 may refer to the content disclosed in the foregoing embodiments, and is not described herein again.
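Because the cluster environment in step S25 may comprise any one or any combination of the listed items, the selection can be modeled as a set of flags. The sketch below is one possible representation; the enum and its member names are illustrative assumptions rather than part of the application.

```python
from enum import Flag, auto
from typing import List


class ClusterEnvItem(Flag):
    LOG_BACKUP_AND_CLEANUP = auto()
    DB_BACKUP_AND_CLEANUP = auto()
    TIME_SYNC = auto()
    PASSWORDLESS_TRUST = auto()
    KERNEL_PARAMETERS = auto()
    FIREWALL_WHITELIST = auto()


def selected_items(selection: ClusterEnvItem) -> List[str]:
    """Return the names of the selected items in declaration order."""
    if not selection:
        raise ValueError("at least one cluster environment item must be selected")
    return [item.name for item in ClusterEnvItem if item in selection]
```

For example, `ClusterEnvItem.TIME_SYNC | ClusterEnvItem.FIREWALL_WHITELIST` selects two of the six items, and each selected name could then be dispatched to its own configuration routine.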

Step S26: arranging, by Ambari, the big data service to be deployed according to the service information and the node information of the cluster to generate the Ambari blueprint.

Step S27: registering the Ambari blueprint and deploying the big data service to be deployed on the first node and a second node in the cluster so as to run the big data service to be deployed in the cluster environment.

The specific implementation processes of step S26 and step S27 may refer to the contents disclosed in the foregoing embodiments, and are not described herein again.
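To make steps S26 and S27 more concrete, the following sketch organizes an arrangement result (a mapping from each node to the components placed on it) into a document shaped like an Ambari blueprint, together with a matching cluster creation template. The function names, the one-host-group-per-node simplification and the `host_group_N` naming are assumptions for illustration. The field names (`Blueprints`, `host_groups`, `components`, `cardinality`, `blueprint`, `hosts`, `fqdn`) follow the Ambari blueprint format as publicly documented and should be checked against the Ambari version actually deployed.

```python
from typing import Dict, List


def _group_names(arrangement: Dict[str, List[str]]) -> Dict[str, str]:
    """Assign a stable, hypothetical host-group name to every node."""
    return {fqdn: f"host_group_{i}" for i, fqdn in enumerate(sorted(arrangement), start=1)}


def build_blueprint(arrangement: Dict[str, List[str]], stack_name: str, stack_version: str) -> dict:
    """Organize a node -> components mapping into a blueprint document."""
    groups = _group_names(arrangement)
    return {
        "Blueprints": {"stack_name": stack_name, "stack_version": stack_version},
        "host_groups": [
            {
                "name": groups[fqdn],
                "cardinality": "1",
                "components": [{"name": component} for component in arrangement[fqdn]],
            }
            for fqdn in sorted(arrangement)
        ],
        "configurations": [],
    }


def build_cluster_template(arrangement: Dict[str, List[str]], blueprint_name: str) -> dict:
    """Map each host group of the blueprint to the concrete node it describes."""
    groups = _group_names(arrangement)
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": groups[fqdn], "hosts": [{"fqdn": fqdn}]}
            for fqdn in sorted(arrangement)
        ],
    }
```

In Ambari's documented REST API, a blueprint is typically registered by a POST to `/api/v1/blueprints/<blueprint name>` and a cluster is created from it by a POST to `/api/v1/clusters/<cluster name>` carrying the template; the sketch deliberately stops short of sending these requests.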

Step S28: if the deployment of the big data service to be deployed fails, judging whether a retry instruction is acquired.

After the Ambari blueprint is registered by calling the REST API of Ambari and the big data service to be deployed is deployed on the first node and the second node in the cluster, the deployment progress of the big data service to be deployed in the cluster needs to be queried to determine whether the big data service to be deployed has been successfully deployed.
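The progress query can be reduced to interpreting the request resource that Ambari returns. The sketch below assumes the response body carries a `Requests.request_status` field with values such as `COMPLETED`, `IN_PROGRESS`, `FAILED`, `ABORTED` and `TIMEDOUT`, as in Ambari's documented REST API; the function name and the three result labels are hypothetical, and the field names should be verified against the Ambari version in use.

```python
def deployment_state(request_body: dict) -> str:
    """Map an Ambari request resource to 'succeeded', 'failed' or 'running'."""
    status = request_body.get("Requests", {}).get("request_status", "")
    if status == "COMPLETED":
        return "succeeded"
    if status in {"FAILED", "ABORTED", "TIMEDOUT"}:
        return "failed"
    # Anything else, including a missing status, is treated as still in progress.
    return "running"
```

A caller would poll the request resource periodically and stop as soon as the state is no longer `running`.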

If the deployment of the big data service to be deployed fails, the user can choose to retry on the Web UI. It is therefore necessary to judge whether a retry instruction is acquired, so as to determine whether the big data service to be deployed needs to be redeployed.

Step S29: if a retry instruction is acquired, re-executing the process starting from the step of clearing the historical service data on the first node and the second node.

If the retry instruction is acquired, the big data service to be deployed needs to be redeployed, so execution restarts from the step of clearing the historical service data on the first node and the second node.
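The retry behavior of steps S28 and S29 can be sketched as a loop that re-runs every step from the clearing of historical service data onward. The function and parameter names below are hypothetical; each step is passed in as a plain callable so that the sketch stays independent of how the steps are actually implemented.

```python
from typing import Callable, Sequence


def deploy_with_retry(
    steps: Sequence[Callable[[], None]],
    deploy_succeeded: Callable[[], bool],
    retry_requested: Callable[[], bool],
) -> bool:
    """Run the steps in order; on failure, repeat all of them if a retry is requested.

    `steps` is expected to start with the clearing of historical service data,
    so that a retry begins from a clean state.
    """
    while True:
        for step in steps:
            step()
        if deploy_succeeded():
            return True
        if not retry_requested():
            return False
```

Restarting from the clearing step, rather than only repeating the final deployment call, ensures that data left behind by the failed attempt does not affect the next one.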

Referring to fig. 3, a flow diagram of a specific big data service deployment is shown. First, a plurality of servers are prepared and their network environments are configured. A big data service deployment tool (Web Server) is then installed on one of the servers, and a user logs in to the Web UI of the big data service deployment tool (Web Server) to configure the cluster, that is, the nodes in the cluster are initialized and configured by using the node initialization information in the deployment information. Next, the cluster environment is cleared, that is, the historical service data on the first node and the second node in the cluster is cleared. A big data service runtime environment is then installed, that is, a JAVA runtime environment and a Python2 runtime environment are installed on the first node and the second node. The cluster environment of the cluster is then configured. Ambari Server/Agent is then installed, that is, the Ambari Server of Ambari is installed on the first node, and the Agent corresponding to the Ambari Server is installed on the first node and the second node. Service arrangement is then performed to generate the Ambari blueprint, and the Ambari interface is called to create the cluster, that is, the REST API of Ambari is called to register the Ambari blueprint and deploy the big data service to be deployed on the first node and the second node in the cluster. Finally, whether the deployment succeeded is judged: if so, the flow ends; otherwise, whether to retry is judged. If a retry is requested, the flow is re-executed from the step of clearing the cluster environment; if not, the flow ends.

Referring to fig. 4, an embodiment of the present application discloses a big data service deployment apparatus, which is applied to a big data service deployment tool developed in advance based on Ambari, where the big data service deployment tool is installed on a first node in a cluster for deploying a big data service, and includes:

the information acquiring module 11 is configured to acquire deployment information, where the deployment information includes service information, and the service information includes a service list of a big data service to be deployed and service operation configuration information;

a cluster environment configuration module 12, configured to configure a cluster environment of the cluster, where the cluster environment includes any one or any combination of the following items: log backup and periodic cleaning, database backup and periodic cleaning, time synchronization between the nodes in the cluster, password-free mutual trust between the nodes, kernel parameters of the nodes, and a firewall whitelist of the cluster;

the service arranging module 13 is configured to arrange the big data service to be deployed according to the service information and the node information of the cluster by Ambari, and generate Ambari blueprint;

a service deployment module 14, configured to register the Ambari blueprint and deploy the big data service to be deployed on the first node and a second node in the cluster, so as to run the big data service to be deployed in the cluster environment.

The present application is applied to a big data service deployment tool developed in advance based on Ambari, and the big data service deployment tool is installed on a first node in a cluster for deploying big data services. Deployment information is acquired first, wherein the deployment information comprises service information, and the service information comprises a service list of the big data service to be deployed and service operation configuration information. The cluster environment of the cluster is then configured, wherein the cluster environment comprises any one or any combination of the following items: log backup and periodic cleaning, database backup and periodic cleaning, time synchronization between the nodes in the cluster, password-free mutual trust between the nodes, kernel parameters of the nodes, and a firewall whitelist of the cluster. Next, the big data service to be deployed is arranged by Ambari according to the service information and the node information of the cluster to generate the Ambari blueprint. The Ambari blueprint is then registered, and the big data service to be deployed is deployed on the first node and a second node in the cluster so as to run the big data service to be deployed in the cluster environment. In this way, automatic deployment of big data services can be realized directly by using the big data service deployment tool developed in advance based on Ambari: the tool automatically arranges the big data services to generate the Ambari blueprint, registers the Ambari blueprint, and deploys the big data services on each node. The whole deployment process is simple and efficient, is not prone to errors, and saves the labor cost of manual deployment.
In addition, the cluster environment is configured automatically by the big data service deployment tool during big data service deployment, which improves the stability and reliability of the cluster, reduces the time and workload required to configure the cluster environment separately, and provides great convenience for cluster operation and maintenance.

In a specific implementation process, the big data service deployment apparatus further includes:

the initialization module is used for performing initialization configuration on the nodes in the cluster by using node initialization information in the deployment information, wherein the node initialization information comprises an IP address, a domain name, a user name and an SSH login password of each node in the cluster;

the data clearing module is used for clearing historical service data on the first node and the second node;

the operating environment installation module is used for installing a JAVA runtime environment and a Python2 runtime environment on the first node and the second node;

and the judging module is used for judging whether a retry instruction is acquired when the deployment of the big data service to be deployed fails, and, if a retry instruction is acquired, re-executing the process starting from the step of clearing the historical service data on the first node and the second node.

In a specific implementation process, the service arranging module 13 is configured to:

installing an Ambari Server of the Ambari on the first node;

and adding the arrangement information into the arrangement result.

In a specific implementation process, the first node is a physical machine or a virtual machine, and the second node is a physical machine or a virtual machine.

Referring to fig. 5, a schematic structural diagram of an electronic device 20 provided in the embodiment of the present application is shown, where the electronic device 20 may specifically implement the steps of the big data service deployment method disclosed in the foregoing embodiment.

In general, the electronic device 20 in the present embodiment includes: a processor 21 and a memory 22.

The processor 21 may include one or more processing cores, such as a four-core processor, an eight-core processor, and so on. The processor 21 may be implemented in at least one hardware form of a DSP (digital signal processor), an FPGA (field-programmable gate array), and a PLA (programmable logic array). The processor 21 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 21 may be integrated with a GPU (graphics processing unit) which is responsible for rendering and drawing images to be displayed on the display screen. In some embodiments, the processor 21 may include an AI (artificial intelligence) processor for processing computing operations related to machine learning.

Memory 22 may include one or more computer-readable storage media, which may be non-transitory. Memory 22 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In this embodiment, the memory 22 is at least used for storing the following computer program 221, wherein after being loaded and executed by the processor 21, the computer program can implement the big data service deployment method steps disclosed in any of the foregoing embodiments.

In some embodiments, the electronic device 20 may further include a display 23, an input/output interface 24, a communication interface 25, a sensor 26, a power supply 27, and a communication bus 28.

Those skilled in the art will appreciate that the configuration shown in FIG. 5 does not limit the electronic device 20, which may include more or fewer components than those shown.

Further, an embodiment of the present application also discloses a computer-readable storage medium for storing a computer program, where the computer program is executed by a processor to implement the big data service deployment method disclosed in any of the foregoing embodiments.

For a specific process of the big data service deployment method, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not described here.

The embodiments are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Finally, it is further noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.

The big data service deployment method, device, equipment and medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the embodiments is only intended to help in understanding the method and core idea of the present application. Meanwhile, a person skilled in the art may, according to the idea of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as a limitation on the present application.

Claims

1. A big data service deployment method is characterized by being applied to a big data service deployment tool developed in advance based on Ambari, wherein the big data service deployment tool is installed on a first node in a cluster for deploying big data services, and the method comprises the following steps:

arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari to generate Ambari blueprint;

registering the Ambari blueprint and deploying the big data service to be deployed on the first node and a second node in the cluster so as to operate the big data service to be deployed in the cluster environment;

the arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari to generate Ambari blueprint, and the arranging comprises the following steps:

2. The big data service deployment method according to claim 1, wherein before configuring the clustered environment of the cluster, the method further comprises:

clearing historical service data on the first node and the second node;

3. The big data service deployment method of claim 2, wherein, after registering the Ambari blueprint and deploying the big data service on the first node and a second node in the cluster, further comprising:

4. The big data service deployment method according to claim 1, wherein the generating Ambari blueprint by arranging the big data service to be deployed by Ambari according to the service information and the node information of the cluster comprises:

installing an Ambari Server of the Ambari on the first node;

5. The big data service deployment method according to claim 1, wherein in the process of arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari to obtain an arrangement result, the method further comprises:

and adding the arrangement information into the arrangement result.

6. The big data service deployment method according to any one of claims 1 to 5, wherein the first node is a physical machine or a virtual machine, and the second node is a physical machine or a virtual machine.

7. A big data service deployment device is applied to a big data service deployment tool developed in advance based on Ambari, wherein the big data service deployment tool is installed on a first node in a cluster for deploying big data services, and the big data service deployment device comprises:

the cluster environment configuration module is used for configuring a cluster environment of the cluster, wherein the cluster environment comprises any one or any combination of the following items: log backup and periodic cleaning, database backup and periodic cleaning, time synchronization between the nodes in the cluster, password-free mutual trust between the nodes, kernel parameters of the nodes, and a firewall whitelist of the cluster;

a service deployment module, configured to register the Ambari blueprint and deploy the big data service to be deployed on the first node and a second node in the cluster, so as to run the big data service to be deployed in the cluster environment;

wherein, the service arrangement module is specifically configured to: arranging the big data service to be deployed according to the service information and the node information of the cluster by the Ambari to obtain an arrangement result, wherein the arrangement result represents a mapping relation between the component of the big data service to be deployed and each node in the cluster; organizing the arrangement result into a preset Ambari blueprint format to obtain the Ambari blueprint.

8. An electronic device, comprising:

a memory and a processor;

wherein the memory is used for storing a computer program;

the processor is used for executing the computer program to realize the big data service deployment method of any one of claims 1 to 6.

9. A computer-readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the big data service deployment method of any of claims 1 to 6.