CN116302716A - Cluster deployment method and device, electronic equipment and computer readable medium - Google Patents

Publication number
CN116302716A
Authority
CN
China
Prior art keywords
cluster
node
determining
slave
master node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310222057.5A
Other languages
Chinese (zh)
Inventor
王阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd
Priority to CN202310222057.5A
Publication of CN116302716A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023 Failover techniques
    • G06F11/203 Failover techniques using migration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The application discloses a cluster deployment method, a cluster deployment device, electronic equipment and a computer readable medium, and relates to the technical field of cloud computing. A specific implementation comprises: receiving a cluster deployment request and acquiring a corresponding scene identifier; calling and running a corresponding architecture program toolkit according to the scene identifier, and, in response to detecting an abnormality during the run, determining the corresponding abnormal cluster master node; determining each cluster slave node corresponding to the abnormal cluster master node, and acquiring performance data of each cluster slave node; determining a target cluster slave node from the cluster slave nodes based on the performance data, and switching the target cluster slave node to be the cluster master node; and continuing to run the architecture program toolkit according to the cluster master node and each cluster slave node until the run succeeds. Abnormalities that arise during cluster deployment are thus handled promptly, latent accident hazards are avoided, and the performance and safety of the deployed cluster are improved.

Description

Cluster deployment method and device, electronic equipment and computer readable medium
Technical Field
The present disclosure relates to the field of cloud computing technologies, and in particular, to a cluster deployment method, a device, an electronic device, and a computer readable medium.
Background
Remote Dictionary Server (Redis) is positioned as the primary in-memory database storage solution going forward. For a variety of reasons, when Redis clusters are deployed in provincial, municipal, and branch offices, the deployment is implemented by local database administrators (DBAs), operations and maintenance staff, and technicians. Deployment modes, methods, configurations, and cluster architectures differ widely; the local network environment may prevent access to the internet; and the technical level of operators is uneven. As a result, deployment quality and construction periods vary greatly. Omissions or errors made during cluster deployment bury serious latent accident hazards in the online database.
Disclosure of Invention
In view of this, embodiments of the present application provide a cluster deployment method, apparatus, electronic device, and computer readable medium, which can solve the problem that omissions or errors made when a cluster is deployed bury serious latent accident hazards in the online database.
To achieve the above object, according to one aspect of the embodiments of the present application, there is provided a cluster deployment method, including:
receiving a cluster deployment request, and acquiring a corresponding scene identifier;
Calling and operating a corresponding architecture program tool package according to the scene identification, and determining a corresponding abnormal cluster master node in response to monitoring of the operation abnormality;
determining each cluster slave node corresponding to the abnormal cluster master node, and acquiring the performance data of each cluster slave node;
determining target cluster slave nodes from all cluster slave nodes based on the performance data, and switching the target cluster slave nodes into cluster master nodes;
and continuing to run the architecture program toolkit according to the cluster master node and each cluster slave node until the run succeeds.
Optionally, determining the corresponding abnormal cluster master node includes:
calling a first service monitoring component to send a first instruction and a second instruction to a cluster master node corresponding to a cluster deployment request, and receiving a returned first instruction response result and a returned second instruction response result;
judging, according to the first instruction response result and the second instruction response result, whether each cluster master node is down, to obtain a first judgment result;
in response to the first judgment result indicating that the cluster master node is down, sending the first judgment result to each second service monitoring component, and receiving the second judgment result returned by each second service monitoring component;
and determining, among the second judgment results, the number that indicate the cluster master node is down, and determining the cluster master node to be an abnormal cluster master node in response to the number exceeding a preset threshold.
Optionally, switching the target cluster slave node to the cluster master node includes:
determining a service monitoring component set and an election strategy corresponding to the scene identifier;
determining a target service monitoring component from the service monitoring component set according to the election strategy;
and calling a target service monitoring component to switch the target cluster slave node to the cluster master node.
Optionally, obtaining performance data of each cluster slave node includes:
and acquiring the online state data, the response speed data and the connection duration data of each cluster slave node.
Optionally, after switching the target cluster slave node to the cluster master node, the method further comprises:
generating master-slave relationship data based on the cluster master node and each cluster slave node;
and synchronizing the master-slave relationship data to each service monitoring component in the service monitoring component set.
Optionally, running the architecture program toolkit includes:
determining a function corresponding to the architecture program toolkit, and further determining a logical core of the cluster master node corresponding to the function;
and assigning the function to the logical core.
Optionally, determining the logical core of the cluster master node corresponding to the function includes:
acquiring configuration index data corresponding to the function;
and determining the logic core of the cluster master node corresponding to the function according to the configuration index data.
In addition, the application also provides a cluster deployment device, which comprises:
the receiving unit is configured to receive the cluster deployment request and acquire a corresponding scene identifier;
the abnormal cluster master node determining unit is configured to call and operate a corresponding architecture program tool package according to the scene identification, and determine a corresponding abnormal cluster master node in response to monitoring of the operation abnormality;
the data acquisition unit is configured to determine each cluster slave node corresponding to the abnormal cluster master node and acquire the performance data of each cluster slave node;
a switching unit configured to determine a target cluster slave node from among the cluster slave nodes based on the performance data, and further switch the target cluster slave node to a cluster master node;
and a running unit configured to continue to run the architecture program toolkit according to the cluster master node and each cluster slave node until the run succeeds.
Optionally, the abnormal cluster master node determination unit is further configured to:
Calling a first service monitoring component to send a first instruction and a second instruction to a cluster master node corresponding to a cluster deployment request, and receiving a returned first instruction response result and a returned second instruction response result;
judging, according to the first instruction response result and the second instruction response result, whether each cluster master node is down, to obtain a first judgment result;
in response to the first judgment result indicating that the cluster master node is down, sending the first judgment result to each second service monitoring component, and receiving the second judgment result returned by each second service monitoring component;
and determining, among the second judgment results, the number that indicate the cluster master node is down, and determining the cluster master node to be an abnormal cluster master node in response to the number exceeding a preset threshold.
Optionally, the switching unit is further configured to:
determining a service monitoring component set and an election strategy corresponding to the scene identifier;
determining a target service monitoring component from the service monitoring component set according to the election strategy;
and calling a target service monitoring component to switch the target cluster slave node to the cluster master node.
Optionally, the data acquisition unit is further configured to:
And acquiring the online state data, the response speed data and the connection duration data of each cluster slave node.
Optionally, the apparatus further comprises a synchronization unit configured to:
generating master-slave relationship data based on the cluster master node and each cluster slave node;
and synchronizing the master-slave relationship data to each service monitoring component in the service monitoring component set.
Optionally, the running unit is further configured to:
determining a function corresponding to the architecture program toolkit, and further determining a logical core of the cluster master node corresponding to the function;
and assigning the function to the logical core.
Optionally, the running unit is further configured to:
acquiring configuration index data corresponding to the function;
and determining the logic core of the cluster master node corresponding to the function according to the configuration index data.
In addition, the application also provides cluster deployment electronic equipment, which comprises: one or more processors; and the storage device is used for storing one or more programs, and when the one or more programs are executed by one or more processors, the one or more processors are enabled to realize the cluster deployment method.
In addition, the application further provides a computer readable medium, on which a computer program is stored, which when executed by a processor, implements a cluster deployment method as described above.
To achieve the above object, according to yet another aspect of the embodiments of the present application, a computer program product is provided.
A computer program product of an embodiment of the present application includes a computer program, which when executed by a processor implements a cluster deployment method provided by the embodiment of the present application.
One embodiment of the above invention has the following advantages or benefits: a cluster deployment request is received and a corresponding scene identifier is acquired; a corresponding architecture program toolkit is called and run according to the scene identifier, and, in response to detecting an abnormality during the run, the corresponding abnormal cluster master node is determined; each cluster slave node corresponding to the abnormal cluster master node is determined, and performance data of each cluster slave node is acquired; a target cluster slave node is determined from the cluster slave nodes based on the performance data and is switched to be the cluster master node; and the architecture program toolkit continues to run according to the cluster master node and each cluster slave node until the run succeeds. Abnormalities that arise during cluster deployment are thus handled promptly, latent accident hazards are avoided, and the performance and safety of the deployed cluster are improved.
Further effects of the above-described non-conventional alternatives are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as unduly limiting the present application. Wherein:
FIG. 1 is a schematic diagram of the main flow of a cluster deployment method according to one embodiment of the present application;
FIG. 2 is a schematic diagram of the main flow of a cluster deployment method according to one embodiment of the present application;
FIG. 3 is a main flow diagram of a cluster deployment method according to one embodiment of the present application;
FIG. 4 is a schematic diagram of the main units of a cluster deployment apparatus according to an embodiment of the application;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present application may be applied;
FIG. 6 is a schematic diagram of a computer system suitable for use in implementing the terminal device or server of the embodiments of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings. Various details of the embodiments are included to facilitate understanding and should be considered merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are omitted below for clarity and conciseness. In the technical solution of the application, the acquisition, analysis, use, transmission, storage and other handling of users' personal information comply with the relevant laws and regulations, serve legal and reasonable purposes, involve no sharing, leakage or sale beyond those legal uses, and are subject to supervision by the regulatory authorities. Necessary measures should be taken to prevent illegal access to users' personal information, to ensure that personnel with access to it comply with the relevant laws and regulations, and to keep it secure. Once such personal information is no longer needed, risk should be minimized by limiting or even prohibiting further data collection and/or by deleting the data.
User privacy is protected by de-identifying data when used, including in some related applications, such as by removing a particular identifier, controlling the amount or specificity of stored data, controlling how data is stored, and/or other methods.
FIG. 1 is a schematic diagram of the main flow of a cluster deployment method according to an embodiment of the present application. As shown in FIG. 1, the cluster deployment method includes:
Step S101, a cluster deployment request is received, and a corresponding scene identifier is obtained.
In this embodiment, the execution body of the cluster deployment method (for example, a server) may receive the cluster deployment request through a wired or wireless connection. After receiving the request, the execution body can acquire the scene identifier carried in it. The scene identifier characterizes the cluster deployment scene. The deployment scenes may include: a scene with a large number of read requests and priority on persistence, a scene focusing on high performance, a scene for building a data center, a scene with a large number of writes, and the like. Each scene has a corresponding scene identifier; for example, the identifiers of these four scenes can be 1, 2, 3 and 4, respectively.
The scene with a large number of read requests and priority on persistence adopts a one-master, multiple-slave cluster deployment architecture. This is a two-level architecture: the first level provides write capability, and the second level can provide read capability or ensure data persistence. This is the most common deployment mode and is typically used when there are a large number of read requests and persistence takes priority. It can ensure high performance and high availability. Because of the high redundancy of the dual slave libraries, data loss can be kept to a minimum and fast switching can be performed when a failure occurs.
The scene focusing on high performance adopts a cascaded replication cluster deployment architecture. Cascaded replication is a three-level architecture: the first level provides write capability, the second level provides read capability and fast switching capability, and the third level serves as a standby level, which can also serve reads if the service is insensitive to data synchronization delay. This mode is used when high performance is the focus. When the production environment has high concurrency and the data is only cached temporarily, this deployment mode can effectively improve the throughput of the master library. If the master library fails, service can be switched quickly to the next level, achieving highly available service recovery.
The scene for building a data center adopts a main copy cluster deployment architecture, also known as dual-active. It is commonly used to build data centers comprising a primary data center and a standby data center. The primary data center carries user data, while the standby data center backs up the data of the primary data center and can serve as a disaster recovery environment or a multi-active environment.
The scene with a large number of writes adopts a multi-master, one-slave cluster deployment architecture. This mode is mostly applied to write-heavy scenes and is well suited to applications whose technology stack uses short connections, such as PHP. Very high concurrency can be shared among several master libraries, which spreads the heavy CPU load. The slave library here synchronizes from all of the master libraries and serves mainly as the final data consistency library; for example, data can be statistically aggregated and synchronized centrally to the slave library for querying.
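The mapping from scene identifier to deployment architecture described above can be sketched as a small lookup table. This is only an illustrative sketch: the enum member names, the architecture label strings, and the function name select_architecture are hypothetical, and the numbering assumes the four scenes are assigned identifiers 1 to 4 in the order they are described.

```python
from enum import IntEnum


class Scene(IntEnum):
    # Hypothetical names; the numbering 1..4 follows the order of the scenes in the text.
    READ_HEAVY_PERSISTENT = 1
    HIGH_PERFORMANCE = 2
    DATA_CENTER = 3
    WRITE_HEAVY = 4


# Hypothetical labels for the four architectures named in the description.
ARCHITECTURE_BY_SCENE = {
    Scene.READ_HEAVY_PERSISTENT: "one-master-multiple-slave",
    Scene.HIGH_PERFORMANCE: "cascaded-replication",
    Scene.DATA_CENTER: "main-copy",
    Scene.WRITE_HEAVY: "multi-master-one-slave",
}


def select_architecture(scene_id: int) -> str:
    """Return the architecture label for a scene identifier carried in a request."""
    try:
        return ARCHITECTURE_BY_SCENE[Scene(scene_id)]
    except ValueError:
        # Scene(scene_id) raises ValueError for identifiers outside 1..4.
        raise ValueError(f"unknown scene identifier: {scene_id}") from None
```

A deployment request carrying scene identifier 2 would, under these assumptions, resolve to the cascaded replication toolkit.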
Step S102, calling and operating a corresponding architecture program toolkit according to the scene identification, and determining a corresponding abnormal cluster master node in response to monitoring of the operation abnormality.
After obtaining the scene identifier, the execution body can call and run the corresponding architecture program toolkit according to the scene identifier. By way of example, the architecture program toolkit may be a Remote Dictionary Server (Redis) architecture program toolkit, and in particular may be a Redis architecture installation script. When the execution body detects an abnormality while the architecture program toolkit is running, it calls an abnormality detection program to determine the cluster master node where the abnormality is located. The abnormality may arise from, for example, a machine room network problem, a server environment problem, the technical level of deployment personnel, or communication costs; the embodiments of the present application do not specifically limit the abnormality.
Step S103, determining each cluster slave node corresponding to the abnormal cluster master node, and obtaining the performance data of each cluster slave node.
The cluster master node may be denoted as master, and the cluster slave node may be denoted as slave. One master may correspond to one or more slaves. Multiple masters may correspond to one slave.
Specifically, obtaining performance data of each cluster slave node includes: and acquiring the online state data, the response speed data and the connection time length data of each cluster slave node, and further serving as the performance data of each cluster slave node.
Step S104, determining target cluster slave nodes from all cluster slave nodes based on the performance data, and switching the target cluster slave nodes into cluster master nodes.
Illustratively, determining a target cluster slave node from the cluster slave nodes based on the performance data includes the following. Slaves whose online state data shows them to be online are retained; slaves with a slow response speed are eliminated; and slaves that have been disconnected from the master for a long time are eliminated. From the remaining slaves, a target cluster slave node, namely the target slave, is selected according to highest priority, smallest offset, and server run ID, and is used as the new master. The server run ID is the identity code of a server for each run; a server that runs multiple times generates multiple run IDs.
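The filter-then-rank selection described above can be sketched as follows. This is a minimal sketch under stated assumptions: the SlaveStats fields, the two threshold parameters and their default values are hypothetical, and breaking ties by the lexicographically smaller run ID is an assumption, since the text names the run ID as a criterion without giving its ordering. The ranking follows the text's wording (highest priority, smallest offset), which may differ from how a particular Redis Sentinel version interprets those fields.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SlaveStats:
    # Hypothetical container for the performance data of one cluster slave node.
    name: str
    online: bool                 # online state data
    response_ms: float           # response speed data
    master_link_down_s: float    # how long the link to the old master has been down
    priority: int
    offset: int
    run_id: str


def pick_target_slave(
    slaves: Iterable[SlaveStats],
    max_response_ms: float = 1000.0,   # assumed threshold for "slow response"
    max_link_down_s: float = 10.0,     # assumed threshold for "disconnected too long"
) -> Optional[SlaveStats]:
    """Filter out unsuitable slaves, then rank the rest as the text describes."""
    candidates = [
        s for s in slaves
        if s.online
        and s.response_ms <= max_response_ms
        and s.master_link_down_s <= max_link_down_s
    ]
    if not candidates:
        return None
    # Highest priority first, then smallest offset, then run ID (assumed: smaller wins).
    return min(candidates, key=lambda s: (-s.priority, s.offset, s.run_id))
```

Returning None when no slave survives the filters leaves the decision about what to do next (retry, alert, abort) to the caller.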
Specifically, the switching of the target cluster slave node to the cluster master node includes: determining a service monitoring component set and an election strategy corresponding to the scene identifier; determining a target service monitoring component from the service monitoring component set according to the election strategy; and calling a target service monitoring component to switch the target cluster slave node to the cluster master node.
By way of example, the service monitoring component may be a sentinel. A sentinel is a Redis high-availability service monitoring component and is a special Redis service: it does not provide data reads or writes and acts only as a monitor. In addition to monitoring each Redis node, the sentinels also monitor each other. Once the sentinels discover that a master node has failed, an automatic failover is performed: one slave of the failed master node is promoted to master, and the remaining slave nodes are synchronized to the new master node, which completes the fast switching process. The service monitoring component set is, for example, a set of sentinels, and the service monitoring component set corresponding to the scene identifier is the sentinel set in the scene corresponding to that identifier. The sentinels vote among themselves according to the election strategy, and the sentinel with the largest number of votes is determined to be the target sentinel, namely the target service monitoring component. The target service monitoring component then performs the master-slave switch, switching the target cluster slave node to be the cluster master node.
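The "most votes wins" election of the target sentinel can be sketched as a simple tally. This is an illustrative sketch only: the function name, the shape of the votes mapping, and the choice to return None on a tie (so that the caller can hold another round) are assumptions; the text specifies only that the sentinel with the most votes becomes the target.

```python
from collections import Counter
from typing import Mapping, Optional


def elect_target_sentinel(votes: Mapping[str, str]) -> Optional[str]:
    """Pick the sentinel with the most votes.

    `votes` maps each voting sentinel's name to the name of the sentinel it
    voted for. Returns None when there are no votes or the top two candidates
    are tied (an assumed policy, not stated in the text).
    """
    if not votes:
        return None
    ranked = Counter(votes.values()).most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]
```

The elected sentinel is the one that would then carry out the switch of the target slave to master.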
Specifically, after switching the target cluster slave node to the cluster master node, the method further includes: generating master-slave relationship data based on the cluster master node and each cluster slave node; and synchronizing the master-slave relationship data to each service monitoring component in the service monitoring component set.
After switching the target cluster slave node to the cluster master node, new master-slave relationship data is generated. The executing body may call the target service listening component to synchronize the new master-slave relationship data to all service listening components in the set of service listening components. And finally, switching work is completed.
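Generating the new master-slave relationship data and synchronizing it to every service monitoring component can be sketched as below. The Sentinel class here is a hypothetical in-memory stand-in for a monitoring component, and the dictionary layout of the relationship data is an assumption made for illustration.

```python
import copy
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Sentinel:
    # Hypothetical stand-in for one service monitoring component.
    name: str
    topology: dict = field(default_factory=dict)


def build_relationship(new_master: str, nodes: Iterable[str]) -> dict:
    """Build master-slave relationship data; every node other than the new master is a slave."""
    return {
        "master": new_master,
        "slaves": sorted(n for n in nodes if n != new_master),
    }


def sync_relationship(sentinels: List[Sentinel], relationship: dict) -> None:
    """Give every sentinel in the set its own copy of the relationship data."""
    for sentinel in sentinels:
        # Deep copy so that one sentinel mutating its view cannot affect the others.
        sentinel.topology = copy.deepcopy(relationship)
```

In a real deployment the synchronization would travel over the network; the sketch only shows the data flow from the target component to the whole set.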
Step S105, continuing to run the architecture program toolkit according to the cluster master node and each cluster slave node until the run succeeds.
After the node switch is completed, the execution body can continue to run the architecture program toolkit based on the new master-slave relationship. If an abnormality is encountered again, the master-slave switch is repeated, so that the architecture program toolkit always runs on normal master and slave nodes until the run succeeds.
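The run, fail over, and continue loop of step S105 can be sketched as follows. The callables run_toolkit and failover, the DeploymentAnomaly exception, and the max_failovers budget are all hypothetical; the text itself says only that the toolkit is run until it succeeds, so the budget is an added safeguard against looping forever.

```python
from typing import Callable


class DeploymentAnomaly(Exception):
    """Hypothetical signal that the toolkit hit an abnormality on the current master."""


def deploy_until_success(
    run_toolkit: Callable[[str], None],
    failover: Callable[[str], str],
    master: str,
    max_failovers: int = 5,   # assumed safety budget, not part of the described method
) -> str:
    """Run the toolkit; on an abnormality, switch master and continue.

    Returns the name of the master on which the run finally succeeded.
    """
    failovers = 0
    while True:
        try:
            run_toolkit(master)
            return master
        except DeploymentAnomaly:
            if failovers >= max_failovers:
                raise RuntimeError("deployment did not succeed within the failover budget") from None
            failovers += 1
            # failover() is expected to promote a target slave and return the new master.
            master = failover(master)
```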
Specifically, running the architecture program toolkit includes: determining a function corresponding to the architecture program toolkit, and further determining a logical core of the cluster master node corresponding to the function; and assigning the function to the logical core.
For example, suppose Redis needs to be deployed on a server with 4 cores and 8 GB of memory. The central processing unit (CPU) and memory specifications are passed into the corresponding architecture program toolkit, such as a one-key deployment script, so that the script can determine the best allocation for the parameters embedded in it relative to the available CPU and memory. Take the server_cpulist and bio_cpulist parameters of Redis: the one-key deployment script sets them to server_cpulist 0,1,2 and bio_cpulist 3. Different functions of Redis are thereby assigned to different logical cores, and the assignment is written into the Redis configuration file.
Specifically, determining the logical core of the cluster master node corresponding to the function includes: acquiring configuration index data corresponding to the function, and determining the logical core of the cluster master node corresponding to the function according to the configuration index data. For example, the configuration index data may indicate that the function governed by the server_cpulist parameter corresponds to logical cores 0, 1 and 2, and that the function governed by the bio_cpulist parameter corresponds to logical core 3.
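The core assignment in the 4-core example can be sketched as a function that emits the two configuration lines. This sketch assumes the parameters named in the text correspond to the Redis configuration directives server_cpulist and bio_cpulist, and it generalizes the single 4-core example into a rule (last logical core for background I/O, all others for the server) that the text does not state; the function name is hypothetical.

```python
from typing import List


def cpu_affinity_lines(logical_cores: int) -> List[str]:
    """Produce configuration lines that pin Redis functions to logical cores.

    Assumed rule, generalized from the 4-core example: the last logical core
    is reserved for background I/O and the remaining cores go to the server.
    """
    if logical_cores < 2:
        raise ValueError("need at least two logical cores to split the functions")
    server_cores = ",".join(str(core) for core in range(logical_cores - 1))
    bio_core = str(logical_cores - 1)
    return [
        f"server_cpulist {server_cores}",
        f"bio_cpulist {bio_core}",
    ]
```

For a 4-core server this yields the two settings given in the example, which a deployment script could append to the configuration file it generates.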
In this embodiment, a cluster deployment request is received and a corresponding scene identifier is acquired; a corresponding architecture program toolkit is called and run according to the scene identifier, and, in response to detecting an abnormality during the run, the corresponding abnormal cluster master node is determined; each cluster slave node corresponding to the abnormal cluster master node is determined, and performance data of each cluster slave node is acquired; a target cluster slave node is determined from the cluster slave nodes based on the performance data and is switched to be the cluster master node; and the architecture program toolkit continues to run according to the cluster master node and each cluster slave node until the run succeeds. Abnormalities that arise during cluster deployment are thus handled promptly, latent accident hazards are avoided, and the performance and safety of the deployed cluster are improved.
FIG. 2 is a schematic diagram of the main flow of a cluster deployment method according to an embodiment of the present application. As shown in FIG. 2, the cluster deployment method includes:
Step S201, a cluster deployment request is received, and a corresponding scene identifier is obtained.
Step S202, calling and operating a corresponding architecture program toolkit according to a scene identifier, calling a first service monitoring component to send a first instruction and a second instruction to a cluster master node corresponding to a cluster deployment request in response to monitoring of an abnormal operation, and receiving returned first instruction response results and second instruction response results.
When an abnormality is detected while the architecture program toolkit is running, the execution body can call the first service monitoring component, for example all sentinels in the scene corresponding to the scene identifier, to send a first instruction, for example a ping instruction, and a second instruction, for example an info instruction, to each cluster master node corresponding to the cluster deployment request. It then receives the returned first instruction response result, which indicates whether each cluster master and slave node is online, and the returned second instruction response result, which indicates the state of each cluster master and slave node.
Step S203, judging whether each cluster master node is down according to the first instruction response result and the second instruction response result to obtain a first judgment result.
Each sentinel synchronizes the first instruction response result (whether each cluster master or slave node is online) and the second instruction response result (the state of each cluster master or slave node) that it has acquired to the other sentinels, so that the state information of every node is kept in sync. For example, on the basis of this data synchronization, the execution body may call sentinel 1 to send hello messages to each cluster master node to check whether any node is down; if one is, sentinel 1 notifies the other sentinels. If sentinel 1 keeps sending hello to one of the cluster master nodes corresponding to the scene identifier and never receives a response, it sets a flag, namely SRI_S_DOWN, determines that the master is down, and outputs this as the first judgment result.
Step S204, in response to the first judgment result indicating that the cluster master node is down, sending the first judgment result to each second service monitoring component, and receiving the second judgment result returned by each second service monitoring component.
When the first judgment result output by sentinel 1 is that the cluster master is down, sentinel 1 sends SENTINEL is-master-down-by-addr to the second service monitoring components, namely the other sentinels, to inform them that the master is down. After receiving the first judgment result, each of the other sentinels also sends hello to that master to probe it and returns its detection result, that is, its second judgment result.
Step S205, determining, from the second judgment results, the number of results indicating that the cluster master node is down, and determining the cluster master node to be an abnormal cluster master node in response to the number exceeding a preset threshold.
The number of sentinels that consider the cluster master to be down is counted from the second judgment results. When this number exceeds a preset threshold (for example, half of the total number of sentinels), the cluster master is determined to be in a down state, that is, it is determined to be an abnormal cluster master node.
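The down-detection flow of steps S202 to S205 can be sketched as a simple vote. This is a minimal illustration and not the patented implementation: the `Sentinel` class, the `probe` callable and the choice to count the first sentinel's own vote alongside the second judgment results are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical probe: returns True when the master answers (for example, to a ping).
Probe = Callable[[str], bool]


@dataclass
class Sentinel:
    name: str
    probe: Probe

    def thinks_down(self, master: str) -> bool:
        """Subjective judgment: this sentinel alone could not reach the master."""
        return not self.probe(master)


def is_abnormal_master(master: str, first: Sentinel,
                       others: List[Sentinel], threshold: float) -> bool:
    """Return True when the number of 'down' votes exceeds the preset threshold."""
    # Steps S202-S203: the first sentinel forms the first judgment result.
    if not first.thinks_down(master):
        return False
    # Step S204: the other sentinels probe the same master and report back.
    second_results = [s.thinks_down(master) for s in others]
    # Step S205: count the votes. Including the first sentinel's vote is an assumption.
    down_votes = 1 + sum(second_results)
    return down_votes > threshold
```

With three sentinels and a threshold of half the total (1.5), the master is treated as abnormal once two of them fail to reach it.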
Step S206, determining each cluster slave node corresponding to the abnormal cluster master node, and obtaining the performance data of each cluster slave node.
The performance data may include online state data, response speed data, and connection duration data.
Step S207, determining target cluster slave nodes from all cluster slave nodes based on the performance data, and switching the target cluster slave nodes to the cluster master nodes.
For example, based on the performance data, slaves that are online are kept, slaves that respond slowly are eliminated, and slaves that have been disconnected from the original master for a long time are eliminated. Finally, the slave with the highest priority, the smallest offset and the smallest run ID (runid) is selected as the target cluster slave node, namely the target slave, and becomes the new master. The run ID is the identity code of a server for each run, so a server that runs multiple times generates multiple run IDs. The sentinel then sends a SLAVEOF NO ONE command to the new master and sends the new master address to the other cluster slave nodes.
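The filtering and ranking described for step S207 can be sketched as follows. The field names, the two cut-off values, and the reading of "offset" as a replication gap to the old master are assumptions for illustration only; the sketch also models "highest priority" as a larger number, whereas Redis itself treats a lower replica-priority number as preferred.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SlaveInfo:
    run_id: str
    online: bool
    response_ms: float       # response speed data
    disconnected_ms: float   # time disconnected from the original master
    priority: int            # larger means preferred in this sketch (assumption)
    offset_lag: int          # replication gap to the old master (assumption)


def pick_target_slave(slaves: List[SlaveInfo],
                      max_response_ms: float = 1000.0,
                      max_disconnected_ms: float = 50000.0) -> Optional[SlaveInfo]:
    """Filter out unsuitable slaves, then rank the rest."""
    candidates = [
        s for s in slaves
        if s.online
        and s.response_ms <= max_response_ms
        and s.disconnected_ms <= max_disconnected_ms
    ]
    if not candidates:
        return None
    # Highest priority first, then smallest offset, then smallest run ID.
    return min(candidates, key=lambda s: (-s.priority, s.offset_lag, s.run_id))
```

Returning `None` when no slave survives the filters leaves the decision about what to do next (retry, alert, abort) to the caller.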
Step S208, continuing to run the architecture program toolkit according to the cluster master node and each cluster slave node until it runs successfully.
The architecture program toolkit is run on the cluster master node and the cluster slave nodes that are in a normal running state until it runs successfully. Anomalies are thus handled promptly during cluster deployment, latent faults are not left behind, and the performance and security of the deployed cluster are improved.
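Taken together, steps S201 to S208 amount to a run, detect, fail over and continue loop. The sketch below shows only that control flow; the three callables and the `max_attempts` limit are hypothetical stand-ins and are not part of the described method.

```python
from typing import Callable, Optional


def deploy_with_failover(run_toolkit: Callable[[str], bool],
                         find_abnormal_master: Callable[[], Optional[str]],
                         promote_slave_for: Callable[[str], str],
                         master: str,
                         max_attempts: int = 3) -> str:
    """Run the toolkit, switching masters on failure, and return the final master."""
    for _ in range(max_attempts):
        # Run (or continue running) the architecture program toolkit.
        if run_toolkit(master):
            return master
        # The run was abnormal: look for an abnormal cluster master node.
        bad_master = find_abnormal_master()
        if bad_master is not None:
            # Promote the selected slave and keep going with the new master.
            master = promote_slave_for(bad_master)
    raise RuntimeError("deployment did not succeed within max_attempts")
```

The attempt limit is a design choice for the sketch: the text says the toolkit runs "until it runs successfully", and a bounded loop simply makes the failure case explicit.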
Fig. 3 is an application scenario diagram of a cluster deployment method according to one embodiment of the present application. The cluster deployment method is applied to a high-availability cluster deployment scenario. As shown in fig. 3, for one-key deployment, a central control server can log in to all other servers through ssh. The central control server stores a redis configuration file, a redis installation package, a sentinel configuration file, a compiling dependency package, a sentinel installation package, a switching script and the like.
The one-key deployment script in fig. 3 performs the following: determining the redis servers (for example, redis cluster 1 comprising a redis master and two redis slaves) and the sentinel servers (for example, sentinel cluster 1 comprising sentinel 1, sentinel 2 and sentinel 3) to be deployed according to the input parameters; determining the master-slave relationship according to the number of redis servers; setting redis configuration parameters such as sizes and the persistence policy according to the input parameters, and optimizing operating system parameters; and adjusting the sentinel-related parameters, such as the monitoring interval, the disconnection retry time and count, and the shortest switching time, according to the definition of the sentinel servers.
By way of example, the input parameters may include:
-c 1: specifies the maximum number of CPU cores that the redis server can use.
-m 4: specifies the maximum amount of memory that redis is allowed to use, in GB.
-p /data/: specifies the redis installation path.
-M ip:port: specifies the master ip:port.
-S ip1:port;ip2:port: specifies the slave ip:port list.
-P passwd: specifies the login password and the synchronization password of the redis instance.
-t sentinel: one of sentinel, ms or alone, where sentinel is the high-availability architecture, ms is the master-slave architecture, and alone is a single machine.
-V ip: specifies the VIP, which the sentinel will use.
-a yes: specifies whether to enable AOF persistence; yes or no.
-R yes: specifies whether to enable RDB persistence; yes or no.
-T 5000: specifies the sentinel disconnection time, in milliseconds.
-N mymaster: specifies the cluster name, which the sentinel configuration may use.
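A command-line interface with the parameters listed above could be parsed roughly as follows. This is only a sketch: the script itself is not disclosed, the default values are taken from the example values in the list, and the flag letters for the slave list, VIP and cluster name are read here as -S, -V and -N, which is an assumption.

```python
import argparse
from typing import List, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the hypothetical one-key deployment script."""
    p = argparse.ArgumentParser(description="One-key redis deployment (sketch)")
    p.add_argument("-c", dest="cpu_cores", type=int, default=1)
    p.add_argument("-m", dest="memory_gb", type=int, default=4)
    p.add_argument("-p", dest="install_path", default="/data/")
    p.add_argument("-M", dest="master", required=True, help="master ip:port")
    p.add_argument("-S", dest="slaves", default="", help="ip1:port;ip2:port")
    p.add_argument("-P", dest="password", default="")
    p.add_argument("-t", dest="arch", choices=["sentinel", "ms", "alone"],
                   default="sentinel")
    p.add_argument("-V", dest="vip", default=None)
    p.add_argument("-a", dest="aof", choices=["yes", "no"], default="yes")
    p.add_argument("-R", dest="rdb", choices=["yes", "no"], default="yes")
    p.add_argument("-T", dest="down_after_ms", type=int, default=5000)
    p.add_argument("-N", dest="cluster_name", default="mymaster")
    return p


def parse_endpoints(spec: str) -> List[Tuple[str, int]]:
    """Split 'ip1:port;ip2:port' into a list of (ip, port) pairs."""
    endpoints = []
    for item in filter(None, (part.strip() for part in spec.split(";"))):
        host, _, port = item.rpartition(":")
        endpoints.append((host, int(port)))
    return endpoints
```

For example, `build_parser().parse_args(["-M", "10.0.0.1:6379", "-S", "10.0.0.2:6379;10.0.0.3:6379"])` yields a namespace whose `slaves` string can be passed to `parse_endpoints`.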
Most of the redis configuration parameters and sentinel configuration parameters above have already been set to their optimal values and placed in the installation package. The optimal values of a small number of parameters depend on the resource configuration of the server, so among the input parameters of the one-key installation script, the CPU, memory and persistence settings are particularly important, and the optimization scheme nested in the script tunes the configuration according to them. This covers reasonable memory space allocation (related to memory usage, that is, the memory amount), the maximum amount of memory redis may use (related to cpu), read-write performance and data importance (related to persistence), and the optimal sentinel election number (calculated from -M and -S). The optimal parameter combination is tuned across the redis.conf configuration file, the sentinel's sentinel.conf configuration file and the linux system parameters so that the server can deliver its maximum performance. Multiple redis architectures are deployed rapidly: single-machine, master-slave and high-availability deployment of redis are combined, and the deployment work is completed by the one-key deployment script. The one-key deployment script not only completes the deployment of redis and the sentinels quickly, but also unifies and standardizes the deployment. The embodiment of the application can thus provide a redis cluster that supports multi-scenario architectures and has uniform configuration, uniform architecture, rapid offline deployment, high performance and high availability.
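As an illustration of how input parameters might be turned into configuration lines, the sketch below renders a few redis.conf and sentinel.conf directives. The directive names are standard Redis ones, but the specific values, the example `save 900 1` rule, and the majority rule used for the quorum are assumptions; the text does not disclose the tuning formulas the script actually uses.

```python
from typing import List


def render_redis_conf(memory_gb: int, aof: bool, rdb: bool,
                      password: str = "") -> List[str]:
    """Render a few redis.conf lines from the -m, -a, -R and -P inputs."""
    lines = [
        f"maxmemory {memory_gb}gb",
        f"appendonly {'yes' if aof else 'no'}",
        # An empty save directive disables RDB snapshots; the rule is an example.
        "save 900 1" if rdb else 'save ""',
    ]
    if password:
        lines += [f"requirepass {password}", f"masterauth {password}"]
    return lines


def render_sentinel_conf(name: str, master_ip: str, master_port: int,
                         node_count: int, down_after_ms: int,
                         password: str = "") -> List[str]:
    """Render sentinel.conf lines; quorum is a simple majority (assumption)."""
    quorum = node_count // 2 + 1
    lines = [
        f"sentinel monitor {name} {master_ip} {master_port} {quorum}",
        f"sentinel down-after-milliseconds {name} {down_after_ms}",
    ]
    if password:
        lines.append(f"sentinel auth-pass {name} {password}")
    return lines
```

Here `node_count` stands for the number of nodes derived from -M and -S (one master plus the listed slaves), which is one possible reading of "calculated from -M and -S".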
Fig. 4 is a schematic diagram of main units of a cluster deployment apparatus according to an embodiment of the application. As shown in fig. 4, the cluster deployment apparatus 400 includes a receiving unit 401, an abnormal cluster master node determining unit 402, a data acquiring unit 403, a switching unit 404, and an execution unit 405.
The receiving unit 401 is configured to receive the cluster deployment request, and obtain the corresponding scene identifier.
An abnormal cluster master node determining unit 402, configured to invoke and run a corresponding architecture program toolkit according to the scene identifier, and determine a corresponding abnormal cluster master node in response to monitoring of the running abnormality.
The data obtaining unit 403 is configured to determine each cluster slave node corresponding to the abnormal cluster master node, and obtain performance data of each cluster slave node.
A switching unit 404 configured to determine a target cluster slave node from the cluster slave nodes based on the performance data, and further switch the target cluster slave node to the cluster master node.
An execution unit 405 configured to continue to run the architecture program toolkit according to the cluster master node and each cluster slave node until it runs successfully.
In some embodiments, the abnormal cluster master node determination unit 402 is further configured to: calling a first service monitoring component to send a first instruction and a second instruction to a cluster master node corresponding to a cluster deployment request, and receiving a returned first instruction response result and a returned second instruction response result; judging whether each cluster main node is down according to the first instruction response result and the second instruction response result to obtain a first judgment result; responding to the first judging result that the cluster master node is down, sending the first judging result to each second service monitoring component, and receiving each second judging result returned by each second service monitoring component; and determining the quantity corresponding to the downtime of the cluster master nodes in each second judging result, and determining the cluster master nodes as abnormal cluster master nodes in response to the quantity exceeding a preset threshold.
In some embodiments, the switching unit 404 is further configured to: determining a service monitoring component set and an election strategy corresponding to the scene identifier; determining a target service monitoring component from the service monitoring component set according to the election strategy; and calling a target service monitoring component to switch the target cluster slave node to the cluster master node.
In some embodiments, the data acquisition unit 403 is further configured to: acquire the online state data, the response speed data and the connection duration data of each cluster slave node.
In some embodiments, the apparatus further comprises a synchronization unit configured to: generating master-slave relationship data based on the cluster master node and each cluster slave node; and synchronizing the master-slave relationship data to each service monitoring component in the service monitoring component set.
In some embodiments, the execution unit 405 is further configured to: determine a function corresponding to the architecture program toolkit, and further determine a logic core of the cluster master node corresponding to the function; and assign the function to the logic core.
In some embodiments, the execution unit 405 is further configured to: acquiring configuration index data corresponding to the function; and determining the logic core of the cluster master node corresponding to the function according to the configuration index data.
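The text does not define the configuration index data or the rule that maps a function to a logic core. Purely as an illustration, the sketch below treats the index as a per-function load weight and greedily assigns each function to the least-loaded core; both the weight and the greedy rule are assumptions.

```python
from typing import Dict


def assign_functions_to_cores(function_weights: Dict[str, float],
                              core_count: int) -> Dict[str, int]:
    """Map each function name to a logic core index, balancing total weight."""
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    load = [0.0] * core_count
    assignment: Dict[str, int] = {}
    # Place the heaviest functions first; ties are broken by name for determinism.
    ordered = sorted(function_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, weight in ordered:
        # Pick the core with the lowest accumulated load (lowest index on ties).
        core = min(range(core_count), key=lambda i: (load[i], i))
        assignment[name] = core
        load[core] += weight
    return assignment
```

On Linux, a chosen core index could then be applied to a process with `os.sched_setaffinity`; that step is outside this sketch.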
It should be noted that, the cluster deployment method and the cluster deployment device in the application have a corresponding relationship in the specific implementation content, so the repeated content is not described again.
Fig. 5 illustrates an exemplary system architecture 500 in which a cluster deployment method or cluster deployment apparatus of embodiments of the present application may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 is used as a medium to provide communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the server 505 via the network 504 using the terminal devices 501, 502, 503 to receive or send messages or the like. Various communication client applications may be installed on the terminal devices 501, 502, 503, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 501, 502, 503 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 505 may be a server providing various services, such as a background management server (by way of example only) providing support for cluster deployment requests submitted by users using the terminal devices 501, 502, 503. The background management server can receive the cluster deployment request and obtain the corresponding scene identifier; call and run the corresponding architecture program toolkit according to the scene identifier, and determine the corresponding abnormal cluster master node in response to detecting an operation abnormality; determine each cluster slave node corresponding to the abnormal cluster master node, and acquire the performance data of each cluster slave node; determine a target cluster slave node from the cluster slave nodes based on the performance data, and switch the target cluster slave node to be the cluster master node; and continue to run the architecture program toolkit according to the cluster master node and each cluster slave node until it runs successfully. Anomalies are thus handled promptly during cluster deployment, latent faults are not left behind, and the performance and security of the deployed cluster are improved.
It should be noted that, the cluster deployment method provided in the embodiments of the present application is generally executed by the server 505, and accordingly, the cluster deployment device is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 6, a schematic diagram of a computer system 600 suitable for use in implementing the terminal device of an embodiment of the present application is shown. The terminal device shown in fig. 6 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM603, various programs and data required for the operation of the computer system 600 are also stored. The CPU601, ROM602, and RAM603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, mouse, etc.; an output portion 607 including a Cathode Ray Tube (CRT), a liquid crystal display (LCD), and the like, and a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The drive 610 is also connected to the I/O interface 605 as needed. Removable media 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on drive 610 so that a computer program read therefrom is installed as needed into storage section 608.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments disclosed herein include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. The above-described functions defined in the system of the present application are performed when the computer program is executed by a Central Processing Unit (CPU) 601.
It should be noted that the computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by software, or may be implemented by hardware. The described units may also be provided in a processor, for example, described as: a processor includes a receiving unit, an abnormal cluster master node determining unit, a data acquiring unit, a switching unit, and an execution unit. In some cases, the names of the units do not constitute a limitation of the units themselves.
As another aspect, the present application also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist alone without being fitted into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to: receive a cluster deployment request and obtain a corresponding scene identifier; call and run the corresponding architecture program toolkit according to the scene identifier, and determine the corresponding abnormal cluster master node in response to detecting an operation abnormality; determine each cluster slave node corresponding to the abnormal cluster master node, and acquire the performance data of each cluster slave node; determine a target cluster slave node from the cluster slave nodes based on the performance data, and switch the target cluster slave node to be the cluster master node; and continue to run the architecture program toolkit according to the cluster master node and each cluster slave node until it runs successfully.
The computer program product of the present application comprises a computer program which, when executed by a processor, implements the cluster deployment method in the embodiments of the present application.
According to the technical solution of the embodiments of the application, anomalies can be handled promptly during cluster deployment, latent faults are not left behind, and the performance and security of the deployed cluster are improved.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (16)

1. A cluster deployment method, comprising:
receiving a cluster deployment request, and acquiring a corresponding scene identifier;
calling and operating a corresponding architecture program tool package according to the scene identification, and determining a corresponding abnormal cluster master node in response to monitoring of abnormal operation;
determining each cluster slave node corresponding to the abnormal cluster master node, and acquiring the performance data of each cluster slave node;
determining target cluster slave nodes from the cluster slave nodes based on the performance data, and switching the target cluster slave nodes into cluster master nodes;
and continuing to run the architecture program tool package according to the cluster master node and each cluster slave node until it runs successfully.
2. The method of claim 1, wherein the determining the corresponding abnormal cluster master node comprises:
calling a first service monitoring component to send a first instruction and a second instruction to a cluster master node corresponding to the cluster deployment request, and receiving a returned first instruction response result and a returned second instruction response result;
judging whether each cluster master node is down according to the first instruction response result and the second instruction response result to obtain a first judgment result;
responding to the first judging result that the cluster master node is down, sending the first judging result to each second service monitoring component, and receiving each second judging result returned by each second service monitoring component;
and determining the quantity corresponding to the downtime of the cluster master nodes in each second judging result, and determining the cluster master nodes to be abnormal cluster master nodes in response to the quantity exceeding a preset threshold.
3. The method of claim 1, wherein the switching the target cluster slave node to a cluster master node comprises:
determining a service monitoring component set and an election strategy corresponding to the scene identifier;
determining a target service monitoring component from the service monitoring component set according to the election strategy;
and calling the target service monitoring component to switch the target cluster slave node to a cluster master node.
4. The method of claim 1, wherein the obtaining the performance data of the respective cluster slave node comprises:
and acquiring the online state data, the response speed data and the connection duration data of the slave nodes of each cluster.
5. A method according to claim 3, wherein after said switching of the target cluster slave node to a cluster master node, the method further comprises:
generating master-slave relationship data based on the cluster master node and each cluster slave node;
and synchronizing the master-slave relationship data to each service monitoring component in the service monitoring component set.
6. The method of claim 1, wherein running the architecture program tool package comprises:
determining a function corresponding to the architecture program tool package, and further determining a logic core of the cluster master node corresponding to the function;
assigning the function to the logic core.
7. The method of claim 6, wherein the determining the logical core of the cluster master to which the function corresponds comprises:
acquiring configuration index data corresponding to the function;
and determining the logic core of the cluster master node corresponding to the function according to the configuration index data.
8. A cluster deployment apparatus, comprising:
the receiving unit is configured to receive the cluster deployment request and acquire a corresponding scene identifier;
the abnormal cluster master node determining unit is configured to call and operate a corresponding architecture program tool package according to the scene identification, and determine a corresponding abnormal cluster master node in response to monitoring of abnormal operation;
the data acquisition unit is configured to determine each cluster slave node corresponding to the abnormal cluster master node and acquire the performance data of each cluster slave node;
a switching unit configured to determine a target cluster slave node from the respective cluster slave nodes based on the performance data, and further switch the target cluster slave node to a cluster master node;
and the execution unit is configured to continue to run the architecture program tool package according to the cluster master node and each cluster slave node until it runs successfully.
9. The apparatus of claim 8, wherein the abnormal cluster master determination unit is further configured to:
calling a first service monitoring component to send a first instruction and a second instruction to a cluster master node corresponding to the cluster deployment request, and receiving a returned first instruction response result and a returned second instruction response result;
judging whether each cluster master node is down according to the first instruction response result and the second instruction response result to obtain a first judgment result;
responding to the first judging result that the cluster master node is down, sending the first judging result to each second service monitoring component, and receiving each second judging result returned by each second service monitoring component;
and determining the quantity corresponding to the downtime of the cluster master nodes in each second judging result, and determining the cluster master nodes to be abnormal cluster master nodes in response to the quantity exceeding a preset threshold.
10. The apparatus of claim 8, wherein the switching unit is further configured to:
determining a service monitoring component set and an election strategy corresponding to the scene identifier;
determining a target service monitoring component from the service monitoring component set according to the election strategy;
and calling the target service monitoring component to switch the target cluster slave node to a cluster master node.
11. The apparatus of claim 8, wherein the data acquisition unit is further configured to:
and acquiring the online state data, the response speed data and the connection duration data of the slave nodes of each cluster.
12. The apparatus of claim 10, further comprising a synchronization unit configured to:
generating master-slave relationship data based on the cluster master node and each cluster slave node;
and synchronizing the master-slave relationship data to each service monitoring component in the service monitoring component set.
13. The apparatus of claim 8, wherein the execution unit is further configured to:
determining a function corresponding to the architecture program tool package, and further determining a logic core of the cluster master node corresponding to the function;
assigning the function to the logic core.
14. A cluster deployment electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.
15. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-7.
16. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-7.
CN202310222057.5A 2023-03-09 2023-03-09 Cluster deployment method and device, electronic equipment and computer readable medium Pending CN116302716A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310222057.5A CN116302716A (en) 2023-03-09 2023-03-09 Cluster deployment method and device, electronic equipment and computer readable medium

Publications (1)

Publication Number Publication Date
CN116302716A true CN116302716A (en) 2023-06-23

Family

ID=86816254

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310222057.5A Pending CN116302716A (en) 2023-03-09 2023-03-09 Cluster deployment method and device, electronic equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN116302716A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117149095A (en) * 2023-10-31 2023-12-01 苏州元脑智能科技有限公司 NAS-based cluster management method, NAS-based cluster management device, computer equipment and media
CN117149095B (en) * 2023-10-31 2024-02-06 苏州元脑智能科技有限公司 NAS-based cluster management method, NAS-based cluster management device, computer equipment and media

Similar Documents

Publication Publication Date Title
CN110263054B (en) SQL work order auditing system, method and device and computer equipment
US10078563B2 (en) Preventing split-brain scenario in a high-availability cluster
CN107295080B (en) Data storage method applied to distributed server cluster and server
CN108696581B (en) Distributed information caching method and device, computer equipment and storage medium
CN106487486B (en) Service processing method and data center system
JP6325001B2 (en) Method and system using recursive event listeners in nodes of hierarchical data structures
CN106878363B (en) Information processing method, device and system
WO2021103499A1 (en) Multi-active data center-based traffic switching method and device
CN107729176B (en) Disaster recovery method and disaster recovery system for configuration file management system
CN107508694B (en) Node management method and node equipment in cluster
US20200125453A1 (en) Systems and methods for cross-regional back up of distributed databases on a cloud service
WO2016074167A1 (en) Lock server malfunction processing method and system thereof in distribution system
KR102114339B1 (en) Method for operating kubernetes system supporting active/standby model
CN112217847A (en) Micro service platform, implementation method thereof, electronic device and storage medium
CN111158949A (en) Configuration method, switching method and device of disaster recovery architecture, equipment and storage medium
CN116302716A (en) Cluster deployment method and device, electronic equipment and computer readable medium
CN111865632A (en) Switching method of distributed data storage cluster and switching instruction sending method and device
CN116302352A (en) Cluster disaster recovery processing method and device, electronic equipment and storage medium
CN114338684B (en) Energy management system and method
CN113126925B (en) Member list determining method, device and equipment and readable storage medium
US10853892B2 (en) Social networking relationships processing method, system, and storage medium
CN111581285A (en) Data information synchronization method and device, electronic equipment and medium
CN111708843A (en) Cross-data-center MySQL multi-activity implementation method based on MGR
CN114143330A (en) Configuration method, device and system of time server
CN109753292B (en) Method and device for deploying multiple applications in multiple single instance database service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination