CN106100894A - A highly reliable cluster operation management method - Google Patents

Publication number
CN106100894A
Authority
CN
China
Prior art keywords
cluster
management
highly reliable
control
web
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610542731.8A
Other languages
Chinese (zh)
Other versions
CN106100894B (en)
Inventor
向友君
张莉婷
吴宗泽
张勰
蔡旭坤
李凯鑫
苏春晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201610542731.8A
Publication of CN106100894A
Application granted
Publication of CN106100894B
Current legal status: Expired - Fee Related

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L63/0478 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload applying multiple layers of encryption, e.g. nested tunnels or encrypting the content with a first key and then with at least a second key
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/18 Network architectures or network communication protocols for network security using different networks or channels, e.g. using out of band channels
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

The invention discloses a highly reliable cluster operation management method, which specifically includes: (1) web access and HTTP-based scheduling and issuing of highly reliable cluster control commands: a web-based cluster O&M management platform is built to provide remote and visual management of the cluster, and load balancing and redundant fault tolerance are applied across the access layer, the scheduling layer and the central control layer, making web-based cluster O&M management reliable; (2) transmission and issuing of highly reliable cluster control commands: during transmission, the data are encrypted with the AES algorithm and the AES key is encrypted with the RC4 algorithm, and the encrypted data are base64-encoded and then sent through an SSH tunnel, making the data of cluster operation management reliable; (3) execution and feedback of highly reliable cluster control commands: an extensible central O&M control system for the cluster is built that supports multiple configuration management frameworks as well as user-defined configuration frameworks, making the central control of cluster operation management reliable.

Description

A highly reliable cluster operation management method
Technical field
The present invention relates to the technical field of IT operation management, and in particular to a highly reliable cluster operation management method.
Background
With the rapid development of Internet technology and the endless emergence of competing products that offer the same type of service, users place stricter requirements on service quality. Facing this pressure from users, Internet companies usually deploy their services on distributed clusters and rely on the high performance, high reliability and high scalability of such clusters to meet this challenge. As distributed clusters become widespread and their internal dependencies grow more complex, cluster management is increasingly the key to providing a robust service and has become one of the hot research topics in academia and industry. If operation personnel build and deploy the cluster environment and manage service configuration by hand, the work is not only inefficient and unreliable but also hard to migrate and extend, and inconvenient to manage.
To cope with the sharp increase in deployment workload as clusters grow, the configuration differences between heterogeneous server hosts, and the configuration management and extension of the cluster environment, a new cluster operation management mode must be designed for large-scale automated cluster operation. A cluster operation management method should specifically cover functions such as automated deployment, host status monitoring, batch task execution, machine configuration management and log auditing.
Summary of the invention
The object of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing a highly reliable cluster operation management method. Building on load-balancing-based redundancy and disaster tolerance, and in combination with the configuration management framework SaltStack, the method designs and implements safe and reliable cluster operation management. It offers a simple and effective management scheme for small and medium-sized clusters and enables safe and reliable remote control of the cluster. Deployment is performed automatically according to the host environment, which reduces errors caused by manual deployment, shortens deployment time, improves overall deployment efficiency, and provides a mechanism for managing service configuration continuously over the long term.
The object of the present invention is achieved through the following technical solution:
A highly reliable cluster operation management method, the method comprising the following steps:
S1, web access and HTTP-based scheduling and issuing of highly reliable cluster control commands: HTTP servers in a dual-machine hot-standby arrangement are built on LVS+Keepalived load technology, supporting manual hot switching and automatic switching on failure; the cluster operation management Web platform is built on the Nginx+Tornado web framework, with Nginx providing load balancing and reverse proxying;
S2, transmission and issuing of highly reliable cluster control commands: when control data are transmitted, the data are encrypted with the AES algorithm and the key is encrypted with RC4; after base64 encoding they are sent through an SSH secure tunnel, which suits a central O&M control system managed through RPyC remote communication within the Tornado web framework;
S3, execution and feedback of highly reliable cluster control commands: the central O&M control system is compatible with multiple configuration frameworks, specifically including SaltStack and Func, and supports custom configuration frameworks; the SaltStack platform manages and controls the cluster node hosts.
Further, step S1, web access and HTTP-based scheduling and issuing of highly reliable cluster control commands, includes:
S1.1, configuring LVS to build the access layer of the cluster operation management platform and to load-balance the access layer; configuring Keepalived to build dual-machine hot standby for the access layer of the cluster operation platform, modifying the key Keepalived configuration and writing a shell script to achieve semi-manual, automatic switching between the master and backup HTTP servers;
S1.2, configuring Nginx to build the scheduling-layer HTTP server of the cluster operation platform and modifying the key configuration of the Nginx reverse proxy section to achieve load balancing and request scheduling for the back-end Web servers; designing a Tornado program to build the Web server layer of the cluster operation platform, and developing the Web server's management interface and business logic based on MVC.
Further, step S2, transmission and issuing of highly reliable cluster control commands, includes:
S2.1, encrypting and encoding the data in the TCP data segment using AES, RC4 and base64;
S2.2, establishing an SSH trust relationship between the cluster operation management platform and the central O&M control system, and transmitting the encrypted data through an SSH secure tunnel.
Further, step S3, execution and feedback of highly reliable cluster control commands, includes:
S3.1, deploying the salt-minion and func-minion clients on the cluster service nodes, modifying the key configuration, and sending certificates to the cluster's central O&M control system that has already been set up;
S3.2, the central O&M control system accepting the certificates of all trusted nodes inside the cluster, thereby managing and controlling all trusted nodes, and feeding the execution results of control commands back to the upstream Web platform.
Further, in step S1, web access and HTTP-based scheduling and issuing of highly reliable cluster control commands, an LVS+Keepalived dual-machine hot-standby module, an Nginx HTTP reverse scheduling module and a Tornado Web service degradation scheduling mode are built, and the three together form a multi-level load balancing model.
Further, the data communication between the cluster operation platform and the central O&M control system uses RPyC remote calls with AES+RC4 encryption and base64 encoding; a session key is randomly generated and the data are transmitted through an SSH secure tunnel.
Further, in step S3, execution and feedback of highly reliable cluster control commands, the management and control of cluster node hosts implemented by the SaltStack platform specifically includes: remote command invocation, automated service deployment, service configuration management, service performance monitoring and log auditing.
Further, step S3, execution and feedback of highly reliable cluster control commands, uses automated deployment, data collection and monitoring, and service configuration management built on SaltStack, wherein the automated deployment is centrally managed through configuration files in YAML format.
Further, in step S3, execution and feedback of highly reliable cluster control commands, the configuration frameworks with which the central O&M control system is compatible include SaltStack and Func.
Compared with the prior art, the present invention has the following advantages and effects:
(1) A multi-layer load balancing model is proposed, which both avoids failures caused by overloading a single machine and provides redundancy and disaster tolerance for the cluster system, ensuring the high reliability of the O&M visualization platform.
(2) A multi-platform distributed central control system model is proposed, in which multiple platforms back each other up to keep the basic O&M functions highly reliable, ensuring the high reliability of the O&M central control system.
(3) A model of multiple encryption plus an encrypted tunnel is adopted, which prevents control data from being eavesdropped on or tampered with while in transit over untrusted networks, ensuring the security and reliability of data communication in the O&M system.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of cluster operation management in the method of the present invention;
Fig. 2 is a flow chart of how the method of the present invention secures cluster operation management.
Detailed description
To make the object, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein only explain the present invention and are not intended to limit it.
Embodiment one
Referring to Fig. 1, which is the flow chart of the steps of cluster operation management in this embodiment, the highly reliable cluster operation management method shown in Fig. 1 specifically includes the following steps:
S1, web access and HTTP-based scheduling and issuing of highly reliable cluster control commands: HTTP servers in a dual-machine hot-standby arrangement are built on LVS+Keepalived load technology, supporting manual hot switching and automatic switching on failure; the cluster operation management Web platform is built on the Nginx+Tornado web framework, with Nginx providing load balancing and reverse proxying.
This step specifically includes:
S1.1, configuring LVS to build the access layer of the cluster operation management platform and to load-balance the access layer; configuring Keepalived to build dual-machine hot standby for the access layer of the cluster operation platform, modifying the key Keepalived configuration and writing a shell script to achieve semi-manual, automatic switching between the master and backup HTTP servers;
S1.2, configuring Nginx to build the scheduling-layer HTTP server of the cluster operation platform and modifying the key configuration of the Nginx reverse proxy section to achieve load balancing and request scheduling for the back-end Web servers; designing a Tornado program to build the Web server layer of the cluster operation platform, and developing the Web server's management interface and business logic based on MVC.
In step S1, LVS+Keepalived is used to form a highly reliable access layer that supports manual hot switching and provides effective access; Nginx provides load balancing and reverse proxying and, following a local-first scheduling principle, distributes requests evenly to the back-end web services for visual issuing.
The web access and HTTP-based scheduling and issuing of the highly reliable cluster control commands includes building an LVS+Keepalived dual-machine hot-standby module, an Nginx HTTP reverse scheduling module and a Tornado Web service degradation scheduling mode. The multi-level load balancing model formed by the three together focuses on solving the reliability problem of the cluster O&M method.
S2, transmission and issuing of highly reliable cluster control commands: when control data are transmitted, the data are encrypted with the AES algorithm and the key is encrypted with RC4; after base64 encoding they are sent through an SSH secure tunnel, which suits a central O&M control system managed through RPyC remote communication within the Tornado web framework.
This step specifically includes:
S2.1, encrypting and encoding the data in the TCP data segment using AES, RC4 and base64;
S2.2, establishing an SSH trust relationship between the cluster operation management platform and the central O&M control system, and transmitting the encrypted data through an SSH secure tunnel.
In step S2, the cluster operation platform communicates with the central O&M control system through RPyC remote calls. The transmitted data are encrypted and encoded with AES, RC4 and base64 and sent securely through an SSH secure tunnel.
The transmission and issuing of the highly reliable cluster control commands uses RPyC remote calls with AES+RC4 encryption and base64 encoding; a session key is randomly generated and the data are sent through an SSH secure tunnel, so that cluster control data can be transmitted safely and reliably.
This step focuses on solving the security problem of the O&M method.
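The key-handling part of step S2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: it assumes that a fresh session key is drawn for each session, that this key is encrypted with RC4 under a pre-shared secret, and that the result is base64-encoded before it enters the SSH tunnel. The AES encryption of the payload itself is left out because it needs a third-party library. The names `rc4`, `new_session_key`, `wrap_session_key` and `unwrap_session_key`, and the idea of a pre-shared secret, are hypothetical.

```python
import base64
import os


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; the same call both encrypts and decrypts."""
    # Key scheduling: permute 0..255 under control of the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Keystream generation, XORed with the input byte by byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def new_session_key(length: int = 16) -> bytes:
    """Draw a random per-session key (assumed to be used as the AES key)."""
    return os.urandom(length)


def wrap_session_key(shared_secret: bytes, session_key: bytes) -> str:
    """Encrypt the session key with RC4, then base64-encode it for transport."""
    return base64.b64encode(rc4(shared_secret, session_key)).decode("ascii")


def unwrap_session_key(shared_secret: bytes, wrapped: str) -> bytes:
    """Reverse of wrap_session_key on the receiving side."""
    return rc4(shared_secret, base64.b64decode(wrapped))
```

The receiving side recovers the session key with `unwrap_session_key` and would then use it to decrypt the AES-encrypted payload that travels alongside it.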
S3, execution and feedback of highly reliable cluster control commands: the central O&M control system is compatible with multiple configuration frameworks, specifically including SaltStack and Func, and supports custom configuration frameworks; the SaltStack platform manages and controls the cluster node hosts, which specifically includes: remote command invocation, automated service deployment, service configuration management, service performance monitoring and log auditing.
This step specifically includes:
S3.1, deploying the salt-minion and func-minion clients on the cluster service nodes, modifying the key configuration, and sending certificates to the cluster's central O&M control system that has already been set up;
S3.2, the central O&M control system accepting the certificates of all trusted nodes inside the cluster, thereby managing and controlling all trusted nodes, and feeding the execution results of control commands back to the upstream Web platform.
Here, the central O&M control system is designed to be compatible with multiple configuration management frameworks and provides the various basic management functions of cluster O&M; control commands are executed and their execution results are fed back to the upstream Web platform.
The execution and feedback of the highly reliable cluster control commands mainly uses modules built on SaltStack, such as automated deployment, data collection and monitoring, and service configuration management; the automated deployment module is mainly managed centrally through configuration files in YAML format.
Embodiment two
This embodiment gives the implementation process of a highly reliable cluster operation management method. The specific steps are as follows:
1) Deploy the basic environment.
According to the overall design of the cluster O&M system, the prototype system built here is divided into two sub-networks: the local O&M Web platform and the central O&M control system. The Gateway maps the external public IP address and port to the LVS virtual IP; WebNode is an O&M Web platform service node deployed on a local physical host, and in the minimal system two WebNode hosts implement all functions of the load balancing layer and the O&M Web platform; ControlNode is a service node of the O&M central control system, and ClusterNode is a service node inside the cluster system.
2) Deploy the access load layer.
First, the latest version of the Keepalived service software is installed from source and a simple environment configuration is performed. Then the Keepalived global configuration file /etc/Keepalived/Keepalived.conf is created; it is mainly divided into two parts, VRRP automatic failover (vrrp_instance) and Virtual Server load balancing (virtual_server).
The main function of the Nginx_check.sh script in the configuration is to check the Nginx service every 10 s and restart Nginx if it has failed. If the restart fails, the local service is considered unavailable, so the local Keepalived is stopped and traffic switches to the other host, which avoids invalid traffic. When the Virtual Server is set up, the weight of the local host is configured as 2, so that requests are preferentially forwarded locally, which effectively reduces unnecessary network traffic.
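The decision logic described for the check script can be sketched as follows. The original is a shell script whose contents are not given, so this is only a restatement of the described behaviour in Python; the three callables (`nginx_alive`, `restart_nginx`, `stop_keepalived`) and the returned status strings are hypothetical stand-ins for whatever commands the real script runs.

```python
import time
from typing import Callable


def check_once(
    nginx_alive: Callable[[], bool],
    restart_nginx: Callable[[], None],
    stop_keepalived: Callable[[], None],
) -> str:
    """Run one health check and return what happened."""
    if nginx_alive():
        return "ok"
    # Nginx is down: try a single restart first.
    restart_nginx()
    if nginx_alive():
        return "restarted"
    # Restart failed: give up the virtual IP so the peer host takes over.
    stop_keepalived()
    return "failed_over"


def watch(
    check: Callable[[], str],
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Repeat the check every interval_s seconds until a failover happens."""
    while check() != "failed_over":
        sleep(interval_s)
```

Passing the checks in as callables keeps the sketch independent of how Nginx and Keepalived are actually probed and controlled on a given host.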
3) Deploy the Nginx reverse proxy.
Nginx is installed and deployed on the WebNode1 and WebNode2 servers, and after the software's runtime environment is configured, the file /etc/Nginx/Nginx.conf is created. Following the local-first forwarding principle, the local load weight is set to 2.
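The effect of giving the local host a weight of 2 can be illustrated with a simple weighted round-robin. This is a simplified sketch and not Nginx's own scheduler; the peer weight of 1 is an assumption, since the text only states the local weight, and the function name `weighted_round_robin` is hypothetical.

```python
from itertools import cycle
from typing import Dict, Iterator


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Yield backend names forever, each in proportion to its weight."""
    # Repeat every backend name as many times as its weight, then loop.
    slots = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(slots)
```

Seen from WebNode1, with weights `{"WebNode1": 2, "WebNode2": 1}`, two out of every three requests stay on the local host and one is forwarded to the peer.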
4) Deploy the business layer.
Tornado is started in single-process, single-threaded mode. The WebNode1 and WebNode2 servers each open three such threads, on ports 8886 to 8888; 8886 and 8887 each communicate with a different O&M control machine, and 8888 serves as a standby thread that is used only when all other threads are busy. After receiving an HTTP request, Nginx passes it, according to the upstream load rules, to the specific Tornado business module at the back end for processing.
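The standby rule for port 8888 can be sketched as a small selection function. This is an illustration only: how a thread is detected as busy is not specified in the text, so the busy ports are simply passed in, and the sketch treats 8886 and 8887 as interchangeable even though each talks to a different O&M control machine. The names `PRIMARY_PORTS`, `STANDBY_PORT` and `pick_port` are hypothetical.

```python
from typing import Iterable, Optional

PRIMARY_PORTS = (8886, 8887)  # regular Tornado back ends
STANDBY_PORT = 8888           # used only when every primary is busy


def pick_port(busy: Iterable[int]) -> Optional[int]:
    """Return the port that should serve the next request, or None."""
    busy_set = set(busy)
    # Prefer the first primary back end that is not busy.
    for port in PRIMARY_PORTS:
        if port not in busy_set:
            return port
    # All primaries are busy: fall back to the standby, if it is free.
    if STANDBY_PORT not in busy_set:
        return STANDBY_PORT
    return None
```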
5) Deploy the RPyC server. The O&M central control system is the link between the O&M Web platform and the cluster service nodes. Its main function is to receive O&M control commands and forward them for execution, and it is implemented by two parts: the RPyC server and the service configuration management platform.
The RPyC server is the access and scheduling module of the O&M central control system, developed on the remote call protocol RPyC. Member methods named exposed_XX are defined in the Server class and can then be called remotely on the client side through the root attribute.
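The `exposed_` naming convention can be imitated in plain Python to show what it achieves. The sketch below does not use the RPyC library; `Root` is a hypothetical stand-in for the client-side root object, and the method names `exposed_run_command` and `reload_config` are invented for illustration.

```python
class Server:
    """Service class: only methods prefixed with exposed_ are remotely callable."""

    def exposed_run_command(self, target: str, command: str) -> str:
        # A real service would hand the command to the configuration platform.
        return f"{target}: ran {command!r}"

    def reload_config(self) -> None:
        """No exposed_ prefix, so remote callers cannot reach this method."""


class Root:
    """Stand-in for the client-side root object of a connection."""

    def __init__(self, service: object) -> None:
        self._service = service

    def __getattr__(self, name: str):
        # Accept both run_command and exposed_run_command as the lookup name.
        exposed = name if name.startswith("exposed_") else "exposed_" + name
        method = getattr(self._service, exposed, None)
        if method is None:
            raise AttributeError(f"{name!r} is not exposed")
        return method
```

With `root = Root(Server())`, the call `root.run_command("node1", "uptime")` reaches `exposed_run_command`, while `root.reload_config` raises `AttributeError`.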
6) Deploy the service configuration management platform. The Salt-Master service is deployed on ControlNode and the Salt-Minion service on ClusterNode; the service configuration files are modified (node identity, node IP, node grains information and so on), and the Salt-Minion certificates are then signed on the primary Salt-Master. Next, the Rsync synchronization software is used to synchronize the common Master configuration among the multiple Salt-Master hosts, which completes the basic environment of the SaltStack configuration framework.
In summary, starting from a survey of technical solutions common in industry, the present invention provides corresponding solutions to the key problems that cluster operation management has yet to solve properly: high reliability of the O&M Web platform, high reliability of the O&M central control system, and secure and reliable transmission of control data. On this basis it proposes a highly reliable cluster O&M system architecture with multi-layer load balancing and multiple cluster configuration management platforms, and uses multiple symmetric encryption to solve the communication security problem of the O&M system.
The embodiment of the present invention first uses LVS to provide the external virtual service and access scheduling, and uses Keepalived+Nginx to build a dual-machine duplex HTTP reverse proxy layer, which optimizes system resource utilization while improving the reliability of the O&M Web platform; secondly, following the O&M idea of service degradation, the service layer provides service by priority level, which further strengthens system reliability, avoids failures caused by overloading a single machine, and achieves high reliability of the O&M Web platform.
The Func management platform serves as the system's backup and is switched to manually when the SaltStack platform fails, which keeps the basic control module highly reliable. The SaltStack platform implements the control, deployment and monitoring modules of operation management; a distributed deployment with multiple Salt-Masters removes the single point of failure and improves service performance, achieving high reliability of the O&M central control system.
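The combination of several configuration frameworks with a manual switch can be sketched as a small dispatcher. This is a sketch under stated assumptions rather than the design of the invention: each backend is modelled as a plain callable that takes a target and a command, and nothing here calls the real SaltStack or Func APIs. The names `CentralController`, `register`, `switch_to` and `run` are hypothetical.

```python
from typing import Callable, Dict

# A backend takes (target, command) and returns the execution result.
Backend = Callable[[str, str], str]


class CentralController:
    """Dispatch control commands to whichever backend is currently active."""

    def __init__(self, backends: Dict[str, Backend], active: str) -> None:
        if active not in backends:
            raise KeyError(active)
        self._backends = dict(backends)
        self.active = active

    def register(self, name: str, backend: Backend) -> None:
        """Add a custom configuration framework under a new name."""
        self._backends[name] = backend

    def switch_to(self, name: str) -> None:
        """Manual switch, e.g. to the backup when the primary has failed."""
        if name not in self._backends:
            raise KeyError(name)
        self.active = name

    def run(self, target: str, command: str) -> str:
        """Execute a command through the active backend and return its result."""
        return self._backends[self.active](target, command)
```

An operator would construct the controller with a SaltStack-backed callable as the active backend and a Func-backed callable as the backup, and call `switch_to("func")` when the primary platform is down.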
For remote communication between the O&M Web platform and the O&M central control system, the RC4 and AES symmetric encryption algorithms are combined, and an encryption key is generated separately for each session, which reduces the likelihood that the encryption is cracked. In addition, SSH secure tunneling is introduced to encrypt the transmission channel, further ensuring that data are transmitted safely and reliably.
Tests along several dimensions, including load scheduling, high reliability and functional completeness of the system, verify that the scheme proposed by the present invention properly solves the key problems:
(1) The balanced scheduling of the multi-layer load module keeps the access-layer nodes of the system working in a multi-machine multiplexing mode, which both achieves effective access load balancing and, compared with a hot-standby mode, greatly improves the resource utilization of the system;
(2) The design of the O&M Web platform combines the O&M ideas of redundancy and disaster tolerance, automatic failover and service degradation, which on the one hand solves the single point of failure problem and on the other hand keeps high-priority services reliable when a failure occurs;
(3) The O&M central control system uses a distributed deployment with multiple Salt-Masters which, combined with the modular design of the O&M functions, provides redundancy and disaster tolerance and effectively solves the high-concurrency problem of cluster node management.
The above embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited by them; any change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and is included within the protection scope of the present invention.

Claims (9)

1. a highly reliable cluster operation management method, it is characterised in that described method comprises the following steps:
S1, the web of highly reliable cluster management and control order access the scheduling with http form and issue, and load based on LVS+Keepalive Technology builds the HTTP server of two-node cluster hot backup, supports that artificial hot-swap, fault automatically switch, based on Nginx+Tornado net Network framework technology builds cluster operation management Web platform, Nginx realize load balancing and reverse proxy;
S2, highly reliable cluster management and control order transmission with issue, management and control data transmission time respectively by aes algorithm encryption data, RC4 Encryption key, is transmitted by SSH secure tunnel after being encoded by base64, is suitable for RPYC long-range in Tornado network frame The central O&M control system of communication technology management;
S3, the execution of highly reliable cluster management and control order and feedback, central authorities' O&M control system compatibility various configurations framework, specifically wraps Including Satlstack, Func, and support to custom-configure framework, Saltstack platform realizes managing clustered node main frame Control.
A kind of highly reliable cluster operation management method the most according to claim 1, it is characterised in that described step S1, The scheduling issue that the web of highly reliable cluster management and control order accesses with http form includes:
S1.1, configuration LVS, it is achieved build cluster operation management platform Access Layer, it is achieved the load balancing of Access Layer;Configuration Keepalive builds the two-node cluster hot backup of cluster operation platform Access Layer, and amendment Keepalive key configuration also designs shell foot This realizes semi-artificial automatic switchover principal and subordinate HTTP server;
S1.2, configuration Nginx build cluster operation platform dispatch layer HTTP server, the pass of amendment Nginx reverse proxy part Key configures, it is achieved the load balancing of rear end Web server and request scheduling;Design tornado program builds cluster operation platform Web server layer, based on MVC exploitation Web server administration interface with service logic.
A kind of highly reliable cluster operation management method the most according to claim 1, it is characterised in that described step S2, The transmission of highly reliable cluster management and control order includes with issuing:
S2.1, tcp data segment use AES, RC4, base64 mode that data are encrypted coding;
SSH trusting relationship is set up, by the safe tunnel of SSH between S2.2, cluster operation management platform and central authorities' O&M control system Road transmission encrypted data.
A kind of highly reliable cluster operation management method the most according to claim 1, it is characterised in that described step S3, The execution of highly reliable cluster management and control order includes with feedback:
S3.1, cluster service node deployment salt-minion, func-minion client, revise key configuration, to putting up Cluster central authorities O&M control systems send certificate;
S3.2, central authorities' O&M control system manage the certificate accepting all trusted node of cluster internal, it is achieved trust joint to all The management and control of point, and the execution result of management and control order feeds back to upstream Web.
A kind of highly reliable cluster operation management method the most according to claim 1, it is characterised in that described step S1, The web of highly reliable cluster management and control order accesses during the scheduling with http form is issued and builds LVS+Keepalive two-node cluster hot backup mould Block, Nginx Http direction scheduler module, Tornado Web service degradation scheduling method, above-mentioned three's simultaneous is formed many Level Load Balancing Model.
The highly reliable cluster operation management method according to claim 3, characterized in that the data communication between the cluster O&M platform and the central O&M control system uses an RPyC remote scheduling mode with AES+RC4 encryption algorithms and base64 encoding, and a session key is randomly generated and transmitted through the SSH secure tunnel.
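Random session key generation, as named in the claim above, can be sketched with Python's `secrets` module. The 16-byte key length and the base64 wrapping for transport are assumptions, since the claim does not state them, and the function names are hypothetical.

```python
import base64
import secrets


def new_session_key(num_bytes: int = 16) -> bytes:
    """Generate a random session key from the OS cryptographic source."""
    return secrets.token_bytes(num_bytes)


def wrap_for_transport(key: bytes) -> str:
    """Encode the key as ASCII text so it can be sent through a text channel."""
    return base64.b64encode(key).decode("ascii")


def unwrap_from_transport(text: str) -> bytes:
    """Recover the raw key bytes on the receiving side."""
    return base64.b64decode(text)
```

Both ends would then share the same key after the wrapped value has travelled through the SSH tunnel described in step S2.2.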
The highly reliable cluster operation management method according to claim 1, characterized in that in step S3, the execution and feedback of highly reliable cluster control commands, the control of cluster node hosts implemented with the SaltStack platform specifically comprises: remote command invocation, automated service deployment, service configuration management, service performance monitoring, and log auditing.
The highly reliable cluster operation management method according to claim 1, characterized in that step S3, the execution and feedback of highly reliable cluster control commands, uses automated deployment, data acquisition and monitoring, and service configuration management built on SaltStack, wherein the automated deployment is centrally managed using configuration files in YAML format.
The highly reliable cluster operation management method according to claim 1, characterized in that in step S3, the execution and feedback of highly reliable cluster control commands, the central O&M control system is compatible with multiple configuration frameworks, including SaltStack and Func.
CN201610542731.8A 2016-07-11 2016-07-11 A kind of highly reliable cluster operation management method Expired - Fee Related CN106100894B (en)

Priority Applications (1)

Application Number Publication Number Priority Date Filing Date Title
CN201610542731.8A CN106100894B (en) 2016-07-11 2016-07-11 A kind of highly reliable cluster operation management method

Applications Claiming Priority (1)

Application Number Publication Number Priority Date Filing Date Title
CN201610542731.8A CN106100894B (en) 2016-07-11 2016-07-11 A kind of highly reliable cluster operation management method

Publications (2)

Publication Number Publication Date
CN106100894A true CN106100894A (en) 2016-11-09
CN106100894B CN106100894B (en) 2019-04-09

Family

ID=57219831

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN201610542731.8A Expired - Fee Related CN106100894B (en) 2016-07-11 2016-07-11 A kind of highly reliable cluster operation management method

Country Status (1)

Country Link
CN (1) CN106100894B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150169209A1 (en) * 2011-05-13 2015-06-18 General Electric Company System and method for multi-tasking of a medical imaging system
CN104683394A (en) * 2013-11-27 2015-06-03 上海墨芋电子科技有限公司 Cloud computing platform database benchmark test system for new technology and method thereof
CN105320773A (en) * 2015-11-03 2016-02-10 中国人民解放军理工大学 Distributed duplicated data deleting system and method based on Hadoop platform

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106777079A (en) * 2016-12-13 2017-05-31 苏州蜗牛数字科技股份有限公司 A kind of daily record data Visualized Analysis System and method
CN106789225A (en) * 2016-12-13 2017-05-31 广州唯品会信息科技有限公司 A kind of method and device of interface operation port mapping configuration
CN108241565A (en) * 2016-12-26 2018-07-03 航天信息股份有限公司 A kind of system and method for being used to implement application system automation O&M
CN106850305A (en) * 2017-02-16 2017-06-13 郑州云海信息技术有限公司 A kind of IT operation management method and device
CN107193670B (en) * 2017-05-25 2021-03-26 苏州浪潮智能科技有限公司 Remote management method, device and system for cluster workstations
CN107193670A (en) * 2017-05-25 2017-09-22 郑州云海信息技术有限公司 A kind of method for remote management of cluster of workstation, apparatus and system
CN109558256B (en) * 2017-09-26 2023-04-07 北京国双科技有限公司 Controlled terminal automatic recovery method and device
CN109558256A (en) * 2017-09-26 2019-04-02 北京国双科技有限公司 Controlled terminal automatic recovery method and device
CN108549545A (en) * 2018-04-20 2018-09-18 武汉极意网络科技有限公司 A kind of project organization method and system based on tornado frames
CN110633564A (en) * 2018-06-25 2019-12-31 北京国双科技有限公司 File generation method and device
CN110633564B (en) * 2018-06-25 2022-01-14 北京国双科技有限公司 File generation method and device
CN109086189A (en) * 2018-07-23 2018-12-25 郑州云海信息技术有限公司 A kind of physical infrastructure manager PIM alert processing method and equipment
CN113296840A (en) * 2020-02-20 2021-08-24 银联数据服务有限公司 Cluster operation and maintenance method and device
CN112559519A (en) * 2020-12-09 2021-03-26 北京红山信息科技研究院有限公司 Big data cluster management system
CN112751709A (en) * 2020-12-29 2021-05-04 北京浪潮数据技术有限公司 Management method, device and system of storage cluster
CN112751709B (en) * 2020-12-29 2023-01-10 北京浪潮数据技术有限公司 Management method, device and system of storage cluster
CN112818045A (en) * 2021-01-22 2021-05-18 辽宁长江智能科技股份有限公司 Data access unified management platform for big data
CN113489684A (en) * 2021-06-11 2021-10-08 快乐购有限责任公司 Communication device, method for calling service between cloud and intranet
CN113343269A (en) * 2021-06-28 2021-09-03 迈普通信技术股份有限公司 Encryption method and device

Also Published As

Publication number Publication date
CN106100894B (en) 2019-04-09

Similar Documents

Publication Publication Date Title
CN106100894B (en) A kind of highly reliable cluster operation management method
CN104935672B (en) Load balancing service high availability implementation method and equipment
US7761573B2 (en) Seamless live migration of virtual machines across optical networks
Collins et al. Online payments by merely broadcasting messages
CN103596652B (en) A kind of network control method and device
KR101408037B1 (en) Virtual Machine Integration Monitoring Apparatus and method for Cloud system
CN101883108B (en) Document transmission method and system of dynamic authentication
CN103197952A (en) Management system and method aiming at maintenance and deployment of application system based on cloud infrastructure
WO2016206456A1 (en) Physical machine upgrading method, service migration method and apparatus
CN103986786A (en) Remote cloud desktop operation system
CN105554015A (en) Management network and method for multi-tenant container cloud computing system
CN102932455B (en) Construction method based on cloud computing render farms
WO2012149718A1 (en) Method for cloud terminal to access cloud server in cloud computing system, and cloud computing system
CN105049419A (en) Mimicry-network step-by-step exchange routing system based on heterogeneous diversity
CN113778615B (en) Rapid and stable network shooting range virtual machine construction system
CN106775993A (en) A kind of physical machine is migrated to the method and system of cloud computing platform
CN103581325A (en) Cloud computing resource pool system and implement method thereof
CN108833610A (en) A kind of information updating method, apparatus and system
CN116389105B (en) Remote access management platform and management method
CN203135901U (en) Encryption equipment management device
CN105227577A (en) Unified database access agent equalization methods under a kind of multi-client
CN110213359A (en) A kind of car networking networking data delivery system and method based on D2D
CN1988465A (en) Managing and monitoring method for dynamic IP network VPN
CN103067476B (en) A kind of dynamic network reconstruction method based on virtual machine
CN103297514A (en) Virtual machine management platform and virtual machine management method based on cloud infrastructure

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190409

Termination date: 20210711