CN109284204B - Big data platform operation and maintenance method and system based on virtualization computing - Google Patents

Big data platform operation and maintenance method and system based on virtualization computing

Info

Publication number
CN109284204B
Authority
CN
China
Prior art keywords
virtual machine
server
mirror image
image
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811047942.XA
Other languages
Chinese (zh)
Other versions
CN109284204A (en)
Inventor
黄桥藩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Sinoregal Software Co ltd
Original Assignee
Fujian Sinoregal Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Sinoregal Software Co ltd
Priority to CN201811047942.XA
Publication of CN109284204A
Application granted
Publication of CN109284204B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Stored Programmes (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a big data platform operation and maintenance method based on virtualization computing. The method comprises: establishing a virtual machine image that contains a system configuration file and a user environment variable; after the virtual machine image is generated, storing it in an image warehouse server and managing image versions in the chronological order in which the images were generated; and sending the virtual machine image of the required version to a server, which then runs it. The invention also provides a big data platform operation and maintenance system based on virtualization computing. Both are intended to make maintenance easier for users.

Description

Big data platform operation and maintenance method and system based on virtualization computing
Technical Field
The invention relates to a big data platform operation and maintenance method and system based on virtualization computing.
Background
Existing big data operation and maintenance connects to each server remotely over SSH using the server's IP address and manages the server by executing commands through automation scripts. A big data cluster often contains hundreds of servers, so manual maintenance is very inefficient and very difficult. The large number of operation and maintenance scripts are disorganized, their content is repetitive, and their quality is hard to guarantee, which ultimately leaves hidden risks of failure. In addition, the expected state of the live network is neither defined nor managed: the current state of the live network is entirely the product of scripts accumulated over time. This causes server state drift and in turn leaves hidden risks for service stability.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a big data platform operation and maintenance method and system based on virtualization computing that replace traditional distributed script execution with distributed container replacement, making maintenance easier for users.
The first aspect of the invention is realized as follows: a big data platform operation and maintenance method based on virtualization computing comprises the following steps:
step 1, establishing a virtual machine image, wherein the virtual machine image comprises a system configuration file and a user environment variable;
step 2, after the virtual machine image is generated, storing the virtual machine image in an image warehouse server, and managing image versions in the chronological order in which the virtual machine images were generated;
step 3, sending the virtual machine image of the corresponding version to a server as required, the server running the virtual machine image.
Further, step 4, collecting log files of the whole big data server cluster, identifying abnormal servers through log analysis, and feeding back information about the servers found to be abnormal.
Further, step 5, performing algorithm analysis on the abnormal servers fed back, and judging whether an image recovery operation is needed; if an image recovery operation is needed, informing the image warehouse server to push the normal image to the abnormal server and update it.
The second aspect of the invention is realized as follows: a big data platform operation and maintenance system based on virtualization computing comprises:
an establishing module, used for establishing a virtual machine image, wherein the virtual machine image comprises a system configuration file and a user environment variable;
a management module, used for storing the virtual machine image in the image warehouse server after the virtual machine image is generated, and managing image versions in the chronological order in which the virtual machine images were generated;
and an updating module, used for sending the virtual machine image of the corresponding version to a server as required, the server running the virtual machine image.
Further, the system comprises a log module, used for collecting log files of the whole big data server cluster, identifying abnormal servers through log analysis, and feeding back information about the servers found to be abnormal.
Further, the system comprises a recovery module, used for performing algorithm analysis on the abnormal servers fed back and judging whether an image recovery operation is needed; if an image recovery operation is needed, the image warehouse server is informed to push the normal image to the abnormal server and update it.
Further, establishing the virtual machine image specifically comprises: instantiating the virtual machine image as a container in the virtual machine of the server, running the container on the virtual machine of the server, and finally saving the virtual machine as the virtual machine image.
The invention has the following advantages:
1) Version management of the big data platform through virtualization: virtualization separates the system environment and configuration from the data, which enables fast recovery and iterative upgrades of the big data cluster.
2) Automated failure recovery: when a failed server node appears in the cluster, the automatic operation and maintenance system of the big data platform can complete image recovery of the failed node automatically, so that failed nodes of the big data platform recover without manual action.
3) Reduced operation and maintenance labor cost: script-based configuration management is converted into image-version-based configuration management. The automatic operation and maintenance system can greatly reduce the cost of manual intervention and achieve most of the goals of automated, intelligent operation and maintenance.
Drawings
The invention is further described below through embodiments and with reference to the accompanying drawing.
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
As shown in FIG. 1, the big data platform operation and maintenance method based on virtualization computing of the present invention includes:
step 1, establishing a virtual machine image, wherein the virtual machine image comprises a system configuration file and a user environment variable;
step 2, after the virtual machine image is generated, storing the virtual machine image in an image warehouse server, and managing image versions in the chronological order in which the virtual machine images were generated;
step 3, sending the virtual machine image of the corresponding version to a server as required, the server running the virtual machine image;
step 4, collecting log files of the whole big data server cluster, identifying abnormal servers through log analysis, and feeding back information about the servers found to be abnormal;
step 5, performing algorithm analysis on the abnormal servers fed back, and judging whether an image recovery operation is needed; if an image recovery operation is needed, informing the image warehouse server to push the normal image to the abnormal server and update it.
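The five steps can be read as one control loop. The following is a minimal Python sketch under stated assumptions, not the patented implementation: the in-memory data structures, the function names (`build_image`, `store_image`, `deploy`, `find_abnormal`, `recover`), the rule that any log line containing "ERROR" marks a server as abnormal, and the choice of the newest stored image as the "normal" image are all hypothetical. The patent does not specify the analysis algorithm.

```python
from datetime import datetime


def build_image(config_files, env_vars):
    # Step 1: the image carries only system configuration and user environment variables.
    return {"config": dict(config_files), "env": dict(env_vars)}


def store_image(warehouse, image, created_at):
    # Step 2: keep (created_at, image) pairs sorted by generation time.
    warehouse.append((created_at, image))
    warehouse.sort(key=lambda entry: entry[0])


def deploy(servers, server_id, image):
    # Step 3: the target server runs the image it was sent.
    servers[server_id] = image


def find_abnormal(cluster_logs):
    # Step 4 (hypothetical rule): a server is abnormal if any of its log lines contains "ERROR".
    return sorted(
        sid for sid, lines in cluster_logs.items()
        if any("ERROR" in line for line in lines)
    )


def recover(warehouse, servers, abnormal_ids):
    # Step 5 (hypothetical policy): treat the newest stored image as the "normal" image.
    _, normal_image = warehouse[-1]
    for sid in abnormal_ids:
        deploy(servers, sid, normal_image)
    return list(abnormal_ids)
```

In this sketch the warehouse is just a time-sorted list, so "managing versions in chronological order" reduces to keeping that list sorted; a real warehouse server would persist images and expose them over the network.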
The big data platform operation and maintenance system based on virtualization computing of the present invention comprises:
an establishing module, used for establishing a virtual machine image, wherein the virtual machine image comprises a system configuration file and a user environment variable;
a management module, used for storing the virtual machine image in the image warehouse server after the virtual machine image is generated, and managing image versions in the chronological order in which the virtual machine images were generated;
an updating module, used for sending the virtual machine image of the corresponding version to a server as required, the server running the virtual machine image;
a log module, used for collecting log files of the whole big data server cluster, identifying abnormal servers through log analysis, and feeding back information about the servers found to be abnormal;
and a recovery module, used for performing algorithm analysis on the abnormal servers fed back and judging whether an image recovery operation is needed; if an image recovery operation is needed, the image warehouse server is informed to push the normal image to the abnormal server and update it.
The virtual machine image is instantiated to produce a container, which runs on the server. Through the file mapping mechanism of the virtual machine, files on the server are mapped directly into the virtual machine, so that the virtual machine can access the files on the server directly.
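The file mapping described above resembles a bind mount in container runtimes. The patent does not name a runtime, so the sketch below assumes the Docker CLI purely for illustration: it builds (but does not execute) a `docker run` argument list that maps host directories into the container with the `-v host_path:container_path` flag. The helper name `docker_run_args`, the image tag, and the paths are hypothetical.

```python
def docker_run_args(image, name, mounts, read_only=False):
    """Build an argument list for `docker run` that bind-mounts host paths.

    `mounts` is a list of (host_path, container_path) pairs. Assumption: Docker
    is the container runtime; the patent itself does not specify one.
    """
    args = ["docker", "run", "-d", "--name", name]
    for host_path, container_path in mounts:
        spec = f"{host_path}:{container_path}"
        if read_only:
            spec += ":ro"  # mount read-only inside the container
        args += ["-v", spec]
    args.append(image)
    return args


# Example: business data stays on the host and is only mapped into the container.
print(docker_run_args("bigdata-config:20180910", "bigdata-node",
                      [("/data/hdfs", "/data/hdfs")]))
```

Because the data directory lives on the host and is only mapped in, replacing the container with another image version leaves the data untouched, which matches the separation of configuration and data the description relies on.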
One specific embodiment of the present invention:
The solution consists of: image generation based on container technology, image warehouse version management, a log monitoring system, and an automatic operation and maintenance system for the virtualized big data platform.
The main components are as follows:
A. Image generation server
1) Virtualization based on container technology: the underlying kernel of the Linux operating system is separated from the system configuration files, so the virtual machine image of the invention mainly contains the system configuration files, user environment variables and the like, and is a form of lightweight virtualization. That is, the underlying Linux kernel of the host serves as the container, virtual machines of various configuration versions run in this underlying container, and each virtual machine contains only the user's basic configuration and system environment variables.
2) Generation of lightweight image files: the data volume of a big data platform is enormous. The image file separates user business data from system configuration and environment variables, so the image of the invention does not contain the business data of the big data platform and remains lightweight. A lightweight image allows the system image of the big data platform to be updated within seconds, so the configuration version of the big data platform can be switched quickly without affecting normal operation of the service.
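The separation of business data from configuration can be illustrated with a small file-selection step run before an image is built. This is a sketch under assumptions: the function name `split_config_and_data`, the example paths, and the idea of declaring data directories up front are hypothetical and not taken from the patent.

```python
from pathlib import PurePosixPath


def split_config_and_data(paths, data_dirs):
    """Partition file paths into (config_paths, data_paths).

    A path counts as business data if it is one of `data_dirs` or lies beneath
    one of them. Only the config paths would go into the lightweight image.
    """
    data_roots = [PurePosixPath(d) for d in data_dirs]
    config_paths, data_paths = [], []
    for raw in paths:
        path = PurePosixPath(raw)
        under_data = any(root == path or root in path.parents for root in data_roots)
        (data_paths if under_data else config_paths).append(raw)
    return config_paths, data_paths


config, data = split_config_and_data(
    ["/etc/hadoop/core-site.xml", "/data/hdfs/blk_1", "/etc/profile.d/hadoop.sh"],
    ["/data"],
)
print(config)  # files that would be packed into the image
print(data)    # files left on the host
```

Keeping the image limited to the first list is what keeps it small enough to swap quickly; the second list would stay on the host and be reached through file mapping.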
B. Image warehouse server
1) Image version management: after an image is generated, it is stored in the image warehouse server, which manages image versions in the chronological order in which they were generated;
2) Image version release: the system version of the big data platform can be rolled back through the image versions held by the image manager to restore the system, or the system and configuration of the big data platform can be upgraded by releasing a system image version that has been configured in advance.
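Time-ordered version management with rollback and upgrade can be sketched as follows. The class and method names (`ImageVersion`, `ImageWarehouse`, `latest`, `previous`) and the tag strings are hypothetical; the patent only states that versions are managed in order of generation time and can be rolled back or released for upgrade.

```python
from bisect import insort
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(order=True)
class ImageVersion:
    created_at: datetime                # the only field used for ordering
    tag: str = field(compare=False)     # hypothetical version label


class ImageWarehouse:
    def __init__(self):
        self._versions = []             # always sorted by created_at

    def add(self, version):
        insort(self._versions, version)

    def versions(self):
        return list(self._versions)

    def latest(self):
        """Upgrade target: the most recently generated image."""
        return self._versions[-1]

    def previous(self, tag):
        """Rollback target: the image generated immediately before `tag`."""
        tags = [v.tag for v in self._versions]
        index = tags.index(tag)         # raises ValueError for an unknown tag
        if index == 0:
            raise LookupError("no earlier version to roll back to")
        return self._versions[index - 1]
```

Using the generation timestamp as the sort key means an image registered late still lands in its chronological position, so `previous` always returns the version that actually preceded the given one in time.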
C. Log monitoring system
1) Log collection and analysis: log files of the whole big data server cluster are collected, and abnormal servers are identified by analyzing the logs.
2) Abnormality feedback: information such as the server ID of each server found to be abnormal in log analysis is fed back to the automatic operation and maintenance system of the big data platform.
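The patent does not describe the log format or the analysis rule. As one possible reading, the sketch below assumes each log line has the hypothetical form `<server_id> <LEVEL> <message>` and reports the IDs of servers whose count of `ERROR` lines reaches a threshold. The function name `abnormal_servers` and the default threshold are illustrative assumptions.

```python
from collections import Counter


def abnormal_servers(log_lines, error_threshold=3):
    """Return sorted server IDs whose ERROR line count is at least `error_threshold`.

    Assumed line format (hypothetical): "<server_id> <LEVEL> <message>".
    Lines that do not match this shape are ignored.
    """
    errors = Counter()
    for line in log_lines:
        parts = line.split(maxsplit=2)
        if len(parts) >= 2 and parts[1] == "ERROR":
            errors[parts[0]] += 1
    return sorted(sid for sid, count in errors.items() if count >= error_threshold)
```

The returned list of server IDs corresponds to the feedback sent to the automatic operation and maintenance system; a production rule would likely also consider time windows and error types.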
D. Automatic operation and maintenance system of the big data platform
Abnormal server handling decision: algorithm analysis is performed on the abnormal servers fed back by the log server to judge whether an image recovery operation is required. If so, the image warehouse server is informed to push the normal image version to the abnormal server and update it; because the big data platform runs in HA mode, taking one server offline for the switch does not affect the service during that time. If no image recovery operation is required, no task needs to be processed.
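The decision step can be sketched as a predicate plus a scheduler that limits how many servers are offline at once, reflecting the remark that an HA platform tolerates one server being switched at a time. The criteria used here (an error-count threshold and a configuration-drift flag), the names `needs_recovery` and `recovery_batches`, and the `max_offline` parameter are hypothetical; the patent leaves the analysis algorithm unspecified.

```python
def needs_recovery(error_count, config_drifted, error_threshold=5):
    # Hypothetical criteria: recover on configuration drift or on too many errors.
    return config_drifted or error_count >= error_threshold


def recovery_batches(reports, max_offline=1):
    """Group servers that need image recovery into batches of at most `max_offline`.

    `reports` maps server_id -> (error_count, config_drifted). Batches are meant
    to be processed one after another so the HA cluster keeps serving requests.
    """
    targets = sorted(
        sid for sid, (errors, drifted) in reports.items()
        if needs_recovery(errors, drifted)
    )
    return [targets[i:i + max_offline] for i in range(0, len(targets), max_offline)]
```

An empty result corresponds to the case where no recovery is needed and no task is processed.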
Configuration management of the normal image file: a server can run the normal image file and verify the relevant big data configuration files (so that an operator can check whether the configuration matches the actual requirements). This enables centralized management of subsequent configuration files: editing the configuration on one server is enough to upgrade the big data configuration files across the whole cluster.
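The verification an operator performs can be approximated by comparing the configuration found in a running image with the expected values. This is a hedged sketch: the function name `config_mismatches` is hypothetical, and the property names in the example are borrowed from Hadoop only for illustration, since the patent does not name a specific big data framework.

```python
def config_mismatches(expected, actual):
    """Return {key: (expected_value, actual_value)} for every key that differs.

    A key missing on one side is reported with None for that side.
    """
    keys = sorted(set(expected) | set(actual))
    return {
        key: (expected.get(key), actual.get(key))
        for key in keys
        if expected.get(key) != actual.get(key)
    }


# Example with illustrative property names.
expected = {"dfs.replication": "3", "fs.defaultFS": "hdfs://nn:8020"}
actual = {"dfs.replication": "2", "fs.defaultFS": "hdfs://nn:8020"}
print(config_mismatches(expected, actual))
```

An empty result indicates that the image's configuration matches the requirement and the image can be released to the rest of the cluster.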
Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.

Claims (4)

1. A big data platform operation and maintenance method based on virtualization computing, characterized in that the method comprises the following steps:
step 1, establishing a virtual machine image, wherein the virtual machine image comprises a system configuration file of a big data platform system and a user environment variable;
step 2, after the virtual machine image is generated, storing the virtual machine image in an image warehouse server, and managing image versions in the chronological order in which the virtual machine images were generated;
step 3, sending the virtual machine image of the corresponding version to a server as required, the server running the virtual machine image;
step 4, collecting log files of the whole big data server cluster, identifying abnormal servers through log analysis, and feeding back information about the servers found to be abnormal;
and step 5, performing algorithm analysis on the abnormal servers fed back, and judging whether an image recovery operation is needed; if an image recovery operation is needed, informing the image warehouse server to push the normal image to the abnormal server and update it.
2. The big data platform operation and maintenance method based on virtualization computing according to claim 1, wherein establishing the virtual machine image specifically comprises: instantiating the virtual machine image as a container in the virtual machine of the server, running the container on the virtual machine of the server, and finally saving the virtual machine as the virtual machine image.
3. A big data platform operation and maintenance system based on virtualization computing, characterized in that the system comprises:
an establishing module, used for establishing a virtual machine image, wherein the virtual machine image comprises a system configuration file of a big data platform system and a user environment variable;
a management module, used for storing the virtual machine image in the image warehouse server after the virtual machine image is generated, and managing image versions in the chronological order in which the virtual machine images were generated;
an updating module, used for sending the virtual machine image of the corresponding version to a server as required, the server running the virtual machine image;
a log module, used for collecting log files of the whole big data server cluster, identifying abnormal servers through log analysis, and feeding back information about the servers found to be abnormal;
and a recovery module, used for performing algorithm analysis on the abnormal servers fed back and judging whether an image recovery operation is needed; if an image recovery operation is needed, the image warehouse server is informed to push the normal image to the abnormal server and update it.
4. The big data platform operation and maintenance system based on virtualization computing according to claim 3, wherein establishing the virtual machine image specifically comprises: instantiating the virtual machine image as a container in the virtual machine of the server, running the container on the virtual machine of the server, and finally saving the virtual machine as the virtual machine image.
CN201811047942.XA 2018-09-10 2018-09-10 Big data platform operation and maintenance method and system based on virtualization computing Active CN109284204B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811047942.XA CN109284204B (en) 2018-09-10 2018-09-10 Big data platform operation and maintenance method and system based on virtualization computing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811047942.XA CN109284204B (en) 2018-09-10 2018-09-10 Big data platform operation and maintenance method and system based on virtualization computing

Publications (2)

Publication Number Publication Date
CN109284204A (en) 2019-01-29
CN109284204B (en) 2022-10-25

Family

ID=65183901

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811047942.XA Active CN109284204B (en) 2018-09-10 2018-09-10 Big data platform operation and maintenance method and system based on virtualization computing

Country Status (1)

Country Link
CN (1) CN109284204B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111061540B (en) * 2019-11-27 2023-05-23 北京计算机技术及应用研究所 Application virtualization method and system based on container technology
CN111953788A (en) * 2020-08-17 2020-11-17 浪潮云信息技术股份公司 Large-scale cloud platform

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105550012A (en) * 2015-12-07 2016-05-04 国云科技股份有限公司 Method for custom recovery of malfunctioning virtual machine
CN106199696A (en) * 2016-06-29 2016-12-07 中国石油天然气股份有限公司 Earthquake data processing system and method
CN107294772A (en) * 2017-05-23 2017-10-24 甘肃万维信息技术有限责任公司 One kind combines Docker and realizes dynamic management and monitoring service system
CN107395762A (en) * 2017-08-30 2017-11-24 四川长虹电器股份有限公司 A kind of application service based on Docker containers accesses system and method

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100505640C (en) * 2006-01-26 2009-06-24 腾讯科技(深圳)有限公司 A method and system for software upgrade
US8615501B2 (en) * 2008-06-23 2013-12-24 International Business Machines Corporation Hypervisor service to provide image version control support
CN101540799B (en) * 2009-04-23 2013-01-16 深圳市融创天下科技股份有限公司 Mobile terminal software upgrading method
CN101594387A (en) * 2009-06-29 2009-12-02 北京航空航天大学 The virtual cluster deployment method and system
US20130139183A1 (en) * 2011-11-28 2013-05-30 Wyse Technology Inc. Creation or installation of a disk image for a target device having one of a plurality of hardware platforms
US9218192B2 (en) * 2013-01-13 2015-12-22 International Business Machines Corporation Information handling device locally reproducing received defects by selecting an appropriate virtual machine image
CN105912382A (en) * 2016-04-07 2016-08-31 浪潮电子信息产业股份有限公司 Mirror image management device, system and method
CN106528224B (en) * 2016-11-03 2020-08-04 腾讯科技(深圳)有限公司 Content updating method, server and system for Docker container
CN107066296B (en) * 2017-03-31 2020-09-25 北京奇艺世纪科技有限公司 Method and device for cleaning mirror image in cluster node
CN107196803B (en) * 2017-05-31 2019-11-22 中国人民解放军信息工程大学 The dynamic generation and maintaining method of isomery cloud host

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CloudVS: Enabling version control for virtual machines in an open-source cloud under commodity settings;Chung Pan Tang等;《2012 IEEE Network Operations and Management Symposium》;20120420;第188-195页 *
浅析Docker容器技术的发展前景;易升海等;《电信工程技术与标准化》;20180614;第31卷(第6期);第88-91页 *

Also Published As

Publication number Publication date
CN109284204A (en) 2019-01-29

Similar Documents

Publication Publication Date Title
CN107689953B (en) Multi-tenant cloud computing-oriented container security monitoring method and system
CN110222036B (en) Method and system for automated database migration
CN110764786A (en) Optimized deployment resource and software delivery platform in cloud computing environment
CN108521339B (en) Feedback type node fault processing method and system based on cluster log
CN107992392B (en) Automatic monitoring and repairing system and method for cloud rendering system
CN108881477B (en) Distributed file acquisition monitoring method
CN104007994B (en) Updating method, upgrading method and upgrading system based on strategy storeroom interaction
US8930964B2 (en) Automatic event correlation in computing environments
CN105893225A (en) Automatic error processing method and device
US8539285B2 (en) Systems for agile error determination and reporting and methods thereof
US8032779B2 (en) Adaptively collecting network event forensic data
US20110296397A1 (en) Systems and methods for generating cached representations of host package inventories in remote package repositories
CN104679574A (en) Virtual machine image management system in cloud computing
CN103490941A (en) Real-time monitoring on-line configuration method in cloud computing environment
CN109240716B (en) Big data platform version management and rapid iterative deployment method and system
CN109284204B (en) Big data platform operation and maintenance method and system based on virtualization computing
CN111367618A (en) Code management method, system, terminal and medium based on docker
US10574552B2 (en) Operation of data network
CN104899505A (en) Software detection method and software detection device
US20140173065A1 (en) Automated configuration planning
CN115086148A (en) Optical network alarm processing method, system, equipment and storage medium
CN106126419A (en) The adjustment method of a kind of application program and device
CN110011827A (en) Towards doctor conjuncted multi-user's big data analysis service system and method
US11431571B2 (en) Monitoring time-base policy domain architecture
CN112068981A (en) Knowledge base-based fault scanning recovery method and system in Linux operating system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 350000 21 / F, building 5, f District, Fuzhou Software Park, 89 software Avenue, Gulou District, Fuzhou City, Fujian Province

Applicant after: FUJIAN SINOREGAL SOFTWARE CO.,LTD.

Address before: Floor 20-21, building 5, area F, Fuzhou Software Park, 89 software Avenue, Gulou District, Fuzhou City, Fujian Province 350000

Applicant before: FUJIAN SINOREGAL SOFTWARE CO.,LTD.

GR01 Patent grant