CN110677288A - Edge computing system and method generally used for multi-scene deployment - Google Patents

Info

Publication number
CN110677288A
Authority
CN
China
Prior art keywords
management
host
service
management server
configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910911688.1A
Other languages
Chinese (zh)
Inventor
黄舒泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang 99Cloud Information Service Co Ltd
Original Assignee
Zhejiang 99Cloud Information Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang 99Cloud Information Service Co Ltd
Priority to CN201910911688.1A
Publication of CN110677288A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0823 Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

The invention relates to the field of edge cloud computing, and in particular to an edge computing system and method generally used for multi-scene deployment. The system and method comprise a cloud platform, characterized in that the cloud platform contains a plurality of management servers: a configuration management server, a fault management server, a host management server, a service management server and a software management server. The beneficial effects are as follows: the deployment is lightweight and can be placed flexibly to work in harsh environments; the reduced footprint improves the robustness of the system; when a server goes down, the system can recover automatically and return to service in a very short time; the maintenance cost of possible future system expansion is reduced; and the system offers ultra-low latency and improves the capability for high-complexity computation.

Description

Edge computing system and method generally used for multi-scene deployment
Technical Field
The invention relates to the field of edge cloud computing, in particular to an edge computing system and method generally used for multi-scene deployment.
Background
In the prior art, as 5G technology matures, a cloud operating system must be deployed, according to customer needs, in remote areas or for regional services that are not traditionally strong in technology, in order to meet their data processing requirements. A traditional cloud operating system, however, is large, complex to build, demanding of its deployment environment, and requires a large machine room. In addition, the management mode of a traditional cloud platform cannot configure and monitor the system remotely, responds slowly to system errors, and incurs high operating and maintenance costs.
Disclosure of Invention
The invention aims to provide an edge computing system and method generally used for multi-scene deployment. The technical solution adopted by the invention is as follows:
The invention discloses an edge computing system and method generally used for multi-scene deployment, comprising a cloud platform, characterized in that the cloud platform contains a plurality of management servers, namely a configuration management server, a fault management server, a host management server, a service management server and a software management server, wherein:
configuration management is responsible for the installation and configuration of each component; each time the system starts, the system inventory service, the controller configuration service and the compute configuration service are all re-executed, so that the system quickly returns to its normal configuration after a restart;
fault management can count alarms and inspect logs, and covers both the physical and the virtual resources of the central cloud and the edge cloud;
host management can monitor hardware resources, and collects and synchronizes virtual machine alarms, key processes and hardware (H/W) faults from the resource orchestration service, service management and configuration management; when a virtual host shuts down, host management can automatically restart it using different scheduling policies chosen according to the cluster state, key processes, resource thresholds, physical-host faults and the like;
service management uses multiple communication channels to guard against loss of communication and service split-brain, and monitors service state;
software management provides a life-cycle management mechanism that addresses virtual machine downtime during upgrades: when live migration is needed, resources on the host to be updated are automatically moved to an available host, and are automatically redistributed to the updated host once the update completes.
Further, when a virtual host is powered off, host management may automatically restart it using different scheduling policies chosen according to the cluster state, key processes, resource thresholds, physical-host faults and the like.
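The choice of a restart policy from the cluster state, key processes, resource thresholds and physical-host faults can be illustrated with a minimal Python sketch. The policy names, the `HostContext` fields, the 0.9 CPU threshold and the decision order are all assumptions made for illustration; the patent does not specify them.

```python
from dataclasses import dataclass
from enum import Enum


class RestartPolicy(Enum):
    # Hypothetical policy names; the patent only says "different scheduling policies".
    RESTART_IN_PLACE = "restart on the same physical host"
    RESTART_ELSEWHERE = "restart on another physical host"
    DEFER = "wait until the cluster is healthy"


@dataclass
class HostContext:
    cluster_healthy: bool        # cluster state
    key_processes_ok: bool       # key processes on the physical host
    cpu_load: float              # fraction of CPU in use, 0.0 to 1.0
    physical_host_faulty: bool   # hardware fault reported for the physical host


CPU_THRESHOLD = 0.9  # assumed resource threshold, not taken from the patent


def choose_restart_policy(ctx: HostContext) -> RestartPolicy:
    """Pick a policy for restarting a virtual host that has shut down."""
    if not ctx.cluster_healthy:
        return RestartPolicy.DEFER
    if ctx.physical_host_faulty or not ctx.key_processes_ok:
        return RestartPolicy.RESTART_ELSEWHERE
    if ctx.cpu_load > CPU_THRESHOLD:
        return RestartPolicy.RESTART_ELSEWHERE
    return RestartPolicy.RESTART_IN_PLACE
```

A real scheduler would weigh more inputs than this; the sketch only shows how the four named factors could each steer the decision.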
The invention has the following beneficial effects: when a server goes down, the system can recover automatically and return to service in a very short time; the maintenance cost of possible future system expansion is reduced; and the system offers ultra-low latency and improves the capability for high-complexity computation.
Drawings
FIG. 1 is an operating system architecture diagram of the present invention;
FIG. 2 is a schematic diagram of configuration management of the present invention;
FIG. 3 is a schematic diagram of the fault management of the present invention;
FIG. 4 is a schematic diagram of host management of the present invention;
FIG. 5 is a schematic diagram of the service management of the present invention;
FIG. 6 is a software management schematic of the present invention;
FIG. 7 is an architecture diagram of a conventional cloud platform, provided as a supplementary description;
Detailed Description
The invention will be further described with reference to the following figures and examples.
The invention discloses an edge computing system and method generally used for multi-scene deployment, comprising a cloud platform, characterized in that the cloud platform contains a plurality of management servers, namely a configuration management server, a fault management server, a host management server, a service management server and a software management server, wherein:
configuration management is responsible for the installation and configuration of each component; each time the system starts, the system inventory service, the controller configuration service and the compute configuration service are all re-executed, so that the system quickly returns to its normal configuration after a restart;
fault management can count alarms and inspect logs, and covers both the physical and the virtual resources of the central cloud and the edge cloud;
host management can monitor hardware resources, and collects and synchronizes virtual machine alarms, key processes and hardware (H/W) faults from the resource orchestration service, service management and configuration management; when a virtual host shuts down, host management can automatically restart it using different scheduling policies chosen according to the cluster state, key processes, resource thresholds, physical-host faults and the like;
service management uses multiple communication channels to guard against loss of communication and service split-brain, and monitors service state;
software management provides a life-cycle management mechanism that addresses virtual machine downtime during upgrades: when live migration is needed, resources on the host to be updated are automatically moved to an available host, and are automatically redistributed to the updated host once the update completes.
Further, when a virtual host is powered off, host management may automatically restart it using different scheduling policies chosen according to the cluster state, key processes, resource thresholds, physical-host faults and the like.
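The software-management life cycle (move resources off the host to be updated, update it, then hand the resources back) can be sketched as follows. This is a simplified illustration: the `placement` mapping, the round-robin choice of temporary hosts and the `apply_update` callback are hypothetical, and "redistributed to the updated host" is read here as returning the same workloads to it.

```python
from typing import Callable, Dict, List, Tuple


def update_host(
    target: str,
    placement: Dict[str, List[str]],
    apply_update: Callable[[str], None],
) -> None:
    """Evacuate `target`, update it, then move its workloads back."""
    spares = [host for host in placement if host != target]
    if not spares:
        raise RuntimeError("no available host to receive migrated resources")

    # Step 1: migrate every workload on the target to an available host.
    moved: List[Tuple[str, str]] = []  # (workload, temporary host)
    for i, workload in enumerate(list(placement[target])):
        dest = spares[i % len(spares)]
        placement[target].remove(workload)
        placement[dest].append(workload)
        moved.append((workload, dest))

    # Step 2: the target is now empty, so it can be updated or rebooted safely.
    apply_update(target)

    # Step 3: return the workloads to the updated host.
    for workload, dest in moved:
        placement[dest].remove(workload)
        placement[target].append(workload)
```

Keeping the three steps in one function makes the invariant easy to see: the update callback only ever runs against a host that carries no workloads.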
The system is explained below, taking a 1+1 high-availability dual-controller control cluster as an example:
i. FIG. 1 is the complete architecture diagram of the cloud operating system of the present invention. The system architecture includes control nodes, computing nodes, storage nodes, a virtual network element interface, an operation support system and a service support system, where the cloud computing platform, the virtual machines and the distributed storage system are the three bottom-layer components; the operation support system and the service support system exchange data with the control nodes, and the virtual network element interface exchanges computation results with the computing nodes; virtual machines are optimized on the computing nodes, and SR-IOV, OVS-DPDK and Intel network acceleration schemes are introduced in the networking part; the set of storage nodes forms a Ceph distributed storage scheme; virtual EPCs and virtual CPEs are deployed in virtual machines at the upper-layer virtual network element (VNF) interface to support telecom network elements;
ii. FIG. 2 is a schematic diagram of the configuration management service, in which sysinv provides state management for the entire software and modification of the system configuration, and controllerconfig/controlproteconfig is responsible for setting the system configuration according to the role of the physical node;
iii. FIG. 3 is a schematic diagram of fault management, in which other system modules send alarm and log information directly to FM-manager through the FM-API; the central log system of fault management can collect log information from all nodes in the system, and the fault management alarm system receives alarm information from nodes of every role;
iv. FIG. 4 is a schematic diagram of the host management service, showing the cooperation between host management and the other management services and monitoring modules. Host management uses rmon to monitor the usage of the CPU, memory and storage; uses pmon to manage basic processes and monitor the compute and block storage services; uses the hbs service to provide heartbeat detection for the platform; uses the hwmond service to provide management of the server BMC; and uses the MTC service to oversee the other service modules of the MTCE platform and to provide external interfaces;
v. FIG. 5 is a schematic diagram of service management, which consists of three components: a high-availability controller, a high-reliability message service and service monitoring. The high-availability controller follows a redundancy model: a 1+1 high-availability dual-controller cluster is adopted, in which the active and standby control nodes communicate in real time; when the active control node fails, the HA process is triggered automatically and the standby node becomes the active control node; the model can be extended to N+M or N control nodes. The high-reliability message service can use up to three independent communication paths to avoid communication split-brain, each path can be configured with a LAG-protected link, and messages are authenticated with HMAC SHA-512. Service monitoring may be active or passive;
vi. FIG. 6 is a schematic diagram of software management, which provides a patch creation tool and a patch management service, and supports both hot patches and reboot-required patches; a node must be restarted when a kernel patch is applied, and live migration of virtual machines ensures that service is not interrupted while a reboot-required patch is installed on the management node;
vii. FIG. 7 is an architecture diagram of a conventional cloud platform, given as a supplementary description. A conventional cloud platform places computing nodes, network nodes and storage nodes in a resource pool composed of stacks; users call the corresponding resources of the resource pool through an API; the bottom layer includes physical storage, network switches and servers, and also includes stacks.
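The message authentication and multi-path heartbeat described for service management can be sketched with Python's standard `hmac` and `hashlib` modules. The shared key, the path names (`mgmt`, `infra`, `oam`) and the rule that the peer counts as alive if any one path delivers a valid heartbeat are assumptions for illustration only; the patent states just that up to three independent paths and HMAC SHA-512 are used.

```python
import hashlib
import hmac
from typing import Dict, Optional, Tuple

# Hypothetical shared secret; a real deployment needs secure key provisioning.
SHARED_KEY = b"example-shared-key"


def sign(payload: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Return the HMAC SHA-512 tag for a message."""
    return hmac.new(key, payload, hashlib.sha512).digest()


def verify(payload: bytes, tag: bytes, key: bytes = SHARED_KEY) -> bool:
    """Check a tag in constant time to avoid timing side channels."""
    return hmac.compare_digest(sign(payload, key), tag)


Heartbeat = Optional[Tuple[bytes, bytes]]  # (payload, tag), or None if nothing arrived


def peer_is_alive(heartbeats: Dict[str, Heartbeat]) -> bool:
    """The peer is alive if at least one path carried an authentic heartbeat.

    A standby node that only promotes itself when this returns False on every
    path avoids split-brain caused by the loss of a single link.
    """
    return any(
        msg is not None and verify(msg[0], msg[1])
        for msg in heartbeats.values()
    )
```

For example, a standby controller could call `peer_is_alive({"mgmt": None, "infra": (beat, sign(beat)), "oam": None})` and stay in standby, because one path still delivers an authenticated heartbeat.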
List of abbreviations, English terms and key term definitions:
KVM (Kernel-based Virtual Machine): a virtualization infrastructure in the Linux kernel that turns the kernel into a virtual machine monitor (hypervisor);
EPC (Evolved Packet Core): a core network that has only a packet domain and no circuit domain, is based on an all-IP structure, separates control from bearer, and has a flattened network architecture;
CPE (Customer Premises Equipment): a mobile signal access device that receives mobile signals and forwards them as wireless Wi-Fi signals;
Ceph: an open source software storage platform that implements object storage on a single distributed computer cluster and provides interfaces for object-level, block-level and file-level storage;
SR-IOV (Single Root I/O Virtualization): a hardware virtualization acceleration scheme, originally designed for sharing network resources among virtual machines;
OVS-DPDK: a virtual machine acceleration scheme that combines Open vSwitch with DPDK.
The invention is lightweight to deploy and can be deployed flexibly to work in harsh environments. Its reduced footprint improves the robustness of the system; when a server goes down, the system can recover automatically and return to service in a very short time; and the maintenance cost of possible future system expansion is reduced. It also offers ultra-low latency and improves the capability for high-complexity computation.
The present invention is not limited to the above embodiments; any technical solution identical or similar to the present invention that is derived in light of the present invention falls within its scope of protection.
The techniques, shapes, and configurations not described in detail in the present invention are all known techniques.

Claims (2)

1. An edge computing system and method generally used for multi-scene deployment, comprising a cloud platform, characterized in that the cloud platform contains a plurality of management servers, namely a configuration management server, a fault management server, a host management server, a service management server and a software management server, wherein:
configuration management is responsible for the installation and configuration of each component; each time the system starts, the system inventory service, the controller configuration service and the compute configuration service are all re-executed, so that the system quickly returns to its normal configuration after a restart;
fault management can count alarms and inspect logs, and covers both the physical and the virtual resources of the central cloud and the edge cloud;
host management can monitor hardware resources, and collects and synchronizes virtual machine alarms, key processes and hardware (H/W) faults from the resource orchestration service, service management and configuration management; when a virtual host shuts down, host management can automatically restart it using different scheduling policies chosen according to the cluster state, key processes, resource thresholds, physical-host faults and the like;
service management uses multiple communication channels to guard against loss of communication and service split-brain, and monitors service state;
software management provides a life-cycle management mechanism that addresses virtual machine downtime during upgrades: when live migration is needed, resources on the host to be updated are automatically moved to an available host, and are automatically redistributed to the updated host once the update completes.
2. The edge computing system and method generally used for multi-scene deployment according to claim 1, characterized in that: when a virtual host shuts down, host management can automatically restart it using different scheduling policies chosen according to the cluster state, key processes, resource thresholds, physical-host faults and the like.
CN201910911688.1A 2019-09-25 2019-09-25 Edge computing system and method generally used for multi-scene deployment Pending CN110677288A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910911688.1A CN110677288A (en) 2019-09-25 2019-09-25 Edge computing system and method generally used for multi-scene deployment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910911688.1A CN110677288A (en) 2019-09-25 2019-09-25 Edge computing system and method generally used for multi-scene deployment

Publications (1)

Publication Number Publication Date
CN110677288A (en) 2020-01-10

Family

ID=69079017

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910911688.1A Pending CN110677288A (en) 2019-09-25 2019-09-25 Edge computing system and method generally used for multi-scene deployment

Country Status (1)

Country Link
CN (1) CN110677288A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111597043A (en) * 2020-05-14 2020-08-28 行星算力(深圳)科技有限公司 Method, device and system for calculating edge of whole scene
CN112737934A (en) * 2020-12-28 2021-04-30 常州森普信息科技有限公司 Cluster type Internet of things edge gateway device and method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113515316A (en) * 2021-07-29 2021-10-19 广州高维网络科技有限公司 Novel edge cloud operating system

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
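OPENINFRA: "StarlingX overview and detailed feature explanation", 《Retrieved from the Internet: <URL: HTTPS://BLOG.CSDN.NET/OPENINFRA/ARTICLE/DETAILS/97299626 >》 *
OPENINFRA: "Machine learning optimization for edge computing based on StarlingX", 《Retrieved from the Internet: <URL:HTTPS://BLOG.CSDN.NET/OPENINFRA/ARTICLE/DETAILS/89395640?SPM=1001.2101.3001.6650.1&UTM_MEDIUM=DISTRIBUTE.PC_RELEVANT.NONE-TASK-BLOG-2%7EDEFAULT%7EBLOGCOMMENDFROMBAIDU%7EDEFAULT-1.HIGHLIGHTWORDSCORE&DEPTH_1-UTM_SOURCE=DIS>》 *
凌云时刻: "[Practical knowledge sharing] Introduction to StarlingX, virtualization-layer software for telecom cloud and edge cloud", 《Retrieved from the Internet: <URL:HTTPS://BLOG.CSDN.NET/BJCHENXU/ARTICLE/DETAILS/107036191>》 *
(no author): "OpenStack StarlingX components explained in detail", 《Retrieved from the Internet: <URL: HTTPS://WWW.SOHU.COM/A/273284053_609513》 *
李振江 et al.: "Research on StarlingX, an edge computing IaaS platform architecture", 《Proceedings of the 2019 National Edge Computing Academic Symposium》 *
边缘计算社区: "Understanding StarlingX in one article", 《Retrieved from the Internet: <URL: HTTPS://BLOG.CSDN.NET/WEIXIN_41033724/ARTICLE/DETAILS/99145584>》 *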

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200110