WO2022161430A1 - Edge cloud system, edge management and control method, management and control node, and storage medium - Google Patents

Edge cloud system, edge management and control method, management and control node, and storage medium

Info

Publication number
WO2022161430A1
Authority
WO
WIPO (PCT)
Prior art keywords
control
management
edge
node
cluster
Prior art date
Application number
PCT/CN2022/074248
Other languages
English (en)
French (fr)
Inventor
何淋波 (HE Linbo)
郭飞 (GUO Fei)
Original Assignee
Alibaba Group Holding Limited (阿里巴巴集团控股有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Limited (阿里巴巴集团控股有限公司)
Publication of WO2022161430A1 publication Critical patent/WO2022161430A1/zh

Classifications

    • G — PHYSICS › G06 — COMPUTING; CALCULATING OR COUNTING › G06F — ELECTRIC DIGITAL DATA PROCESSING › G06F 9/00 — Arrangements for program control, e.g. control units › G06F 9/06 — using stored programs › G06F 9/46 — Multiprogramming arrangements
    • G06F 9/4843 — Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/505 — Allocation of resources to service a request, the resource being a machine (e.g. CPUs, servers, terminals), considering the load
    • G06F 9/5072 — Grid computing (under G06F 9/5061 — Partitioning or combining of resources)
    • G06F 9/542 — Event management; Broadcasting; Multicasting; Notifications
    • G06F 9/546 — Message passing systems or structures, e.g. queues
    • G06F 2209/547 — Messaging middleware (indexing scheme relating to G06F 9/54)

Definitions

  • the present application relates to the field of edge cloud technologies, and in particular, to an edge cloud system, an edge management and control method, a management and control node, and a storage medium.
  • Edge computing is a form of distributed computing in which data processing, storage, and similar tasks are handled by edge nodes closer to the terminal. Being close to the data source helps reduce service response delay and bandwidth costs.
  • With the continued development of edge computing, a large number of applications need to be deployed on the edge side.
  • As a cloud-native technology, containers are lightweight and portable, which makes them well suited to carrying application instances in edge computing scenarios; and because containers naturally have a good affinity for applications, application instances can be deployed or shut down quickly and easily in a short period of time to match real-time traffic on the edge side.
  • How to orchestrate and schedule edge containers is a technical problem faced by the fusion of cloud native and edge computing (referred to as cloud-edge fusion).
  • At present, the open-source container orchestration and scheduling system Kubernetes is used to solve the problem of container orchestration and scheduling in the cloud-native and edge-computing fusion scenario: the master component of Kubernetes can be hosted on the cloud, the worker component can be deployed on edge computing nodes, and the worker component connects to the master component in the cloud over the public network; the master component and the worker component cooperate to implement the orchestration and scheduling of containers in cloud-edge fusion scenarios.
  • However, due to objective factors such as the public-network connection between the cloud and the edge side, and the delay and instability of that public network, applications running on the edge side often fall out of cloud control, and the serviceability of edge applications cannot be guaranteed.
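One plausible way to detect whether cloud-side control over an edge cluster still holds across an unstable public network is a heartbeat timeout. The sketch below is a minimal illustration under that assumption; the class name, heartbeat mechanism, and 30-second threshold are illustrative and not specified by the application.

```python
import time

class ControlConditionMonitor:
    """Toy monitor: decide whether the cloud still meets the
    management-and-control condition for an edge cluster, based on
    heartbeats received over the (possibly unstable) public network."""

    def __init__(self, timeout_s=30.0):
        self.timeout_s = timeout_s
        self.last_heartbeat = {}  # cluster_id -> timestamp of last heartbeat

    def record_heartbeat(self, cluster_id, now=None):
        self.last_heartbeat[cluster_id] = time.monotonic() if now is None else now

    def condition_met(self, cluster_id, now=None):
        # The condition fails when no heartbeat has arrived within the
        # timeout window, e.g. during a public-network outage.
        t = time.monotonic() if now is None else now
        last = self.last_heartbeat.get(cluster_id)
        return last is not None and (t - last) <= self.timeout_s
```

In this sketch, a cluster whose last heartbeat is older than the timeout is treated as out of cloud control, which is the situation the edge autonomy described below is designed to handle.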
  • Various aspects of the present application provide an edge cloud system, an edge management and control method, a management and control node, and a storage medium, which are used to realize edge autonomy under a cloud-edge fusion architecture and ensure the serviceability of edge applications.
  • Embodiments of the present application provide an edge cloud system, including: a central management and control node, and at least one edge cluster connected to the central management and control node over a network; each edge cluster includes an edge management and control node and an edge computing node, and containerized applications can be deployed on the edge computing node.
  • The edge management and control node is used to manage and control the containerized applications in the target edge cluster to which it belongs during the period when the central management and control node does not meet the management and control conditions for that cluster, so that the containerized applications continue to provide services.
  • The central management and control node is configured to, after the management and control conditions for the target edge cluster are re-satisfied, manage and control the containerized applications in the target edge cluster again, based on the management and control status of those applications before the conditions ceased to be satisfied.
  • An embodiment of the present application also provides an edge management and control method applicable to an edge management and control node. The method includes: determining that a central management and control node in an edge cloud system does not meet the management and control conditions for a target edge cluster; and, during the period when the central management and control node does not meet those conditions, managing and controlling the containerized applications in the target edge cluster so that the containerized applications continue to provide services. The target edge cluster is the edge cluster in the edge cloud system to which the edge management and control node belongs.
  • An embodiment of the present application also provides an edge management and control method applicable to a central management and control node. The method includes: during the period when the management and control conditions for a target edge cluster in the edge cloud system are not met, determining that those conditions are re-satisfied; and, based on the management and control status of the containerized applications in the target edge cluster before the conditions ceased to be met, managing and controlling those containerized applications again. The target edge cluster is any edge cluster in the edge cloud system; during the period when the central management and control node does not meet the management and control conditions for the target edge cluster, the containerized applications in the target edge cluster are managed and controlled by the edge management and control node in that cluster.
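The takeover and handback described above can be sketched as a small state machine; all class, field, and method names here are illustrative assumptions, not terminology from the application.

```python
from enum import Enum

class Controller(Enum):
    CENTRAL = "central"   # cloud-side central management-and-control node
    EDGE = "edge"         # in-cluster edge management-and-control node

class ClusterControlState:
    """While the central node's control condition fails, the edge node
    controls the cluster; when the condition is re-satisfied, control
    returns to the center based on the management-and-control status
    recorded before the interruption."""

    def __init__(self):
        self.controller = Controller.CENTRAL
        self.saved_state = None  # control-status snapshot before the outage

    def on_condition_lost(self, current_status):
        if self.controller is Controller.CENTRAL:
            self.saved_state = current_status   # remember pre-outage status
            self.controller = Controller.EDGE   # edge autonomy takes over

    def on_condition_restored(self):
        self.controller = Controller.CENTRAL
        return self.saved_state  # center resumes from this snapshot
```

The key design point mirrored here is that the center does not resume from scratch: it consults the status of the containerized applications as it stood before the conditions ceased to be met.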
  • Embodiments of the present application further provide an edge management and control node, including a memory and a processor; the memory is used to store a computer program, and the processor, coupled to the memory, is used to execute the computer program in order to: determine that the central management and control node in the edge cloud system to which the edge management and control node belongs does not meet the management and control conditions for the target edge cluster; and, during the period when those conditions are not met, manage and control the containerized applications in the target edge cluster so that they continue to provide services. The target edge cluster is the edge cluster in the edge cloud system to which the edge management and control node belongs.
  • An embodiment of the present application further provides a central management and control node, including a memory and a processor; the memory is used to store a computer program, and the processor, coupled to the memory, is used to execute the computer program in order to: during the period when the management and control conditions for the target edge cluster are not met, determine that those conditions are re-satisfied; and, based on the management and control status of the containerized applications in the target edge cluster before the conditions ceased to be met, manage and control those containerized applications again. The target edge cluster is any edge cluster in the edge cloud system; during the period when the central management and control node does not meet the management and control conditions for the target edge cluster, the containerized applications in that cluster are managed and controlled by the edge management and control node in the cluster.
  • Embodiments of the present application further provide a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the processor is caused to implement the steps in any of the methods provided by the embodiments of the present application.
  • Embodiments of the present application further provide a computer program product, including a computer program/instructions; when the computer program/instructions are executed by a processor, the processor is caused to implement the steps in any of the methods provided by the embodiments of the present application.
  • In the embodiments of the present application, a cloud-edge fusion architecture is provided in which, in addition to the central management and control node, an edge management and control node is added in each edge cluster. The edge management and control node can manage and control the containerized applications in the edge cluster where it is located, and this dual management and control by the center and the edge can greatly improve the edge autonomy of the cloud-edge fusion architecture.
  • FIG. 1a is a schematic structural diagram of an edge cloud system provided by an exemplary embodiment of the present application.
  • FIG. 1b is a schematic diagram of the data-interaction relationships among a central management and control node, an edge management and control node, and an edge computing node in an edge cloud system according to an embodiment of the present application;
  • FIG. 2a is a schematic flowchart of an edge management and control method provided by an exemplary embodiment of the present application
  • FIG. 2b is a schematic flowchart of another edge management and control method provided by an exemplary embodiment of the present application.
  • FIG. 2c is a schematic flowchart of still another edge management and control method provided by an exemplary embodiment of the present application.
  • FIG. 3a is a schematic structural diagram of an edge management and control apparatus provided by an exemplary embodiment of the present application.
  • FIG. 3b is a schematic structural diagram of an edge management and control node according to an exemplary embodiment of the present application.
  • FIG. 4a is a schematic structural diagram of another edge management and control apparatus provided by an exemplary embodiment of the present application.
  • FIG. 4b is a schematic structural diagram of a central management and control node according to an exemplary embodiment of the present application.
  • FIG. 1a is a schematic structural diagram of an edge cloud system provided by an exemplary embodiment of the present application.
  • the edge cloud system 100 includes: a central management and control node 101 , and at least one edge cluster 102 networked with the central management and control node 101 .
  • Each edge cluster 102 includes an edge management and control node 102a and an edge computing node 102b, and containerized applications can be deployed on the edge computing node 102b.
  • The edge cloud system 100 in this embodiment is a cloud computing platform built on edge infrastructure based on cloud computing technology and edge computing capabilities, forming a network system close to the edge with computing, network, storage, and security capabilities.
  • Edge cloud is a relative concept.
  • Edge cloud refers to a cloud computing platform that is relatively close to the terminal.
  • The terminal here refers to the demand side of cloud computing services, for example, a terminal or user terminal in the Internet, or a terminal or user terminal in the Internet of Things.
  • The edge cloud system 100 in this embodiment is different from a central cloud or traditional cloud computing platform: the central cloud or traditional cloud computing platform may consist of a data center with large-scale, centrally located resources, while the edge cloud system 100 in this embodiment includes at least one edge cluster 102.
  • These edge clusters 102 cover a wider range of networks and therefore have the characteristics of being closer to terminals.
  • the resource scale of a single edge cluster 102 is small, but the number of edge clusters 102 is relatively large.
  • Each edge cluster 102 includes a series of edge infrastructures, including but not limited to: distributed data centers (DCs), wireless computer rooms or clusters, operators' communication networks, core network equipment, base stations, edge gateways, home gateways, edge devices such as computing devices and/or storage devices, and the corresponding network environments.
  • The edge cluster 102 may be implemented as an Internet Data Center (IDC) located at the edge, that is, one edge IDC is one edge cluster 102 in this embodiment of the present application; alternatively, the edge cluster 102 may be implemented as a computer room located at the edge, that is, one computer room is one edge cluster 102 in this embodiment of the present application.
  • The locations, capabilities, and included infrastructure of different edge clusters 102 may be the same or different. Based on this edge infrastructure, an edge cluster 102 can provide various resources externally, such as resources with computing capabilities (CPUs, GPUs, servers, computing devices), resources with storage capabilities (memory, hard disks), and network resources (bandwidth). In this embodiment, resources with computing capabilities in the edge cluster 102, such as servers and computing devices, are called edge computing nodes 102b; each edge cluster 102 includes at least one edge computing node 102b.
  • The edge cloud system 100 in this embodiment can be applied to various application scenarios such as Content Delivery Network (CDN), e-commerce, games, audio and video, Internet of Things, logistics, industrial brain, and urban brain, and provides cloud computing services to end users in those scenarios.
  • In each application scenario, an application that can provide cloud computing services for that scenario (hereinafter referred to as an application) may be deployed in an edge cluster 102 in the edge cloud system 100; deploying an application in an edge cluster 102 is actually the process of deploying the application on an edge computing node 102b in that edge cluster 102.
  • For example, in an online shopping (e-commerce) scenario, an application providing online shopping functions can be deployed on an edge computing node 102b in the edge cluster 102, for example the server of an online shopping application; interaction between the server and shopping terminals provides shopping users with online shopping functions. In a game scenario, an application providing online game functions can be deployed on an edge computing node 102b in the edge cluster 102, for example the server of an online game application, which interacts with game terminals to provide online game services to game users. In the audio and video field, applications providing audio and video functions can be deployed on edge computing nodes 102b in the edge cluster 102, for example a live-broadcast server, an on-demand server, or a video-surveillance server; interaction between these servers and playback terminals provides viewing users with services such as live broadcast, on-demand, or monitoring.
  • In this embodiment, the cloud-native technology of containers is adopted, that is, applications are carried by containers and deployed in units of containers, which can realize a fusion architecture of cloud native and edge computing, referred to as the cloud-edge fusion architecture. In this embodiment, an application carried in a container is referred to as a containerized application, and may also be called a container instance for short.
  • When the edge computing node 102b hosting a containerized application fails, a live-migration operation for the containerized application will be involved; or, when the resources of the edge computing node 102b hosting the containerized application are insufficient, operations on the containerized application, such as resource expansion or migration, will be involved.
  • a central management and control node 101 is deployed, and the central management and control node 101 can use the edge cluster 102 as a control object to manage and control the containerized applications in each edge cluster 102 .
  • the central management and control node 101 can manage and control the containerized applications in the corresponding edge cluster 102 according to the service demand information submitted by the edge service demander.
  • The service requirement information submitted by the edge service demander may be a requirement to deploy containerized applications in a specified area, to change the service-quality requirements of containerized applications, to upgrade containerized applications, to increase or decrease the number of containerized applications, and so on.
  • the edge service demander refers to a party that needs to provide cloud computing services for the edge cluster 102 .
  • the central control node 101 can also automatically monitor the running status of the containerized applications in the edge cluster 102, and manage and control these containerized applications accordingly.
  • the running status of the containerized application mainly refers to whether the containerized application is running normally.
  • In this embodiment, the management and control of containerized applications includes at least one of the following: deployment, reconstruction, upgrade, migration, resource expansion, resource reduction, shutdown, restart, and release. Further, no matter what kind of management and control is performed on a containerized application, the central management and control node 101 can implement the management and control operation through the edge computing nodes 102b in the edge cluster 102.
  • Specifically, the central management and control node 101 can generate the first management and control data required for managing the containerized applications in the edge cluster 102 according to the service requirement information submitted by the edge service demander and/or the monitored running status of the containerized applications, and deliver the first management and control data to the corresponding edge computing node 102b in the edge cluster 102; according to the first management and control data, the edge computing node 102b executes at least one of the following management and control operations for the containerized application to be managed: deployment, upgrade, migration, resource expansion, resource reduction, shutdown, restart, and release.
  • The first management and control data refers to the data required by the edge computing node 102b to manage and control the containerized applications in the edge cluster 102 to which it belongs; the data includes the type of control operation and the various actions and parameters related to the control. It should be noted that, depending on the service requirement information and/or the running state of the containerized application, the content of the first management and control data and the type of the indicated control operation will differ, and the control operation performed by the edge computing node 102b will also differ. An exemplary illustration is given below:
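As a rough sketch of such management and control data and its delivery, the structure below models an operation type plus its parameters and a validity check against the operation list above; all field and function names are assumptions for illustration only.

```python
from dataclasses import dataclass, field

# The management-and-control operation types listed above; names assumed.
OPERATIONS = {"deploy", "rebuild", "upgrade", "migrate",
              "expand", "shrink", "shutdown", "restart", "release"}

@dataclass
class ControlData:
    """Illustrative model of the 'first management and control data'
    sent from the central node to an edge computing node."""
    operation: str                              # type of control operation
    app_id: str                                 # containerized application to act on
    params: dict = field(default_factory=dict)  # actions/parameters of the control

    def __post_init__(self):
        if self.operation not in OPERATIONS:
            raise ValueError(f"unknown operation: {self.operation}")

def dispatch(control_data, send):
    """Deliver control data to the edge computing node via a transport
    callback `send(payload)` supplied by the caller."""
    payload = {"op": control_data.operation,
               "app": control_data.app_id,
               "params": control_data.params}
    send(payload)
    return payload
```

The transport is left as a callback because the application does not fix a particular protocol between the central node and the edge computing nodes.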
  • the central management and control node 101 externally provides a requirement submission entry, and the requirement submission entry may be a web page, an application page, a command window, or the like.
  • The function of the requirement submission entry is to allow the edge service demander to submit its service requirement information to the central management and control node 101.
  • Taking application deployment as an example, the edge service demander can provide the central management and control node 101 with service requirement information requesting the deployment of a containerized application through the requirement submission entry provided by the central management and control node 101. The service requirement information includes edge cluster selection parameters, resource selection parameters, and information pointing to the image file required by the application. The edge cluster selection parameters include the region where the edge cluster is located and/or performance requirements for the edge cluster, and are mainly used to select an edge cluster; the resource selection parameters include the resource type, the amount of resources, and performance requirements for resource devices, and are mainly used to select an edge computing node 102b in the edge cluster 102; the information pointing to the image file required by the application can be the storage address of the image file, or the access address of a device that can provide the image file, and is used to obtain the image file required by the application.
  • Based on the service requirement information, the central management and control node 101 selects an edge cluster that meets the parameter requirements from the at least one edge cluster 102 according to the edge cluster selection parameters; further, it schedules an edge computing node 102b in the target edge cluster 102 according to the resource selection parameters, and sends management and control data related to deploying the containerized application to the scheduled edge computing node 102b. The management and control data includes the information pointing to the image file required by the application and the resource information required by the containerized application, instructing the scheduled edge computing node 102b to deploy the containerized application based on this management and control data.
  • After receiving the management and control data, the scheduled edge computing node 102b reserves resources for the containerized application according to the resource information in the management and control data, obtains the image file according to the information pointing to the image file, and runs the image file, thereby deploying the containerized application on the reserved resources.
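The two-stage selection above, an edge cluster chosen by the cluster-selection parameters and then an edge computing node chosen by the resource-selection parameters with resources reserved on success, can be sketched as follows. The data layout and the use of region as the only cluster parameter are illustrative simplifications.

```python
def select_cluster(clusters, region=None):
    """Pick the first edge cluster matching the edge-cluster selection
    parameters (here only the region, as a simplification)."""
    for c in clusters:
        if region is None or c["region"] == region:
            return c
    return None

def schedule_node(cluster, cpu_cores, memory_gb):
    """Pick an edge computing node with enough free resources (the
    resource-selection parameters), reserving them on success."""
    for node in cluster["nodes"]:
        if node["free_cpu"] >= cpu_cores and node["free_mem_gb"] >= memory_gb:
            node["free_cpu"] -= cpu_cores      # reserve resources for the app
            node["free_mem_gb"] -= memory_gb
            return node["name"]
    return None  # no node in this cluster can satisfy the request
```

A real scheduler would also weigh node load and performance requirements; the first-fit loop here only shows the shape of the two-stage decision.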
  • After the containerized application is deployed, the edge service demander can query, through the central management and control node 101, the log data generated by the containerized application and the edge computing node 102b in the edge cluster 102, and can learn from the log data the running status of the containerized application and of the edge computing node 102b where it is located, based on which it can decide whether to perform management and control operations such as upgrade, migration, resource expansion, resource reduction, shutdown, restart, or release for the containerized application.
  • The edge service demander can also otherwise determine whether to perform control operations on the containerized application, such as upgrade, migration, resource expansion, resource reduction, shutdown, restart, or release.
  • the edge service demander can also actively request the central management and control node 101 to perform management and control operations such as upgrade, migration, resource expansion, resource reduction, shutdown, restart, or release of the containerized application according to service requirements.
  • the following examples illustrate:
  • Expansion and reduction of containerized-application resources: during the operation of a containerized application, the edge service demander may wish to expand the resources of the containerized application due to factors such as service expansion requirements, increased user traffic, or the need to improve service performance. Through the requirement submission entry provided by the central management and control node 101, the edge service demander provides the central management and control node 101 with service requirement information requesting capacity expansion; the service requirement information includes information such as the resource increment or the total amount of resources after the increase. According to this information, the central management and control node 101 sends management and control data indicating the expansion to the edge computing node 102b that carries the containerized application; after receiving it, the edge computing node 102b expands the resources of the containerized application accordingly, for example increasing the memory allocated to the containerized application from 2 GB to 4 GB, or increasing the number of CPU cores allocated to it from 2 cores to 4 cores.
  • In addition to expanding capacity in response to the edge service demander's requirement, the central management and control node 101 can also automatically monitor the running status of the containerized application; when expansion is needed (for example, the monitored load is high), management and control data indicating capacity expansion may be sent to the edge computing node 102b carrying the containerized application, and after receiving it, the edge computing node 102b performs resource expansion for the containerized application according to the management and control data.
  • Similarly, the central management and control node 101 can, according to service requirement information requesting capacity reduction provided by the edge service demander, or upon monitoring that the edge computing node 102b carrying the containerized application has surplus resources or that the load of the containerized application is low, send management and control data indicating the reduction to the edge computing node 102b that hosts the containerized application. After receiving the management and control data indicating the reduction, the edge computing node 102b reduces the resources of the containerized application accordingly, for example reducing the memory allocated to the containerized application from 2 GB to 1 GB, or reducing the number of CPU cores allocated to it from 2 cores to 1 core.
  • Reconstruction of containerized applications: the edge service demander can provide the central management and control node 101 with service requirement information requesting reconstruction; the service requirement information can include information indicating reconstruction of the containerized application and, optionally, information on a new edge computing node 102b. The new edge computing node 102b can also be independently selected by the central management and control node 101 according to information such as the load and resource margin of each edge computing node 102b, which is not limited here. According to the service requirement information, the central management and control node 101 sends management and control data indicating reconstruction of the containerized application to the new edge computing node 102b; the management and control data includes the identification information of the containerized application to be rebuilt and the required image file or file template. After receiving the management and control data, the new edge computing node 102b rebuilds the containerized application locally according to the image file or file template in the management and control data.
  • In addition to rebuilding in response to the edge service demander's requirement, the central management and control node 101 can also automatically monitor the running status of the edge computing node 102b where the containerized application is located. When that edge computing node fails, the containerized application on it can be rebuilt on a new edge computing node 102b: the central management and control node 101 sends management and control data instructing reconstruction of the containerized application to the new edge computing node 102b, and after receiving it, the new edge computing node 102b rebuilds the corresponding containerized application locally according to the image file or file template contained in the management and control data.
  • Hot (live) migration of containerized applications: during the running of a containerized application, it may be necessary to migrate it from the original edge computing node to a new edge computing node for reasons such as resource consolidation or edge computing node upgrades. To ensure the serviceability of the containerized application, live migration can be used.
  • The edge service demander may provide the central management and control node 101 with service requirement information requesting live migration; the service requirement information includes information indicating the live migration, and optionally can also include information on the new edge computing node 102b, such as its IP address, and so on.
  • the new edge computing node 102b can also be independently selected by the central management and control node 101 according to information such as the load and resource margin of each edge computing node 102b, which is not limited.
  • The central management and control node 101 sends management and control data indicating live migration to both the original edge computing node 102b carrying the containerized application and the new edge computing node 102b; the management and control data includes information on the new edge computing node 102b, such as its IP address, the identification information of the containerized application to be migrated, and the required image file or file template, etc.
  • After the original edge computing node 102b and the new edge computing node 102b receive the management and control data indicating live migration, they establish a communication connection with each other based on that data and start migrating the containerized application, which mainly means synchronizing the state of the containerized application. During this period, the new edge computing node 102b first creates the containerized application locally according to the image file or file template in the management and control data, then starts and runs it using the state data synchronized from the original edge computing node 102b; after the containerized application runs successfully on the new edge computing node 102b, the original edge computing node 102b releases the resources the containerized application occupied.
  • Optionally, the central management and control node 101 can also automatically monitor resource fragmentation on the edge computing nodes 102b. When it detects that an edge computing node 102b has many resource fragments and the resources need to be consolidated, it determines that the containerized applications on that edge computing node 102b need to be hot migrated to other edge computing nodes 102b, and accordingly sends management and control data indicating hot migration to both the original edge computing node 102b and the new edge computing node 102b. After receiving the management and control data indicating live migration, the original edge computing node 102b and the new edge computing node 102b establish a communication connection between them and start the migration of the containerized application; when the containerized application runs successfully on the new edge computing node 102b, the original edge computing node 102b releases the resources the containerized application occupied.
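The handshake above can be summarized as: create from the image on the new node, synchronize state from the original node, start, and only then release the original node's resources. A minimal sketch, with dicts standing in for the two edge computing nodes and all names being hypothetical:

```python
def live_migrate(original, new, control_data):
    """Sketch of the live-migration sequence described above."""
    app_id = control_data["app_id"]
    # 1. New node creates the application locally from the image/template.
    new["apps"][app_id] = {"image": control_data["image"], "state": None}
    # 2. Original node synchronizes the application's runtime state.
    new["apps"][app_id]["state"] = original["apps"][app_id]["state"]
    # 3. New node starts the application; only on success does the
    #    original node release the resources it occupied.
    started = new["apps"][app_id]["state"] is not None
    if started:
        del original["apps"][app_id]
    return started

orig_node = {"apps": {"app-7": {"image": "img:1", "state": {"sessions": 3}}}}
new_node = {"apps": {}}
ok = live_migrate(orig_node, new_node, {"app_id": "app-7", "image": "img:1"})
```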
  • Upgrade of containerized applications: as service requirements change or image versions are updated, the corresponding containerized applications deployed on the edge computing nodes 102b need to be upgraded.
  • Optionally, the edge service demander can query, through the central management and control node 101, the running status of the containerized application to be upgraded, for example the response status of its service requests, in order to determine whether the containerized application is suitable for upgrading, when it should be upgraded, and which upgrade method to use. The edge service demander then generates an upgrade strategy for the containerized application to be upgraded, and sends the identifier of the containerized application together with the upgrade strategy, carried in an upgrade notification, to the central management and control node 101.
  • After receiving the upgrade notification sent by the edge service demander, the central management and control node 101 determines the identification information of the containerized application to be upgraded, such as its ID and name, and its corresponding upgrade strategy, and accordingly generates management and control data indicating the upgrade; the management and control data includes the identification information (ID, name, etc.) of the containerized application to be upgraded and the upgrade strategy. The central management and control node 101 delivers the management and control data to the corresponding edge computing node 102b in the edge cluster 102 where the containerized application to be upgraded is located; after receiving the management and control data, the edge computing node 102b performs upgrade processing, according to the upgrade strategy in the data, on the containerized application identified by the identification information.
  • the upgrade strategy may include: upgrade time, upgrade method, and the like.
  • the upgrade of the containerized application can also be actively initiated by the central management and control node 101 .
  • For example, the central management and control node 101 can monitor the version information of the image corresponding to each containerized application; when a new image version is found, it can determine that the containerized application corresponding to that image needs to be upgraded. As another example, the central management and control node 101 can monitor information such as the running status and life cycle of each containerized application; if problems such as vulnerabilities, instability, incomplete functionality, or excessive CPU or memory consumption are found while a containerized application is running, it can determine that the containerized application needs to be upgraded to resolve these problems, and generate management and control data indicating the upgrade accordingly.
  • The management and control data includes identification information of the containerized application to be upgraded, such as its ID and name, together with an upgrade strategy, and is delivered to the corresponding edge computing node 102b in the edge cluster 102 where the containerized application to be upgraded is located, instructing that edge computing node 102b to perform the upgrade. Upgrading the containerized application mainly means: shutting down the containerized application to be upgraded, updating it according to the new image version, and restarting it after the update.
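The upgrade processing just described (shut down, update to the new image, restart) can be sketched as follows. The data structures and the `method` field of the upgrade strategy are assumptions for illustration, not formats defined by this document:

```python
def upgrade_app(app, control_data):
    """Apply the upgrade strategy carried in the management and control data:
    shut down the old version, switch to the new image, restart."""
    strategy = control_data.get("method", "recreate")
    if strategy == "recreate":
        app["running"] = False                     # shut down before upgrading
        app["image"] = control_data["new_image"]   # update to the new image version
        app["running"] = True                      # restart after the update
    return app

app = {"id": "app-9", "image": "img:1.0", "running": True}
upgraded = upgrade_app(app, {"method": "recreate", "new_image": "img:2.0"})
```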
  • When the central management and control node 101 manages and controls the containerized applications through the edge computing nodes 102b in the edge cluster 102, it needs to maintain network connections with those edge computing nodes 102b. In practical applications, the network connection between the central management and control node 101 and an edge computing node 102b may become disconnected. To allow the central management and control node 101 to perceive whether its network connection with an edge computing node 102b is disconnected, a heartbeat connection can be maintained between each edge computing node 102b and the central management and control node 101, that is, each edge computing node 102b periodically reports a heartbeat message to the central management and control node 101. If the central management and control node 101 does not receive a heartbeat message from an edge computing node 102b within a certain period of time, it considers the network connection between itself and that edge computing node 102b to be disconnected. If the network connection between the central management and control node 101 and an edge computing node 102b is disconnected, optionally, the central management and control node 101 may no longer schedule that edge computing node 102b, while the edge computing node 102b and the containerized applications deployed on it maintain their current state.
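The heartbeat mechanism above reduces to recording the last heartbeat time per node and treating a node as disconnected once the timeout elapses. A minimal sketch with hypothetical names:

```python
import time

class HeartbeatMonitor:
    """Tracks periodic heartbeat reports and flags nodes whose last
    heartbeat is older than the configured timeout, as described above."""
    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}  # node_id -> timestamp of last heartbeat

    def report(self, node_id, now=None):
        self.last_seen[node_id] = now if now is not None else time.time()

    def is_connected(self, node_id, now=None):
        now = now if now is not None else time.time()
        ts = self.last_seen.get(node_id)
        return ts is not None and (now - ts) <= self.timeout_s

mon = HeartbeatMonitor(timeout_s=30)
mon.report("edge-1", now=100.0)
```

An explicit `now` parameter is used so the timeout logic can be tested deterministically without real clock delays.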
  • For example, during scheduling, the central management and control node 101 selects the edge computing node to be used only from the edge computing nodes 102b that maintain a network connection with it; edge computing nodes whose network connection is disconnected are not considered.
  • In addition, the central management and control node 101 and the edge computing nodes 102b are usually connected over a network. After the network connection to one edge computing node 102b is disconnected, the central management and control node 101 can still indirectly manage and control the containerized applications on the disconnected edge computing node 102b through other edge computing nodes 102b, based on the communication connections between itself and those other edge computing nodes 102b.
  • Optionally, taking the edge cluster 102 as the dimension, it can be considered whether the network connection between the central management and control node 101 and the edge cluster 102 is disconnected; if it is, the central management and control node 101 cannot manage and control the containerized applications in the edge cluster 102. Further optionally, whether the network connection between the central management and control node 101 and the edge cluster 102 is disconnected can be determined from information on the edge computing nodes 102b in the edge cluster 102 that maintain a communication connection with the central management and control node 101.
  • For example, if the number of edge computing nodes in the edge cluster 102 that maintain communication connections with the central management and control node 101 is greater than or equal to a set number threshold, or the proportion of such nodes is greater than or equal to a set proportion threshold, it is determined that the network connection between the central management and control node 101 and the edge cluster 102 is maintained. Conversely, if the number of edge computing nodes maintaining communication connections with the central management and control node 101 is less than the set number threshold, or their proportion is less than the set proportion threshold, it is determined that the network connection between the central management and control node 101 and the edge cluster 102 is disconnected.
  • the values of the quantity threshold and the ratio threshold are not limited in this embodiment of the present application.
  • For example, the proportion threshold can be 1, meaning the central management and control node 101 must maintain communication connections with all edge computing nodes in the edge cluster 102 for the network connection between the central management and control node 101 and the edge cluster 102 to be considered maintained. As another example, the proportion threshold can be 0.8, meaning the central management and control node 101 must maintain communication connections with no less than 80% of the edge computing nodes in the edge cluster 102 for the network connection between the central management and control node 101 and the edge cluster 102 to be considered maintained.
  • Alternatively, whether the network connection between the central management and control node 101 and the edge cluster 102 is disconnected may be determined according to whether designated edge computing nodes are among the edge computing nodes 102b in the edge cluster 102 that maintain a communication connection with the central management and control node 101. For example, if the central management and control node 101 maintains a communication connection with an edge computing node designated in the edge cluster 102, it is determined that the network connection between the central management and control node 101 and the edge cluster 102 is maintained; conversely, if the central management and control node 101 does not maintain a communication connection with any designated edge computing node in the edge cluster 102, it is determined that the network connection between the central management and control node 101 and the edge cluster 102 is disconnected.
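The two cluster-level policies above (a count/proportion threshold over connected nodes, and a designated-node check) can be sketched as two small predicates. Thresholds and names are illustrative:

```python
def cluster_connected_by_ratio(connected, total, ratio_threshold):
    """Threshold policy: the cluster counts as connected when the
    proportion of connected edge computing nodes meets the threshold."""
    return total > 0 and (connected / total) >= ratio_threshold

def cluster_connected_by_designated(connected_ids, designated_ids):
    """Designated-node policy: the cluster counts as connected when at
    least one designated edge computing node keeps its connection."""
    return bool(set(connected_ids) & set(designated_ids))

# With a proportion threshold of 0.8, 7 of 10 connected nodes is not enough:
ok_ratio = cluster_connected_by_ratio(connected=7, total=10, ratio_threshold=0.8)
# A designated node "edge-3" still connected keeps the cluster connected:
ok_desig = cluster_connected_by_designated({"edge-1", "edge-3"}, {"edge-3"})
```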
  • the disconnection of the network connection between the central management and control node 101 and the edge cluster 102 mainly refers to the situation in which the edge cluster 102 is disconnected or basically disconnected from the control of the central management and control node 101 , and may include various situations that cause disconnection of the network connection.
  • In addition to being unable to manage and control the containerized applications in the edge cluster 102 when its network connection with the edge cluster 102 is disconnected, the central management and control node 101 may also be unable to manage and control them due to other factors. For example, the edge service demander can configure the central management and control node 101 not to manage and control the containerized applications in the edge cluster 102 within a set time; as another example, while the central management and control node 101 itself is being upgraded, the containerized applications in the edge cluster 102 may not be managed and controlled.
  • Based on this, management and control conditions corresponding to the edge cluster 102 may be pre-configured. The management and control conditions may include: the central management and control node 101 maintains a network connection with the edge cluster 102, the central management and control node 101 is not in an upgrade period, and the current time is not within the non-management time set by the edge service demander, etc. The central management and control node 101 can perform various management and control operations on the containerized applications in the edge cluster 102 only while the management and control conditions for the edge cluster 102 are satisfied.
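The pre-configured management and control conditions above can be evaluated as a simple conjunction: network up, no ongoing upgrade, and the current time outside any configured non-management window. A minimal sketch under those assumptions (all names are hypothetical):

```python
def control_conditions_met(cluster_network_up, central_upgrading,
                           now, no_control_windows):
    """True only when every pre-configured management and control
    condition for the edge cluster holds."""
    if not cluster_network_up or central_upgrading:
        return False
    # Non-management windows set by the edge service demander,
    # as (start, end) time pairs.
    return not any(start <= now < end for start, end in no_control_windows)

# Outside the configured window, network up, no upgrade running:
met = control_conditions_met(True, False, now=50, no_control_windows=[(100, 200)])
# Inside the window the central node must not manage the cluster:
blocked = control_conditions_met(True, False, now=150, no_control_windows=[(100, 200)])
```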
  • When the central management and control node 101 cannot manage and control the containerized applications in the edge cluster 102, the serviceability of those containerized applications may decrease for lack of management and control, and they may even fail to provide services. For example, if the edge computing node 102b where a containerized application is located fails and the containerized application cannot be rebuilt on another edge computing node in time, the containerized application cannot continue to provide services.
  • To address this, in this embodiment an edge management and control node 102a is added to each edge cluster 102. When the central management and control node 101 does not satisfy the management and control conditions for the edge cluster 102 to which the edge management and control node 102a belongs, the edge management and control node 102a takes over responsibility for managing and controlling the containerized applications in that edge cluster 102. Further, after the central management and control node 101 again satisfies the management and control conditions for the edge cluster 102, it can reclaim management and control authority and resume managing and controlling the containerized applications in the edge cluster 102.
  • This dual management and control by the center and the edge can greatly improve both the edge autonomy capability of the cloud-edge fusion architecture and the service capability of edge containerized applications.
  • In an optional embodiment, after the central management and control node 101 again satisfies the management and control conditions for the edge cluster 102, it resumes managing and controlling the containerized applications in the edge cluster 102 without relying on the management and control state reached under the edge management and control node 102a. To this end, when the central management and control node 101 determines that the management and control conditions for the edge cluster 102 are no longer satisfied, it can record the management and control state of the containerized applications in the edge cluster 102 at that time, denoted as the first management and control state, so that after the management and control conditions are satisfied again it can resume managing and controlling the containerized applications in the edge cluster 102 from that previous state, without relying on the edge management and control node 102a's management and control state of the containerized applications in the edge cluster 102.
  • In this way, the central management and control node 101 and the edge management and control node 102a are loosely coupled and manage independently of each other, making the edge autonomy capability more flexible.
  • Here, the management and control state of the edge management and control node 102a or the central management and control node 101 over the containerized applications in the edge cluster 102 refers to the state reached by those containerized applications after the edge management and control node 102a or the central management and control node 101 has managed and controlled them according to the corresponding management and control data. The state may include, but is not limited to, the following information: which containerized applications are included in the edge cluster 102, which edge computing nodes each containerized application runs on, the resource specifications of each containerized application, its creation time, whether it is in a shutdown state, and so on.
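The state information just listed can be captured as a per-application record plus a cluster-wide snapshot keyed by application. The following encoding is an assumption for illustration only:

```python
from dataclasses import dataclass, asdict

@dataclass
class AppControlState:
    """One application's entry in a management and control state snapshot:
    where it runs, its resource specification, creation time, shutdown flag."""
    app_id: str
    node_id: str      # edge computing node the application runs on
    cpu: int          # resource specification (e.g. CPU cores)
    memory_mb: int
    created_at: str
    shutdown: bool = False

state = AppControlState("app-1", "edge-2", cpu=2, memory_mb=512,
                        created_at="2021-01-28T00:00:00Z")
# Cluster-wide snapshot: app_id -> plain-dict state record.
snapshot = {s.app_id: asdict(s) for s in [state]}
```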
  • Like the central management and control node 101, the edge management and control node 102a can also perform, on the containerized applications in the edge cluster 102 to which it belongs, at least one of the following management and control operations: upgrade, migration, resource expansion, resource reduction, shutdown, restart, and release.
  • the edge management and control node 102a can implement the management and control operation of the containerized application in the edge cluster 102 through the edge computing node 102b in the edge cluster 102 to which it belongs.
  • Specifically, the edge management and control node 102a may generate, according to the service requirement information submitted by the edge service demander and/or the monitored running status of the containerized applications, the second management and control data required for managing and controlling the containerized applications in the edge cluster 102, and deliver the second management and control data to the corresponding edge computing nodes 102b in the edge cluster 102. According to the second management and control data, the corresponding edge computing nodes 102b perform, on the containerized application to be managed, at least one of the management and control operations of deployment, upgrade, migration, resource expansion, resource reduction, shutdown, restart, and release.
  • The second management and control data refers to the data required for the edge computing nodes 102b to manage and control the containerized applications in the edge cluster 102 to which they belong; the data includes the type of management and control operation and the various actions and parameters related to the operation. It should be noted that, depending on the service requirement information and/or the running state of the containerized application, the content of the second management and control data and the type of management and control operation it indicates will differ, and the management and control operation performed by the edge computing node 102b will differ accordingly.
  • The edge service demander can interact with the edge management and control node 102a: through the edge management and control node 102a it can query the running state of a containerized application, and it can also initiate a management and control operation for the containerized application to the edge management and control node 102a. That is to say, the management and control operations that the edge management and control node 102a performs on the containerized applications in the edge cluster 102 to which it belongs can be initiated by the edge service demander.
  • the edge management and control node 102a can also automatically monitor the running status of the containerized application in the edge cluster 102 to which it belongs and/or the edge computing node where it is located, and automatically initiate management and control operations on the containerized application according to the monitoring results.
  • The process by which the edge management and control node 102a performs the various management and control operations on the containerized applications in the edge cluster 102 to which it belongs is the same as or similar to the process by which the central management and control node 101 performs those operations on the containerized applications in the edge cluster 102; the detailed process is not repeated here, and reference may be made to the foregoing examples.
  • The main difference between the edge management and control node 102a and the central management and control node 101 when managing and controlling the containerized applications in the edge cluster 102 is that, when the edge management and control node 102a determines that the central management and control node 101 does not satisfy the management and control conditions for the edge cluster 102, it can acquire the management and control state in which the central management and control node 101 held the containerized applications in the edge cluster before the conditions ceased to be satisfied, and continue managing and controlling the containerized applications in the edge cluster from that state.
  • the management and control state in which the central management and control node 101 manages and controls the containerized applications in the edge cluster before the management and control conditions are not satisfied is recorded as the first management and control state.
  • When the edge management and control node 102a manages and controls the containerized applications in the edge cluster 102 through the edge computing nodes 102b in the edge cluster 102 to which it belongs, the edge management and control node 102a needs to maintain network connections with those edge computing nodes 102b. In practical applications, the network connection between the edge management and control node 102a and an edge computing node 102b may become disconnected. To allow the edge management and control node 102a to perceive this, a heartbeat connection can be maintained between each edge computing node 102b and the edge management and control node 102a, that is, each edge computing node 102b periodically reports a heartbeat message to the edge management and control node 102a. If the edge management and control node 102a does not receive a heartbeat message from an edge computing node 102b within a certain period of time, it considers the network connection between itself and that edge computing node 102b to be disconnected.
  • Further, the edge management and control node 102a can also report information on the faulty edge computing node 102b to the central management and control node 101 through the network connection between itself and the central management and control node 101, so that the central management and control node 101 can more accurately determine whether the network connection between itself and that edge computing node 102b, or the edge cluster 102 to which that edge computing node 102b belongs, is disconnected.
  • It should be noted that the same edge computing node 102b can simultaneously establish network connections with both the edge management and control node 102a in the edge cluster 102 to which it belongs and the central management and control node 101. At a given moment, the edge computing node 102b may maintain a network connection with only one of the central management and control node 101 and the edge management and control node 102a while being disconnected from the other, or it may be disconnected from both at the same time.
  • Based on this, when judging whether the network connection between itself and an edge computing node 102b is disconnected, the central management and control node 101 can check whether it has received a heartbeat message from that edge computing node 102b within a certain period of time. If it has not received the heartbeat message and has also received the information reported by the edge management and control node 102a that the edge computing node 102b has failed, it can determine that the edge computing node 102b has genuinely failed rather than merely lost network connectivity, which helps improve the accuracy of the determination result.
  • For example, when the central management and control node 101 manages and controls the containerized applications in the edge cluster 102, if it does not receive a heartbeat message from an edge computing node 102b within the set time and also receives the information reported by the edge management and control node 102a that the edge computing node 102b has failed, it can determine that the problem with the edge computing node 102b is a genuine failure rather than a disconnection caused by a network fault. In that case the containerized applications on the failed edge computing node cannot continue to provide services, and the central management and control node 101 can rebuild them on other edge computing nodes in the edge cluster 102.
  • Specifically, the central management and control node 101 can select other edge computing nodes and deliver to them the management and control data for rebuilding the containerized application; the management and control data includes the identification information of the containerized application and the required image files or file templates, etc., instructing those edge computing nodes to rebuild the containerized application.
  • Whether the central management and control node 101 manages and controls the containerized applications in the edge cluster 102, or the edge management and control node 102a in the edge cluster 102 does so, the management and control of the containerized applications is performed through the edge computing nodes 102b in the cluster. In this embodiment, while an edge computing node 102b executes management and control operations on a containerized application according to the first management and control data issued by the central management and control node 101, it may also cache the first management and control data locally.
  • the edge computing node 102b may also synchronize the locally cached first management and control data to the edge management and control node 102a in the edge cluster 102 to which it belongs.
  • In this way, when the edge management and control node 102a determines that the central management and control node 101 does not satisfy the management and control conditions for the edge cluster 102, it can determine, according to the first management and control data synchronized by the edge computing nodes 102b, the first management and control state in which the central management and control node 101 held the containerized applications in the edge cluster 102 before the management and control conditions ceased to be satisfied, and continue managing and controlling the containerized applications in the edge cluster 102 from that first management and control state.
  • In this embodiment, the time at which an edge computing node 102b synchronizes the first management and control data to the edge management and control node 102a is not limited; the synchronization can take place at any time before the edge management and control node 102a needs the first management and control data. For example, an edge computing node 102b may synchronize the first management and control data to the edge management and control node 102a in the edge cluster 102 to which it belongs when the network connection between that edge computing node 102b and the central management and control node 101 is disconnected.
  • Accordingly, the edge management and control node 102a can monitor information on the edge computing nodes 102b in the edge cluster 102 to which it belongs that synchronize the first management and control data to it, and determine from this information that the central management and control node 101 no longer satisfies the management and control conditions for the edge cluster 102.
  • For example, the information on the edge computing nodes 102b that synchronize the first management and control data to the edge management and control node 102a may be the number of such edge computing nodes.
  • If the edge management and control node 102a determines that the number of edge computing nodes that have synchronized the first management and control data to it is greater than or equal to a set number threshold, or that their proportion is greater than or equal to a set proportion threshold, this indicates that the number of edge computing nodes in the edge cluster 102 that still maintain network connections with the central management and control node 101 is less than the set number threshold, or that their proportion is less than the set proportion threshold. It then determines that the network connection between the central management and control node 101 and the edge cluster 102 is disconnected, which is one of the situations in which the management and control conditions for the edge cluster 102 are not satisfied.
  • As another example, the information on the edge computing nodes 102b that synchronize the first management and control data to the edge management and control node 102a may include information on all designated edge computing nodes. If the edge management and control node 102a determines that the edge computing nodes that have synchronized the first management and control data to it include all of the designated edge computing nodes, this indicates that no designated edge computing node in the edge cluster 102 still maintains a network connection with the central management and control node 101. It then determines that the network connection between the central management and control node 101 and the edge cluster 102 is disconnected, which is one of the situations in which the management and control conditions for the edge cluster 102 are not satisfied.
  • The above describes the edge management and control node 102a determining whether the edge computing nodes 102b are disconnected from the central management and control node 101 according to the number of edge computing nodes 102b that report the first management and control data. In addition, the edge management and control node 102a can also actively send an inquiry message to each edge computing node 102b asking whether it maintains a network connection with the central management and control node 101. After receiving the inquiry message, the edge computing node 102b returns a response message to the edge management and control node 102a indicating whether its network connection with the central management and control node 101 is maintained or disconnected; the edge management and control node 102a then determines, from statistics over the response messages returned by the edge computing nodes 102b, whether the network connection between the central management and control node 101 and the edge cluster 102 is disconnected.
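The edge-side determination described above mirrors the cluster-connectivity policies used by the center: either count how many nodes have synchronized the first management and control data (each such node is taken to have lost its link to the center), or check whether all designated nodes have done so. A hypothetical sketch:

```python
def central_disconnected(synced_node_ids, total_nodes,
                         designated_ids=None, ratio_threshold=0.8):
    """Edge-side check: treat the central node as no longer satisfying the
    management and control conditions when enough nodes (or all designated
    nodes) have synchronized the first management and control data."""
    if designated_ids is not None:
        # Designated-node variant: every designated node has lost the center.
        return set(designated_ids) <= set(synced_node_ids)
    # Threshold variant: the share of nodes that lost the center is too high.
    return total_nodes > 0 and len(synced_node_ids) / total_nodes >= ratio_threshold

# 4 of 5 nodes synchronized their cached data -> the cluster is cut off:
cut_off = central_disconnected({"e1", "e2", "e3", "e4"}, total_nodes=5)
# Only 1 of 5 synchronized -> the center is still considered reachable:
still_ok = central_disconnected({"e1"}, total_nodes=5)
```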
  • Similarly, in this embodiment, while an edge computing node 102b executes management and control operations on a containerized application according to the second management and control data issued by the edge management and control node 102a, it may also cache the second management and control data locally.
  • the edge computing node 102b may also synchronize the locally cached second management and control data to the central management and control node 101 .
  • the edge computing node 102b may synchronize the second management and control data to the central management and control node 101 when the central management and control node 101 re-satisfies the management and control conditions for the edge cluster 102 .
  • In this way, when the central management and control node 101 determines that it again satisfies the management and control conditions for the edge cluster 102, it can determine, according to the second management and control data synchronized by the edge computing nodes 102b, the second management and control state in which the edge management and control node 102a held the containerized applications in the edge cluster 102 during the period when the management and control conditions were not satisfied, and judge whether the second management and control state is consistent with the first management and control state of the containerized applications in the edge cluster 102 before the conditions ceased to be satisfied. If the second management and control state is inconsistent with the first management and control state, the containerized applications in the edge cluster 102 need to be rolled back from the second management and control state to the first management and control state and then managed and controlled again starting from the first management and control state. Conversely, if the second management and control state is consistent with the first management and control state, the containerized applications in the edge cluster 102 continue to be managed and controlled directly from the first (equivalently, the second) management and control state.
  • specifically, the central management and control node 101 can compare the second management and control state with the first management and control state, and identify through the comparison the containerized applications to be rolled back in the edge cluster 102; a containerized application to be rolled back is a containerized application of the edge cluster 102 that differs between the second management and control state and the first management and control state. After that, the central management and control node 101 judges whether the edge computing node 102b where the containerized application to be rolled back is located is faulty; if not, it performs rollback processing on the containerized application to be rolled back, that is, rolls it back from the second management and control state to the first management and control state. Depending on the management and control situation, the containerized application to be rolled back and the edge computing node where it is located will differ. The following examples illustrate this:
  • the containerized application to be rolled back may be a newly added containerized application in the edge cluster 102, that is, a containerized application that is not included in the edge cluster 102 in the first management and control state but is included in the second management and control state.
  • for example, if an edge computing node fails during the management and control period of the edge management and control node 102a and its containerized applications are rebuilt on other edge computing nodes, the rebuilt containerized applications on those other edge computing nodes are newly added containerized applications in the edge cluster; for another example, if the edge management and control node 102a scales out the containerized applications during its management and control period, the added containerized application instances also belong to the newly added containerized applications in the edge cluster.
  • in this case, the edge computing node 102b where the containerized application to be rolled back is located refers to the edge computing node 102b where the newly added containerized application currently resides; if that edge computing node 102b is not faulty (i.e., normal), the newly added containerized application on it is deleted.
  • the containerized application to be rolled back may be an original containerized application deleted from the edge cluster 102, that is, a containerized application that is included in the edge cluster 102 in the first management and control state but not in the second management and control state.
  • for example, if the edge management and control node 102a scales in the containerized applications during its management and control period, some original containerized applications will be deleted; in addition, the containerized applications on a faulty edge computing node can also be regarded as deleted original containerized applications.
  • in this case, the central management and control node 101 rolls back the containerized application to be rolled back from the second management and control state to the first management and control state, specifically by rebuilding the deleted original containerized application on the edge computing node from which it was deleted.
  • here, the edge computing node 102b where the containerized application to be rolled back is located refers to the edge computing node 102b where the deleted original containerized application was originally located; if that edge computing node 102b is not faulty (i.e., normal), the central management and control node 101 rebuilds the deleted original containerized application on that edge computing node 102b.
  • the containerized application to be rolled back may be an original containerized application whose resource specification in the edge cluster 102 has changed, that is, a containerized application that exists in the edge cluster 102 both in the first management and control state and in the second management and control state, but with different resource specifications.
  • in this case, the central management and control node 101 rolls back the containerized application to be rolled back from the second management and control state to the first management and control state, specifically by restoring the resource specification of the original containerized application whose resource specification has changed to the resource specification in the first management and control state.
  • here, the edge computing node 102b where the containerized application to be rolled back is located refers to the edge computing node 102b where that original containerized application has been located all along.
  • if the edge computing node where the containerized application to be rolled back is located is faulty, the rollback processing for that containerized application may be skipped, or it may be handled adaptively according to the situation. For example, during the management and control period of the edge management and control node 102a, suppose the containerized applications on edge computing node A are rebuilt on edge computing node B because edge computing node A fails, and the number and resource configuration of the rebuilt containerized applications on edge computing node B are the same as those of the containerized applications originally on edge computing node A. Then, during the rollback process, the central management and control node 101 needs, on the one hand, to delete the rebuilt containerized applications on edge computing node B and, on the other hand, to rebuild the original containerized applications on edge computing node A; if edge computing node A is still faulty at this time, another normal edge computing node can be selected in the edge cluster 102, and the containerized applications originally on edge computing node A can be rebuilt on that edge computing node.
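The three rollback cases above (newly added, deleted, and respecified containerized applications) can be sketched by diffing the two management and control states. This is an illustrative sketch under the assumption that each state is a simple mapping from application name to resource specification; it is not the patent's actual implementation:

```python
def plan_rollback(first_state, second_state):
    """Compare the first (pre-failure) and second (edge-managed) management
    and control states, returning the actions that restore the first state."""
    actions = []
    # Case 1: newly added applications are deleted.
    for app in second_state:
        if app not in first_state:
            actions.append(("delete", app))
    for app, spec in first_state.items():
        if app not in second_state:
            # Case 2: deleted original applications are rebuilt.
            actions.append(("rebuild", app, spec))
        elif second_state[app] != spec:
            # Case 3: changed resource specifications are restored.
            actions.append(("restore_spec", app, spec))
    return actions

# app-a was scaled from 2 to 4 cores, app-b was deleted, app-c was added.
print(plan_rollback({"app-a": 2, "app-b": 1}, {"app-a": 4, "app-c": 2}))
```

In a real system each action would additionally be gated on whether the target edge computing node is faulty, as described above.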
  • the following takes the edge management and control node 102a performing resource expansion on a containerized application in the edge cluster where it is located as an example, to illustrate the rollback process after the central management and control node 101 resumes management and control:
  • a containerized application A is deployed in the edge cluster 102, the containerized application A occupies 2 CPU cores, and the containerized application A is an application that provides online education services.
  • the public network between the central management and control node 101 and the edge cluster 102 fails, so the central management and control node 101 cannot manage and control the edge cluster 102; during this period, the edge management and control node 102a in the edge cluster 102 is responsible for managing and controlling the containerized applications in the edge cluster 102.
  • during the period when the edge management and control node 102a is controlling the edge cluster 102, the student holiday arrives, the number of users of the online education application increases sharply, and containerized application A receives a large number of service requests and becomes overloaded; the edge management and control node 102a therefore expands the resources of containerized application A, changing the number of CPU cores allocated to it from 2 to 4. After a period of time, the network failure between the central management and control node 101 and the edge cluster 102 is eliminated, and the central management and control node 101 re-controls the edge cluster 102; it finds that the number of CPU cores occupied by containerized application A is now 4, which differs from the resource specification before the network failure.
  • therefore, containerized application A is first scaled down, restoring the number of CPU cores it occupies from 4 to 2, and management and control of containerized application A then continues from this state. During this management and control period, it is found that containerized application A is heavily loaded and its response delay is large, which does not meet the current application requirements of the service demander; the central management and control node 101 therefore expands the number of CPU cores occupied by containerized application A from 2 to 4, 5 or more through the edge computing nodes 102b in the edge cluster 102, so as to meet the current application requirements.
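The scenario above can be condensed into a numeric sketch (the function name and demand signal are illustrative assumptions): the central node first restores the pre-failure specification, and only then re-expands based on the demand it observes itself:

```python
def rollback_then_remanage(first_cores, second_cores, demand_cores):
    """Roll the application back from the edge-managed specification
    (second_cores) to the pre-failure one (first_cores), then expand again
    if current demand requires more cores."""
    cores = first_cores                # rollback: e.g. 4 -> 2
    if demand_cores > cores:
        cores = demand_cores           # re-expansion driven by current demand
    return cores

# Application A: 2 cores before the failure, 4 after edge-side expansion;
# current demand still needs 4 cores, so the central node scales back up.
print(rollback_then_remanage(2, 4, 4))  # prints 4
```

The point of the two-step flow is that the re-expansion is a fresh decision by the central node, not an acceptance of the edge node's intermediate state.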
  • after the central management and control node 101 rolls back the containerized application in the edge cluster 102 to the management and control state before the management and control conditions were not met (i.e., the first management and control state above), it can start from the first management and control state and re-manage and control the containerized applications in the edge cluster 102 according to the current service requirements of the edge service demander.
  • for example, if the current service demand of the edge service demander requires hot migration of the containerized application, the central management and control node 101 performs hot migration of the containerized application in the edge cluster 102 on the basis of the first management and control state; if the current service demand of the edge service demander requires rebuilding the containerized application, the central management and control node 101 rebuilds the containerized application in the edge cluster 102 on the basis of the first management and control state; if the current service demand of the edge service demander requires upgrading the containerized application, the central management and control node 101 upgrades the containerized application in the edge cluster 102 on the basis of the first management and control state.
  • the edge service demander can also issue a management and control switching instruction to the edge management and control node when it determines that the central management and control node cannot manage and control the target edge cluster, to instruct that the management and control authority be switched from the central management and control node to the edge management and control node. Based on this, the edge management and control node can also determine that the central management and control node does not meet the management and control conditions for the edge cluster upon receiving the management and control switching instruction sent by the edge service demander.
  • the edge service demander can flexibly switch management and control between the central management and control node and the edge management and control node according to application requirements. For example, even if the central management and control node meets the management and control conditions for the edge cluster, if, due to special requirements, the central management and control node is not needed to manage and control the containerized applications in the edge cluster and the edge management and control node in the edge cluster is required to do so instead, the edge service demander can also send a management and control switching instruction to the edge management and control node to instruct it to manage and control the containerized applications in the edge cluster to which it belongs.
  • during the above process, the central management and control node 101 and the edge management and control node 102a synchronize management and control data and information such as management and control status through the edge computing node 102b, forming an interaction loop as shown in FIG. 1b.
  • as shown in FIG. 1b, during the management and control period of the central management and control node 101, the first management and control data is provided to the edge computing node 102b, which caches it locally and synchronizes it to the edge management and control node 102a; during the management and control period of the edge management and control node 102a, the second management and control data is provided to the edge computing node 102b, which caches it locally and synchronizes it to the central management and control node 101.
  • in this way, services are provided without interruption, and this synchronization method can also ensure the consistency of data synchronization. Further, after the central management and control node re-satisfies the management and control conditions, it does not rely on the edge management and control node's management and control status of the edge cluster, but re-manages and controls the containerized applications in the edge cluster according to its own management and control status of the edge cluster before the management and control conditions were not met. In this way, the two management and control nodes are loosely coupled and manage and control independently, and the edge autonomy capability is more flexible.
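A minimal sketch of the interaction loop (the class and key names are assumptions for illustration): the edge computing node caches whichever management and control data it receives, and each kind of data is synchronized to the opposite management and control node:

```python
class EdgeComputingNode:
    """Caches control data locally and reports where each kind syncs to."""

    def __init__(self):
        self.cache = {"first": None, "second": None}

    def receive(self, kind, data):
        self.cache[kind] = data  # cache locally on receipt

    def sync_targets(self):
        # First control data (issued by the central node) is synchronized to
        # the edge control node; second control data (issued by the edge
        # node) is synchronized to the central control node.
        targets = {}
        if self.cache["first"] is not None:
            targets["edge-control-node"] = self.cache["first"]
        if self.cache["second"] is not None:
            targets["central-control-node"] = self.cache["second"]
        return targets

node = EdgeComputingNode()
node.receive("first", {"op": "deploy"})
node.receive("second", {"op": "scale"})
print(node.sync_targets())
```

The crossing pattern is what lets each management and control node later reconstruct the other's management and control state from the edge computing nodes' caches.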
  • the central management and control node 101 and the edge management and control node 102a cooperate with each other to perform various management and control on the containerized applications in the edge cluster 102, so as to ensure the serviceability of the containerized applications.
  • Kubernetes (K8s) technology can be used.
  • the master components of Kubernetes can be deployed on the central management and control node 101 and the edge management and control node 102a, respectively, and are respectively denoted as the central master component and the edge master component.
  • a container group can be used to organize and manage containerized applications, and a Pod is the smallest atomic unit that can be scheduled.
  • when the central master component meets the management and control conditions for the edge cluster, the central master component performs various management and control operations on the Pods in the edge cluster through the worker component; when the central master component does not meet the management and control conditions for the edge cluster, the edge master component in the cluster performs various management and control operations on the Pods through the worker component.
  • the central master component and the edge master component cooperate with each other.
  • the edge side can still perform management and control operations such as migration, scaling, and upgrade of edge applications, ensuring that edge applications provide services without interruption.
  • the disconnection of the cloud-edge network mainly refers to the situation where the edge side is separated from the cloud management and control.
  • edge service demanders can perform native K8s operation and maintenance management operations at the edge, such as querying the running status of the edge cluster and edge computing nodes, querying log data, and logging in to the Pod to perform various operation and maintenance operations, etc.
  • it can make full use of the orchestration and scheduling capabilities of K8s to improve the production efficiency and operation and maintenance efficiency of edge computing scenarios.
  • the central management and control node 101 can not only manage and control the containerized applications in the edge cluster 102, but also manage and control the edge cluster 102 in terms of resource scheduling, operation and maintenance, network, security, etc., so that edge services are placed in each edge cluster 102 for processing.
  • the central management and control node 101 may be deployed in one or more cloud computing data centers, or may be deployed in one or more traditional data centers, and the central management and control node 101 may also be deployed in one or more of the cloud computing data centers it manages and controls.
  • this embodiment does not limit this.
  • tasks such as network forwarding, storage, computing, and/or intelligent data analysis can be processed in each edge cluster 102; since each edge cluster 102 is closer to the terminal, response delay can be reduced, the pressure on the central cloud or traditional cloud computing platforms can be relieved, and bandwidth costs can be reduced.
  • FIG. 2a is a schematic flowchart of an edge management and control method provided by an exemplary embodiment of the present application.
  • the method is applicable to the edge cloud system shown in FIG. 1a. As shown in FIG. 2a, the method includes:
  • step 21a: when the central management and control node meets the management and control conditions for the target edge cluster, the central management and control node manages and controls the containerized application in the target edge cluster; the target edge cluster is any edge cluster in the edge cloud system.
  • step 22a: during the period when the central management and control node does not meet the management and control conditions for the target edge cluster, the edge management and control node in the target edge cluster manages and controls the containerized application in the target edge cluster, so that the containerized application continues to provide services.
  • step 23a: after the central management and control node re-satisfies the management and control conditions for the target edge cluster, it re-manages and controls the containerized applications in the target edge cluster based on its management and control state of the containerized applications in the target edge cluster before the management and control conditions were not met.
  • FIG. 2b is a schematic flowchart of another edge management and control method provided by an exemplary embodiment of the present application. The method is described from the perspective of the edge management and control node. As shown in FIG. 2b, the method includes:
  • the edge management and control node determines that the central management and control node in the edge cloud system does not meet the management and control conditions for the target edge cluster, and the target edge cluster is the edge cluster to which the edge management and control node in the edge cloud system belongs.
  • an implementation manner of managing and controlling the containerized applications in the target edge cluster during the period when the central management and control node does not meet the management and control conditions for the target edge cluster includes: when it is determined that the central management and control node does not meet the management and control conditions for the target edge cluster, obtaining the first management and control state of the containerized applications in the target edge cluster by the central management and control node before the management and control conditions were not met; and starting to manage and control the containerized applications in the target edge cluster from the first management and control state.
  • an implementation manner of obtaining the first management and control state of the containerized application in the target edge cluster by the central management and control node before the management and control conditions were not met includes: determining, according to the first management and control data synchronized by the edge computing nodes in the target edge cluster, the first management and control state of the containerized application in the target edge cluster by the central management and control node before the management and control conditions were not met; the first management and control data is delivered by the central management and control node to the edge computing nodes before the management and control conditions were not met, for managing and controlling the containerized application in the target edge cluster.
  • the method of this embodiment further includes: during the management and control of the containerized application in the target edge cluster, according to the service requirement information submitted by the edge service demander and/or the running status of the containerized application, Generate the second management and control data required to manage and control the containerized application in the target edge cluster; deliver the second management and control data to the edge computing nodes in the target edge cluster, so that the edge computing nodes can perform management and control operations for the containerized applications .
  • an implementation manner of determining that the central management and control node in the edge cloud system does not meet the management and control conditions for the target edge cluster includes: monitoring information of the edge computing nodes in the target edge cluster that synchronize the first management and control data with the edge management and control node, and determining, according to the information, that the central management and control node does not meet the management and control conditions for the target edge cluster; wherein the edge computing node synchronizes the first management and control data with the edge management and control node when the network connection between it and the central management and control node is disconnected. Or,
  • another implementation manner of determining that the central management and control node in the edge cloud system does not meet the management and control conditions for the target edge cluster includes: sending an inquiry message to the edge computing node in the target edge cluster to inquire about the edge computing Whether the node maintains network connection with the central management and control node, and counts the number of edge computing nodes disconnected from the central management and control node according to the response message returned by the edge computing node; when the number is greater than or equal to the set number threshold, determine the center The control node does not meet the control conditions for the target edge cluster. or,
  • still another implementation manner of determining that the central management and control node in the edge cloud system does not meet the management and control conditions for the target edge cluster includes: in the case of receiving the management and control switching instruction sent by the edge service demander, determining that the central management and control node does not meet the management and control conditions for the target edge cluster; wherein the management and control switching instruction is sent by the edge service demander when the central management and control node is not needed to manage and control the target edge cluster or when it is determined that the central management and control node cannot manage and control the target edge cluster.
  • FIG. 2c is a schematic flowchart of still another edge management and control method provided by an exemplary embodiment of the present application. The method is described from the perspective of the central control node, as shown in Figure 2c, the method includes:
  • the central management and control node determines that the management and control conditions for the target edge cluster are re-satisfied, and the target edge cluster is any edge cluster in the edge cloud system;
  • the method further includes: before the central management and control node does not meet the management and control conditions for the target edge cluster, according to the service demand information submitted by the edge service demander and/or the running status of the containerized application, generate a First management and control data required to manage and control the containerized applications in the target edge cluster; deliver the first management and control data to the edge computing nodes in the target edge cluster, so that the edge computing nodes can execute the containerized application according to the first management and control data Application control operations.
  • an implementation manner of re-managing and controlling the containerized applications in the target edge cluster based on the management and control state of the containerized applications in the target edge cluster before the management and control conditions were not met includes: obtaining the second management and control state of the containerized applications in the target edge cluster by the edge management and control node; if the second management and control state is inconsistent with the first management and control state of the containerized applications in the target edge cluster by the central management and control node before the management and control conditions were not met, rolling back the containerized applications in the target edge cluster from the second management and control state to the first management and control state; and starting from the first management and control state, re-managing and controlling the containerized applications in the target edge cluster.
  • an implementation manner of rolling back the containerized applications in the target edge cluster from the second management and control state to the first management and control state includes: identifying the containerized applications to be rolled back in the target edge cluster, where a containerized application to be rolled back is a containerized application of the target edge cluster that differs between the second management and control state and the first management and control state; judging whether the edge computing node where the containerized application to be rolled back is located is faulty; and if not, rolling back the containerized application to be rolled back from the second management and control state to the first management and control state.
  • rolling back the containerized application to be rolled back from the second management and control state to the first management and control state includes at least one of the following situations:
  • if the containerized application to be rolled back is a newly added containerized application in the target edge cluster, delete the newly added containerized application;
  • if the containerized application to be rolled back is an original containerized application deleted from the target edge cluster, rebuild the deleted original containerized application on the edge computing node from which it was deleted;
  • if the containerized application to be rolled back is an original containerized application whose resource specification has changed in the target edge cluster, restore the resource specification of that original containerized application to the resource specification in the first management and control state.
  • an implementation manner of obtaining the second management and control state of the containerized applications in the target edge cluster by the edge management and control node includes: determining, according to the second management and control data synchronized by the edge computing nodes in the target edge cluster, the second management and control state of the containerized applications in the target edge cluster by the edge management and control node; wherein the second management and control data is delivered by the edge management and control node to the edge computing nodes during its management and control of the target edge cluster, for managing and controlling the containerized applications in the target edge cluster.
  • an implementation manner of determining that the management and control conditions for the target edge cluster are re-satisfied includes: during the period when the management and control conditions for the target edge cluster are not satisfied, counting information of the edge computing nodes in the target edge cluster that maintain a network connection with the central management and control node; and determining, according to the information, that the central management and control node re-satisfies the management and control conditions for the target edge cluster.
  • an implementation manner of determining that the central management and control node re-satisfies the management and control conditions for the target edge cluster includes: if the number of edge computing nodes that maintain a heartbeat connection with the central management and control node is greater than or equal to the set number threshold, or the proportion of such nodes is greater than or equal to the set proportion threshold, determining that the central management and control node re-satisfies the management and control conditions for the target edge cluster.
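The threshold rule just described can be sketched as follows (the function and parameter names are illustrative assumptions, not the patent's terminology):

```python
def resatisfies_conditions(connected, total, count_threshold=None,
                           ratio_threshold=None):
    """True if enough edge computing nodes maintain a heartbeat connection
    with the central node: connected >= count_threshold, or
    connected/total >= ratio_threshold (whichever thresholds are set)."""
    if count_threshold is not None and connected >= count_threshold:
        return True
    if ratio_threshold is not None and total > 0 \
            and connected / total >= ratio_threshold:
        return True
    return False

print(resatisfies_conditions(6, 10, count_threshold=5))    # prints True
print(resatisfies_conditions(4, 10, ratio_threshold=0.5))  # prints False
```

Either condition alone suffices, matching the "number threshold or proportion threshold" wording above.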
  • the method of this embodiment further includes: determining, according to the number of received heartbeat messages from the edge computing nodes and the information on faulty edge computing nodes reported by the edge management and control node, the number or proportion of edge computing nodes that maintain a network connection with the central management and control node; wherein, when the edge management and control node fails to receive the heartbeat message of an edge computing node, it reports the failure information of that edge computing node to the central management and control node.
  • the method of this embodiment further includes: if the failure information of an edge computing node reported by the edge management and control node is received, and the heartbeat message of the faulty edge computing node is not received within a set time, rebuilding the containerized applications on the faulty edge computing node on other edge computing nodes in the target edge cluster.
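A hedged sketch of this rebuild rule (the data layout and names are assumptions, and node selection is simplified to the first healthy candidate rather than a real scheduler):

```python
def reschedule_if_dead(node_apps, last_heartbeat, now, timeout, healthy_nodes):
    """node_apps: dict node id -> list of containerized applications.
    Any node whose last heartbeat is older than `timeout` has its
    applications rebuilt on the first other healthy node."""
    placement = {n: list(apps) for n, apps in node_apps.items()}
    for node in list(placement):
        if now - last_heartbeat.get(node, 0) > timeout and placement[node]:
            target = next(n for n in healthy_nodes if n != node)
            placement[target].extend(placement[node])  # rebuild elsewhere
            placement[node] = []
    return placement

# node-a missed its heartbeat for 100s (timeout 30s): its app moves to node-b.
print(reschedule_if_dead(
    {"node-a": ["app-1"], "node-b": ["app-2"]},
    {"node-a": 0, "node-b": 95}, now=100, timeout=30,
    healthy_nodes=["node-b"]))
```

In Kubernetes terms this corresponds to evicting Pods from a node whose heartbeat has lapsed and rescheduling them onto healthy nodes in the same cluster.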
  • in the edge cloud system, cloud-native technologies such as containers are used to realize a cloud-edge fusion architecture, and the edge autonomy capability of the cloud-edge fusion architecture greatly improves the service capability of edge containerized applications. Further, after the central management and control node re-satisfies the management and control conditions, it does not depend on the edge management and control node's management and control state of the edge cluster, but re-manages and controls the containerized applications in the edge cluster based on its own management and control state of the edge cluster before the management and control conditions were not met. In this way, the two management and control nodes are loosely coupled and manage and control independently, and the edge autonomy capability is more flexible.
  • the execution subject of each step of the method provided in the above-mentioned embodiments may be the same device, or the method may also be executed by different devices.
  • the execution subject of steps 21a and 23a may be a central control node
  • the execution subject of step 22a may be an edge control node; and so on.
  • FIG. 3 a is a schematic structural diagram of an edge management and control apparatus provided by an exemplary embodiment of the present application.
  • the edge management and control device can be applied to the edge management and control nodes in the above-mentioned system. As shown in FIG. 3a, the device includes: a determination module 31a and a management and control module 32a.
  • the determining module 31a is used to determine that the central management and control node in the edge cloud system does not meet the management and control conditions for the target edge cluster, and the target edge cluster is the edge cluster to which the edge management and control node in the edge cloud system belongs.
  • the management and control module 32a is configured to manage and control the containerized applications in the target edge cluster when the central management and control node does not meet the management and control conditions for the target edge cluster, so that the containerized applications continue to provide services;
  • the management and control module 32a is specifically configured to: when it is determined that the central management and control node does not meet the management and control conditions for the target edge cluster, obtain the first management and control state of the containerized applications in the target edge cluster by the central management and control node before the management and control conditions were not met, and manage and control the containerized applications in the target edge cluster starting from the first management and control state.
  • the management and control module 32a, when acquiring the first management and control state, is configured to: determine, according to the first management and control data synchronized by the edge computing nodes in the target edge cluster, the first management and control state of the containerized applications in the target edge cluster by the central management and control node before the management and control conditions are not met; wherein the first management and control data is delivered by the central management and control node to the edge computing nodes, before the management and control conditions are not met, for them to manage and control the containerized applications in the target edge cluster.
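The state-recovery step above (rebuilding the central management and control node's last control state from the first management and control data cached on the edge computing nodes) can be sketched as follows. This is a minimal illustration only; the record layout and all names (`recover_control_state`, `revision`, etc.) are assumptions, not identifiers from this application:

```python
def recover_control_state(synced_records):
    """Rebuild the central control node's last management-and-control
    state from the first control data records that the edge computing
    nodes synchronize to the edge control node.

    Each record is assumed to look like:
        {"app": "app-a", "revision": 5, "spec": {...}}
    If several nodes hold different revisions of the same app's
    control data, the newest revision wins.
    """
    state = {}
    for record in synced_records:
        app = record["app"]
        if app not in state or record["revision"] > state[app]["revision"]:
            state[app] = record
    return state

# Records synced from two edge computing nodes (illustrative data).
state = recover_control_state([
    {"app": "app-a", "revision": 3, "spec": {"replicas": 2}},
    {"app": "app-a", "revision": 5, "spec": {"replicas": 3}},
    {"app": "app-b", "revision": 1, "spec": {"replicas": 1}},
])
```

When several edge computing nodes report different revisions of the same application's control data, the sketch keeps the newest record, which is one plausible way to reconcile partially synchronized data.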
  • the apparatus further includes: a generating module 33a and a sending module 34a.
  • the generating module 33a is used to generate, during the management and control of the containerized applications in the target edge cluster, the second management and control data required to manage and control the containerized applications in the target edge cluster, according to the service requirement information submitted by the edge service demander and/or the running status of the containerized applications.
  • the sending module 34a is configured to deliver the second management and control data to the edge computing nodes in the target edge cluster, so that the edge computing nodes can perform management and control operations for the containerized application.
  • the determining module 31a is specifically configured to: monitor the information of the edge computing nodes in the target edge cluster that synchronize the first management and control data to the edge management and control node, and determine, according to the information, that the central management and control node does not meet the management and control conditions for the target edge cluster; wherein an edge computing node synchronizes the first management and control data to the edge management and control node when the network connection between it and the central management and control node is disconnected.
  • the determining module 31a is specifically configured to: send a query message to the edge computing nodes in the target edge cluster to ask whether each edge computing node maintains a network connection with the central management and control node, and count, according to the response messages returned by the edge computing nodes, the number of edge computing nodes whose network connection with the central management and control node is disconnected; when the number is greater than or equal to a set number threshold, determine that the central management and control node does not meet the management and control conditions for the target edge cluster.
  • the determining module 31a is specifically configured to: upon receiving a management and control switching instruction sent by the edge service demander, determine that the central management and control node does not meet the management and control conditions for the target edge cluster; wherein the management and control switching instruction is sent by the edge service demander when it no longer needs the central management and control node to manage and control the target edge cluster, or when it determines that the central management and control node cannot manage and control the target edge cluster.
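The query-and-count condition described by the determining module 31a can be illustrated with a short sketch; the function name, data shapes, and threshold are hypothetical, not taken from this application:

```python
def central_control_unreachable(responses, number_threshold):
    """Decide whether the central control node no longer meets the
    control conditions: query each edge computing node, count those
    whose connection to the central node is down, and compare the
    count against the set number threshold.

    `responses` maps node id -> True if that node still reaches the
    central control node, False if the connection is disconnected.
    """
    disconnected = sum(1 for connected in responses.values() if not connected)
    return disconnected >= number_threshold

# Two of three nodes report a broken connection; with a threshold of 2,
# the edge control node would take over management and control.
take_over = central_control_unreachable(
    {"node-1": False, "node-2": False, "node-3": True},
    number_threshold=2,
)
```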
  • the edge management and control device can be implemented as an edge management and control node, including: a memory 31b, a processor 32b, and a communication component 33b.
  • the memory 31b is used to store computer programs and may be configured to store various other data to support operations on the edge management node. Examples of such data include instructions for any application or method operating on the edge governance node, contact data, phonebook data, messages, pictures, videos, etc.
  • the processor 32b, coupled with the memory 31b, is configured to execute the computer program in the memory 31b to: determine that the central management and control node in the edge cloud system does not meet the management and control conditions for the target edge cluster; and, while the central management and control node does not meet the management and control conditions for the target edge cluster, manage and control the containerized applications in the target edge cluster so that the containerized applications continue to provide services; the target edge cluster is the edge cluster to which the edge management and control node in the edge cloud system belongs.
  • when the processor 32b manages and controls the containerized applications in the target edge cluster, it is specifically configured to: when it is determined that the central management and control node does not meet the management and control conditions for the target edge cluster, obtain the first management and control state of the containerized applications in the target edge cluster by the central management and control node before the management and control conditions are not met, and manage and control the containerized applications in the target edge cluster starting from the first management and control state.
  • when the processor 32b acquires the first management and control state, it is specifically configured to: determine, according to the first management and control data synchronized by the edge computing nodes in the target edge cluster, the first management and control state of the containerized applications in the target edge cluster by the central management and control node before the management and control conditions are not met; wherein the first management and control data is delivered by the central management and control node to the edge computing nodes, before the management and control conditions are not met, for them to manage and control the containerized applications in the target edge cluster.
  • the processor 32b is further configured to: during the management and control of the containerized applications in the target edge cluster, generate, according to the service requirement information submitted by the edge service demander and/or the running status of the containerized applications, the second management and control data required to manage and control the containerized applications in the target edge cluster; and deliver the second management and control data to the edge computing nodes in the target edge cluster through the communication component 33b, so that the edge computing nodes can perform management and control operations on the containerized applications.
  • when determining that the central management and control node in the edge cloud system does not meet the management and control conditions for the target edge cluster, the processor 32b is specifically configured to: monitor the information of the edge computing nodes in the target edge cluster that synchronize the first management and control data to the edge management and control node, and determine, according to the information, that the central management and control node does not meet the management and control conditions for the target edge cluster; wherein an edge computing node synchronizes the first management and control data when its network connection with the central management and control node is disconnected.
  • when determining that the central management and control node in the edge cloud system does not meet the management and control conditions for the target edge cluster, the processor 32b is specifically configured to: send a query message to the edge computing nodes in the target edge cluster to ask whether each edge computing node maintains a network connection with the central management and control node, and count, according to the response messages returned by the edge computing nodes, the number of edge computing nodes whose network connection with the central management and control node is disconnected; when the number is greater than or equal to a set number threshold, determine that the central management and control node does not meet the management and control conditions for the target edge cluster.
  • when determining that the central management and control node in the edge cloud system does not meet the management and control conditions for the target edge cluster, the processor 32b is specifically configured to: upon receiving a management and control switching instruction sent by the edge service demander, determine that the central management and control node does not meet the management and control conditions for the target edge cluster; wherein the management and control switching instruction is sent by the edge service demander when it no longer needs the central management and control node to manage and control the target edge cluster, or when it determines that the central management and control node cannot manage and control the target edge cluster.
  • the edge management node further includes: a display 34b, a power supply component 35b, an audio component 36b and other components. Only some components are schematically shown in FIG. 3b, which does not mean that the edge management and control node only includes the components shown in FIG. 3b. In addition, some components shown in FIG. 3b, such as the components in the dashed box, are optional components, not mandatory components, which may depend on the device form of the edge management and control node.
  • the embodiments of the present application further provide a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the processor is caused to implement the steps in the foregoing method embodiments that can be executed by the edge management and control node.
  • an embodiment of the present application further provides a computer program product, including a computer program/instructions, which, when executed by a processor, cause the processor to implement the steps in the above method embodiments that can be executed by the edge management and control node.
  • FIG. 4a is a schematic structural diagram of another edge management and control apparatus according to an embodiment of the present application.
  • the edge management and control device can be applied to the central management and control node in the above-mentioned system. As shown in FIG. 4a, the device includes: a determination module 41a and a management and control module 42a.
  • the determining module 41a is configured to: during the period when the central management and control node does not meet the management and control conditions for the target edge cluster in the edge cloud system, determine that the management and control conditions for the target edge cluster are re-satisfied.
  • the management and control module 42a is configured to resume management and control of the containerized applications in the target edge cluster based on the central management and control node's management and control state of those applications before the management and control conditions are not met; wherein the target edge cluster is any edge cluster in the edge cloud system, and while the central management and control node does not meet the management and control conditions for the target edge cluster, the edge management and control node in the target edge cluster manages and controls the containerized applications in the target edge cluster.
  • the management and control module 42a is specifically configured to: obtain the second management and control state of the containerized applications in the target edge cluster by the edge management and control node; if the second management and control state is inconsistent with the first management and control state of the containerized applications in the target edge cluster by the central management and control node before the management and control conditions are not met, roll back the containerized applications in the target edge cluster from the second management and control state to the first management and control state; and, starting from the first management and control state, resume management and control of the containerized applications in the target edge cluster.
  • the management and control module 42a, when acquiring the second management and control state, is specifically configured to: determine, according to the second management and control data synchronized by the edge computing nodes in the target edge cluster, the second management and control state of the containerized applications in the target edge cluster by the edge management and control node; wherein the second management and control data is delivered by the edge management and control node to the edge computing nodes during its management and control of the target edge cluster, for them to manage and control the containerized applications in the target edge cluster.
  • the management and control module 42a, when rolling back the containerized applications in the target edge cluster from the second management and control state to the first management and control state, is specifically configured to: identify the containerized applications to be rolled back in the target edge cluster, where a containerized application to be rolled back is one for which the second management and control state differs from the first management and control state; determine whether the edge computing node where the containerized application to be rolled back is located is faulty; and if it is not faulty, roll back the containerized application to be rolled back from the second management and control state to the first management and control state.
  • the management and control module 42a is specifically configured to: if the containerized application to be rolled back is a containerized application newly added to the target edge cluster, delete the newly added containerized application; if the containerized application to be rolled back is an original containerized application deleted from the target edge cluster, rebuild the deleted original containerized application on the edge computing node where it was located; and if the containerized application to be rolled back is an original containerized application whose resource specification has changed in the target edge cluster, revert the resource specification of that containerized application to its resource specification in the first management and control state.
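The three rollback cases handled here (delete newly added applications, rebuild deleted ones, revert changed resource specifications), skipping applications on faulty nodes, can be sketched as a diff between the two control states. All names and the state layout are illustrative assumptions, not part of this application:

```python
def plan_rollback(first_state, second_state, faulty_nodes):
    """Diff the edge control node's state (second) against the central
    control node's last state (first) and emit rollback actions:
      - app only in the second state -> newly added, delete it
      - app only in the first state  -> was deleted, rebuild it
      - resource spec differs        -> revert to the first-state spec
    Apps hosted on faulty edge computing nodes are skipped.
    """
    actions = []
    for app, info in second_state.items():
        if info["node"] in faulty_nodes:
            continue
        if app not in first_state:
            actions.append(("delete", app))
        elif info["resources"] != first_state[app]["resources"]:
            actions.append(("revert", app, first_state[app]["resources"]))
    for app, info in first_state.items():
        if app not in second_state and info["node"] not in faulty_nodes:
            actions.append(("rebuild", app))
    return actions

first = {"a": {"node": "n1", "resources": "2c4g"},
         "b": {"node": "n2", "resources": "1c2g"}}
second = {"a": {"node": "n1", "resources": "4c8g"},
          "c": {"node": "n3", "resources": "1c1g"}}
actions = plan_rollback(first, second, faulty_nodes=set())
```

In the example, application "c" was added during autonomy and is deleted, "b" was deleted and is rebuilt, and "a" has its resource specification reverted.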
  • the determining module 41a is specifically configured to: during the period when the central management and control node does not meet the management and control conditions for the target edge cluster, count the information of the edge computing nodes in the target edge cluster that maintain a network connection with the central management and control node, and determine, according to this information, that the central management and control node re-satisfies the management and control conditions for the target edge cluster.
  • the determining module 41a is specifically configured to: if the number of edge computing nodes that maintain a heartbeat connection with the central management and control node is greater than or equal to a set number threshold, or the proportion of such nodes is greater than or equal to a set ratio threshold, determine that the central management and control node re-satisfies the management and control conditions for the target edge cluster.
  • the determining module 41a is further configured to: determine the number or proportion of edge computing nodes that maintain a network connection with the central management and control node according to the number of heartbeat messages received from the edge computing nodes and the information on faulty edge computing nodes reported by the edge management and control node; wherein, when the edge management and control node fails to receive the heartbeat message of an edge computing node, it reports the information that the edge computing node is faulty to the central management and control node.
  • the management and control module 42a is further configured to: upon receiving information on a faulty edge computing node reported by the edge management and control node, and not receiving a heartbeat message from the faulty edge computing node within a set time, rebuild the containerized applications on the faulty edge computing node on other edge computing nodes in the target edge cluster.
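The recovery condition (heartbeat count or ratio reaching a threshold) and the two-signal rebuild decision (failure reported by the edge management and control node, and no heartbeat within a set time) can be sketched as follows; the function names, thresholds, and signatures are assumptions for illustration only:

```python
def central_control_recovered(heartbeat_nodes, total_nodes, min_ratio=0.5):
    """Return True once the proportion of edge computing nodes whose
    heartbeats still reach the central control node meets the set
    ratio threshold, i.e. the central node may resume control."""
    return len(heartbeat_nodes) / total_nodes >= min_ratio

def should_rebuild(reported_faulty, last_heartbeat, now, grace_seconds=60):
    """Rebuild a node's containerized apps elsewhere only when BOTH
    signals agree: the edge control node reported the node faulty AND
    no heartbeat arrived within the grace window."""
    return reported_faulty and (now - last_heartbeat) > grace_seconds

# Two of three nodes still heartbeat to the central node -> recovered.
recovered = central_control_recovered({"node-1", "node-2"}, total_nodes=3)
# Reported faulty and silent well past the grace window -> rebuild.
rebuild = should_rebuild(reported_faulty=True, last_heartbeat=0, now=120)
```

Requiring both signals guards against rebuilding applications for a node that is merely partitioned from one side of the system.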
  • the edge management and control device can be implemented as a central management and control node, including: a memory 41b, a processor 42b, and a communication component 43b.
  • the memory 41b is used to store computer programs and can be configured to store various other data to support operations on the central management node. Examples of such data include instructions for any application or method operating on the central control node, contact data, phonebook data, messages, pictures, videos, etc.
  • the processor 42b, coupled with the memory 41b, is configured to execute the computer program in the memory 41b to: during the period when the central management and control node does not meet the management and control conditions for the target edge cluster in the edge cloud system, determine that the management and control conditions for the target edge cluster are re-satisfied; and resume management and control of the containerized applications in the target edge cluster based on the central management and control node's management and control state of those applications before the management and control conditions are not met; wherein the target edge cluster is any edge cluster in the edge cloud system, and while the central management and control node does not meet the management and control conditions for the target edge cluster, the edge management and control node in the target edge cluster manages and controls the containerized applications in the target edge cluster.
  • when the processor 42b resumes management and control of the containerized applications in the target edge cluster, it is specifically configured to: obtain the second management and control state of the containerized applications in the target edge cluster by the edge management and control node; if the second management and control state is inconsistent with the first management and control state of the containerized applications in the target edge cluster by the central management and control node before the management and control conditions are not met, roll back the containerized applications in the target edge cluster from the second management and control state to the first management and control state; and, starting from the first management and control state, resume management and control of the containerized applications in the target edge cluster.
  • when acquiring the second management and control state of the containerized applications in the target edge cluster by the edge management and control node, the processor 42b is specifically configured to: determine, according to the second management and control data synchronized by the edge computing nodes in the target edge cluster, the second management and control state of the containerized applications in the target edge cluster by the edge management and control node.
  • when rolling back the containerized applications in the target edge cluster from the second management and control state to the first management and control state, the processor 42b is specifically configured to: identify the containerized applications to be rolled back in the target edge cluster, where a containerized application to be rolled back is one for which the second management and control state differs from the first management and control state; determine whether the edge computing node where the containerized application to be rolled back is located is faulty; and if it is not faulty, roll back the containerized application to be rolled back from the second management and control state to the first management and control state.
  • when rolling back a containerized application to be rolled back from the second management and control state to the first management and control state, the processor 42b is specifically configured to: if the containerized application to be rolled back is a containerized application newly added to the target edge cluster, delete the newly added containerized application; if the containerized application to be rolled back is an original containerized application deleted from the target edge cluster, rebuild the deleted original containerized application on the edge computing node where it was located; and if the containerized application to be rolled back is an original containerized application whose resource specification has changed in the target edge cluster, revert the resource specification of that containerized application to its resource specification in the first management and control state.
  • when determining that the management and control conditions for the target edge cluster are re-satisfied, the processor 42b is specifically configured to: during the period when the management and control conditions for the target edge cluster are not met, count the information of the edge computing nodes in the target edge cluster that maintain a network connection with the central management and control node, and determine, according to this information, that the central management and control node re-satisfies the management and control conditions for the target edge cluster.
  • when determining, according to the above information, that the central management and control node re-satisfies the management and control conditions for the target edge cluster, the processor 42b is specifically configured to: if the number of edge computing nodes that maintain a heartbeat connection with the central management and control node is greater than or equal to a set number threshold, or the proportion of such nodes is greater than or equal to a set ratio threshold, determine that the central management and control node re-satisfies the management and control conditions for the target edge cluster.
  • the processor 42b is further configured to: determine the number or proportion of edge computing nodes that maintain a network connection with the central management and control node according to the number of heartbeat messages received from the edge computing nodes and the information on faulty edge computing nodes reported by the edge management and control node; wherein, when the edge management and control node fails to receive the heartbeat message of an edge computing node, it reports the information that the edge computing node is faulty to the central management and control node.
  • the processor 42b is further configured to: upon receiving information on a faulty edge computing node reported by the edge management and control node, and not receiving a heartbeat message from the faulty edge computing node within a set time, rebuild the containerized applications on the faulty edge computing node on other edge computing nodes in the target edge cluster.
  • the central management node further includes: a display 44b, a power supply component 45b, an audio component 46b and other components. Only some components are schematically shown in FIG. 4b, which does not mean that the central management and control node only includes the components shown in FIG. 4b. In addition, some components shown in FIG. 4b, such as the components in the dashed box, are optional components, not mandatory components, which may depend on the equipment form of the central management and control node.
  • the embodiments of the present application further provide a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the processor is caused to implement the steps in the above method embodiments that can be executed by the central management and control node.
  • an embodiment of the present application further provides a computer program product, including a computer program/instructions, which, when executed by a processor, cause the processor to implement the steps in the above method embodiments that can be executed by the central management and control node.
  • the memory in the above-described embodiments may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Magnetic or Optical Disk.
  • the communication components in the above embodiments are configured to facilitate wired or wireless communication between the device where the communication components are located and other devices.
  • the device where the communication component is located can access a wireless network based on a communication standard, such as WiFi, a mobile communication network such as 2G, 3G, 4G/LTE, 5G, or a combination thereof.
  • the communication component receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication assembly further includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • the display in the above-described embodiments includes a screen, and the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
  • the touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensor may not only sense the boundaries of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action.
  • a power supply assembly may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to the equipment in which the power supply assembly is located.
  • the audio components in the above-described embodiments may be configured to output and/or input audio signals.
  • the audio component includes a microphone (MIC) that is configured to receive external audio signals when the device in which the audio component is located is in operating modes, such as call mode, recording mode, and speech recognition mode.
  • the received audio signal may be further stored in memory or transmitted via the communication component.
  • the audio assembly further includes a speaker for outputting audio signals.
  • the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer-readable media include persistent and non-persistent, removable and non-removable media, and may implement information storage by any method or technology.
  • Information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Flash Memory or other memory technology, Compact Disc Read Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, Magnetic tape cassettes, magnetic tape magnetic disk storage or other magnetic storage devices or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
  • computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.


Abstract

The embodiments of the present application provide an edge cloud system, an edge management and control method, a management and control node, and a storage medium. The embodiments provide a cloud-edge fusion architecture that adopts dual management and control by the center and the edge, which greatly improves the edge autonomy of the cloud-edge fusion architecture and greatly enhances the service capability of edge containerized applications. Further, after the central management and control node re-satisfies the management and control conditions, it does not rely on the edge management and control node's control state of the edge cluster; instead, it resumes management and control of the containerized applications in the edge cluster based on its own control state of the edge cluster before the management and control conditions were not met. The two management and control nodes are thus loosely coupled and each controls independently, making edge autonomy more flexible.

Description

Edge cloud system, edge management and control method, management and control node, and storage medium
This application claims priority to Chinese Patent Application No. 202110139135.6, filed on February 1, 2021 and entitled "Edge Cloud System, Edge Management and Control Method, Management and Control Node, and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of edge cloud technologies, and in particular, to an edge cloud system, an edge management and control method, a management and control node, and a storage medium.
Background
With the arrival of the 5G and Internet-of-Things era and the growing adoption of cloud computing, terminals place ever higher demands on cloud computing resources in terms of latency, bandwidth, and other performance metrics. Centralized cloud networks can no longer satisfy the growing demands of the terminal side, which gave rise to edge computing. Edge computing is a form of distributed computing that moves data processing, storage, and similar work to edge nodes closer to the terminals; being close to the data source helps reduce service response latency and bandwidth costs.
As edge computing develops, a large number of applications need to be deployed on the edge side. Containers, as a cloud-native technology, have excellent characteristics such as being lightweight and portable, making them well suited to hosting application instances in edge computing scenarios. Moreover, because containers naturally have better affinity with applications, application instances can be deployed or shut down quickly and conveniently in a short time to satisfy real-time traffic on the edge side. However, how to orchestrate and schedule edge containers is a technical problem facing the fusion of cloud native and edge computing (cloud-edge fusion for short).
In the prior art, the open-source container orchestration and scheduling system Kubernetes is used to solve container orchestration and scheduling in cloud-edge fusion scenarios. Specifically, the Kubernetes master components can be hosted in the cloud, and the Kubernetes worker components can be deployed on edge computing nodes; the worker components connect over the public network to the master components in the cloud, and the master and worker components cooperate to orchestrate and schedule containers in cloud-edge fusion scenarios. However, because the cloud and the edge side are connected over the public network, objective factors such as public network latency and instability cause applications running on the edge side to frequently fall out of cloud control, so the serviceability of edge applications cannot be guaranteed.
Summary
Various aspects of the present application provide an edge cloud system, an edge management and control method, a management and control node, and a storage medium, to achieve edge autonomy under a cloud-edge fusion architecture and guarantee the serviceability of edge applications.
An embodiment of the present application provides an edge cloud system, including: a central management and control node, and at least one edge cluster network-connected to the central management and control node, where each edge cluster includes an edge management and control node and edge computing nodes, and containerized applications can be deployed on the edge computing nodes;
the edge management and control node is configured to, while the central management and control node does not meet the management and control conditions for the target edge cluster to which the edge management and control node belongs, manage and control the containerized applications in the target edge cluster so that the containerized applications continue to provide services;
the central management and control node is configured to, after re-satisfying the management and control conditions for the target edge cluster, resume management and control of the containerized applications in the target edge cluster based on its management and control state of those applications before the management and control conditions were not met.
An embodiment of the present application further provides an edge management and control method, applicable to an edge management and control node, the method including: determining that the central management and control node in the edge cloud system does not meet the management and control conditions for a target edge cluster; and, while the central management and control node does not meet the management and control conditions for the target edge cluster, managing and controlling the containerized applications in the target edge cluster so that the containerized applications continue to provide services; where the target edge cluster is the edge cluster in the edge cloud system to which the edge management and control node belongs.
An embodiment of the present application further provides an edge management and control method, applicable to a central management and control node, the method including: during the period when the management and control conditions for a target edge cluster in the edge cloud system are not met, determining that the management and control conditions for the target edge cluster are re-satisfied; and resuming management and control of the containerized applications in the target edge cluster based on the management and control state of those applications before the management and control conditions were not met; where the target edge cluster is any edge cluster in the edge cloud system, and while the central management and control node does not meet the management and control conditions for the target edge cluster, the edge management and control node in the target edge cluster manages and controls the containerized applications in the target edge cluster.
An embodiment of the present application further provides an edge management and control node, including a memory and a processor; the memory is configured to store a computer program; the processor, coupled with the memory, is configured to execute the computer program to: determine that the central management and control node in the edge cloud system to which the node belongs does not meet the management and control conditions for a target edge cluster; and, while the central management and control node does not meet the management and control conditions for the target edge cluster, manage and control the containerized applications in the target edge cluster so that the containerized applications continue to provide services; where the target edge cluster is the edge cluster in the edge cloud system to which the edge management and control node belongs.
An embodiment of the present application further provides a central management and control node, including a memory and a processor; the memory is configured to store a computer program; the processor, coupled with the memory, is configured to execute the computer program to: during the period when the management and control conditions for a target edge cluster in the edge cloud system to which the node belongs are not met, determine that the management and control conditions for the target edge cluster are re-satisfied; and resume management and control of the containerized applications in the target edge cluster based on the management and control state of those applications before the management and control conditions were not met; where the target edge cluster is any edge cluster in the edge cloud system, and while the central management and control node does not meet the management and control conditions for the target edge cluster, the edge management and control node in the target edge cluster manages and controls the containerized applications in the target edge cluster.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, causes the processor to implement the steps of any method provided by the embodiments of the present application.
An embodiment of the present application further provides a computer program product, including a computer program/instructions that, when executed by a processor, cause the processor to implement the steps of any method provided by the embodiments of the present application.
In the embodiments of the present application, a cloud-edge fusion architecture is provided. For this architecture, in addition to having the central management and control node manage the containerized applications in the edge clusters, an edge management and control node is added to each edge cluster, so that while the central management and control node does not meet the management and control conditions, the edge management and control node can manage the containerized applications in its own edge cluster. This dual center-and-edge control greatly improves the edge autonomy of the cloud-edge fusion architecture and greatly enhances the service capability of edge containerized applications. Further, after the central management and control node re-satisfies the management and control conditions, it does not depend on the edge management and control node's control state of the edge cluster; instead, it resumes management and control of the containerized applications in the edge cluster based on its own control state of the edge cluster before the management and control conditions were not met. The two management and control nodes are thus loosely coupled, each controlling independently, making edge autonomy more flexible.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings described here are provided for a further understanding of the present application and constitute a part of the present application. The exemplary embodiments of the present application and their descriptions explain the present application and do not constitute an undue limitation of it. In the drawings:
FIG. 1a is a schematic structural diagram of an edge cloud system provided by an exemplary embodiment of the present application;
FIG. 1b is a schematic diagram of the data-interaction relationships among the central management and control node, the edge management and control node, and the edge computing nodes in the edge cloud system provided by an embodiment of the present application;
FIG. 2a is a schematic flowchart of an edge management and control method provided by an exemplary embodiment of the present application;
FIG. 2b is a schematic flowchart of another edge management and control method provided by an exemplary embodiment of the present application;
FIG. 2c is a schematic flowchart of yet another edge management and control method provided by an exemplary embodiment of the present application;
FIG. 3a is a schematic structural diagram of an edge management and control apparatus provided by an exemplary embodiment of the present application;
FIG. 3b is a schematic structural diagram of an edge management and control node provided by an exemplary embodiment of the present application;
FIG. 4a is a schematic structural diagram of another edge management and control apparatus provided by an exemplary embodiment of the present application;
FIG. 4b is a schematic structural diagram of a central management and control node provided by an exemplary embodiment of the present application.
DETAILED DESCRIPTION
To make the objectives, technical solutions, and advantages of the present application clearer, the technical solutions of the present application are described below clearly and completely with reference to specific embodiments and the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without creative effort fall within the protection scope of the present application.
FIG. 1a is a schematic structural diagram of an edge cloud system provided by an exemplary embodiment of the present application. As shown in FIG. 1a, the edge cloud system 100 includes: a central management and control node 101, and at least one edge cluster 102 network-connected to the central management and control node 101. Each edge cluster 102 includes an edge management and control node 102a and edge computing nodes 102b, where containerized applications can be deployed on the edge computing nodes 102b.
The edge cloud system 100 of this embodiment is a cloud computing platform built on edge infrastructure using cloud computing technology and edge computing capabilities — a network system near the edge with computing, networking, storage, and security capabilities. "Edge cloud" is a relative concept: it refers to a cloud computing platform relatively close to terminals, where a terminal is the demand side of cloud computing services, for example a terminal or client in the Internet or in the Internet of Things. Put differently, the edge cloud system 100 of this embodiment differs from a central cloud or a traditional cloud computing platform, which may consist of large, centrally located data centers. The edge cloud system 100 includes at least one edge cluster 102; these edge clusters 102 cover a wider network range and are therefore closer to terminals. A single edge cluster 102 has a smaller resource scale, but the number of edge clusters 102 is relatively large.
In this embodiment, each edge cluster 102 includes a set of edge infrastructure, including but not limited to: distributed data centers (DC), wireless equipment rooms or clusters, operators' communication networks, core network equipment, base stations, edge gateways, home gateways, edge devices such as computing and/or storage devices, and the corresponding network environments. In some optional embodiments, an edge cluster 102 may be implemented as an Internet Data Center (IDC) located at the edge, i.e., one edge IDC is one edge cluster 102 of the embodiments of the present application; or an edge cluster 102 may be implemented as an equipment room located at the edge, i.e., one equipment room is one edge cluster 102. It should be noted that the location, capability, and infrastructure of different edge clusters 102 may be the same or different. Based on this infrastructure, an edge cluster 102 can provide various resources: resources with computing capability, such as CPUs, GPUs, servers, and computing devices; resources with storage capability, such as memory and disks; and network resources, such as bandwidth. In this embodiment, resources in an edge cluster 102 that have computing capability — for example servers and computing devices — are called edge computing nodes 102b, and each edge cluster 102 includes at least one edge computing node 102b.
The edge cloud system 100 of this embodiment can be applied in various scenarios such as content delivery networks (CDN), e-commerce, gaming, audio and video, the Internet of Things, logistics, industrial intelligence, and city intelligence, providing cloud computing services to end users in these scenarios. Specifically, for each application scenario, applications that provide the scenario's cloud computing services (hereinafter simply "applications") can be deployed in the edge clusters 102 of the edge cloud system 100, where deploying an application in an edge cluster 102 is in practice the process of deploying the application on the edge computing nodes 102b of that cluster. For example, in an e-commerce scenario, an application providing online shopping — for example the server side of an online shopping application, which interacts with shopping terminals to provide online shopping to users — can be deployed on the edge computing nodes 102b; in a gaming scenario, the server side of an online game, which interacts with game terminals to serve players, can be deployed; and in the audio/video field, applications such as live-streaming, video-on-demand, or video-monitoring servers, which interact with playback terminals to provide live, on-demand, or monitoring services to viewers, can be deployed.
Considering that large numbers of applications may need to be deployed in the edge cloud system 100, and given that containers, as a cloud-native technology, are lightweight and portable, this embodiment adopts containers: applications are hosted in containers and deployed in container units, realizing an architecture in which cloud-native computing converges with edge computing, called the cloud-edge convergence architecture for short. In this embodiment, an application hosted in a container is called a containerized application, or simply a container instance. By running these containerized applications deployed on the edge computing nodes 102b, the corresponding cloud computing services can be provided to end users. This raises the problem of deploying containerized applications. Beyond that, in practice, other management and control operations on containerized applications may be needed to guarantee their serviceability. For example, if the edge computing node 102b hosting a containerized application may fail, a hot migration of the application is involved; if the resources of the hosting edge computing node 102b are insufficient, a resource scale-up for the application is involved; and if the load on a containerized application is too high, a resource scale-up or an upgrade of the application is involved. This means the edge cloud system 100 of this embodiment faces various management and control problems for containerized applications.
In the edge cloud system 100 of this embodiment, a central management and control node 101 is deployed. The central management and control node 101 may take edge clusters 102 as its management objects and manage and control the containerized applications in each edge cluster 102. The central management and control node 101 may manage the containerized applications in a given edge cluster 102 according to service demand information submitted by an edge service demand side — for example, a request to deploy containerized applications in a specified region, to change the quality-of-service requirements of containerized applications, to upgrade containerized applications, or to increase or decrease the number of containerized applications. Here, the edge service demand side is the party that needs the edge clusters 102 to provide it with cloud computing services. In addition, the central management and control node 101 may automatically monitor the running state of the containerized applications in the edge clusters 102 — chiefly, whether each containerized application is running normally — and manage them accordingly.
Whichever of the above kinds of information is used, managing and controlling a containerized application includes at least one of: deployment, rebuild, upgrade, migration, resource scale-up, resource scale-down, shutdown, restart, and release of the containerized application. Further, whichever operation is performed, the central management and control node 101 carries it out through the edge computing nodes 102b of the edge cluster 102. Specifically, the central management and control node 101 may, according to the service demand information submitted by the edge service demand side and/or the monitored running state of the containerized applications, generate first management and control data needed to manage the containerized applications in the edge cluster 102, and deliver the first management and control data to the corresponding edge computing nodes 102b, which then perform at least one of deployment, upgrade, migration, resource scale-up, resource scale-down, shutdown, restart, and release on the containerized applications to be managed. The first management and control data is the data an edge computing node 102b needs to manage the containerized applications in its edge cluster 102; it includes the type of management operation and the various actions and parameters related to it. Note that depending on the service demand information and/or the running state of the containerized applications, the content of the first management and control data, the type of operation it indicates, and the operation the edge computing node 102b performs will all differ. Examples follow:
Deployment of containerized applications: in some optional embodiments, the central management and control node 101 exposes a demand submission entry, which may be a web page, an application page, or a command window, through which the edge service demand side submits its service demand information to the central management and control node 101. In the initial stage, the edge service demand side may submit, through this entry, service demand information requesting deployment of a containerized application, including: edge cluster selection parameters, resource selection parameters, and information pointing to the image file required by the application. The edge cluster selection parameters include the region where the edge cluster is located and/or performance requirements on the edge cluster, and are mainly used to select an edge cluster; the resource selection parameters include resource type, resource quantity, and performance requirements on the resource devices, and are mainly used to select edge computing nodes 102b within the edge cluster 102; the information pointing to the required image file may be the storage address of the image file, or the access address of a device that can provide it, and is used to obtain the image file required by the application. On this basis, the central management and control node 101 selects, from the at least one edge cluster 102, an edge cluster satisfying the edge cluster selection parameters; it then schedules edge computing nodes 102b in the target edge cluster 102 according to the resource selection parameters, and sends deployment-related management and control data to the scheduled edge computing node 102b. This data includes the information pointing to the required image file and the resource information needed by the containerized application, and instructs the scheduled edge computing node 102b to deploy the containerized application. The scheduled edge computing node 102b, according to this data, on the one hand reserves resources for the containerized application per the resource information, and on the other hand obtains the image file per the pointer information and runs it, completing deployment of the containerized application on the reserved resources.
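The cluster-selection and node-scheduling flow described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the data classes, field names (`region`, `free_cpu`, `free_mem_gb`), and the first-fit scheduling policy are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class EdgeNode:
    node_id: str
    free_cpu: int       # spare CPU cores
    free_mem_gb: int    # spare memory in GB

@dataclass
class EdgeCluster:
    cluster_id: str
    region: str
    nodes: list

def schedule_deployment(clusters, region, cpu, mem_gb, image_ref):
    """Pick a cluster matching the cluster-selection parameter (region),
    then a node with enough spare resources, and build the management and
    control data that would be delivered to that node."""
    for cluster in clusters:
        if cluster.region != region:
            continue
        for node in cluster.nodes:
            if node.free_cpu >= cpu and node.free_mem_gb >= mem_gb:
                return {
                    "op": "deploy",
                    "cluster": cluster.cluster_id,
                    "node": node.node_id,
                    "image": image_ref,  # pointer to the required image file
                    "resources": {"cpu": cpu, "mem_gb": mem_gb},
                }
    return None  # no cluster/node satisfies the request
```

A request for 4 cores and 8 GB in region "east" would skip a node with only 2 spare cores and land on one with 8, yielding the control data sent down to that node.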
While a containerized application is running, the edge service demand side may query, through the central management and control node 101, the log data produced by the containerized applications and the edge computing nodes 102b in the edge cluster 102, and from these logs learn the running state of the containerized application and of the edge computing node 102b hosting it, thereby deciding whether to upgrade, migrate, scale up, scale down, shut down, restart, or release the containerized application. Alternatively, the edge service demand side may decide on such operations based on the running state of the containerized application or its hosting node as returned by the central management and control node 101. Or the edge service demand side may, according to its service needs, proactively request the central management and control node 101 to perform such operations. Examples follow:
Scale-up and scale-down of containerized applications: while a containerized application is running, the edge service demand side may wish to scale up its resources because of service expansion, increased user traffic, or a need for better performance. It may submit, through the demand submission entry of the central management and control node 101, service demand information requesting a scale-up, including the resource increment or the total resources after the increment. The central management and control node 101 then sends management and control data indicating the scale-up to the edge computing node 102b hosting the containerized application; upon receiving it, that node scales up the application's resources accordingly — for example, increasing its allocated memory from 2 GB to 4 GB, or its allocated CPU cores from 2 to 4.
Of course, while the containerized application is running, the central management and control node 101 may also automatically monitor its running state; upon finding that the hosting edge computing node 102b is short of resources or that the application's load is too high, it may likewise send management and control data indicating a scale-up to the hosting node, which then performs the scale-up.
Similarly, the central management and control node 101 may, based on service demand information requesting a scale-down from the edge service demand side, or upon detecting that the hosting edge computing node 102b has surplus resources or that the application's load is low, send management and control data indicating a scale-down to the hosting node, which then scales down the application's resources — for example, reducing its allocated memory from 2 GB to 1 GB, or its allocated CPU cores from 2 to 1.
Rebuild of containerized applications: while a containerized application is running, if the edge service demand side receives a notification from the central management and control node 101 that the application's hosting edge computing node 102b has failed, it may, to keep the application serviceable, submit service demand information requesting a rebuild. This information includes an indication to rebuild the containerized application and, optionally, information identifying a new edge computing node 102b; the new node may also be chosen autonomously by the central management and control node 101 based on the load and remaining resources of the edge computing nodes 102b, which is not limited here. The central management and control node 101 then sends management and control data indicating a rebuild to the new edge computing node 102b; this data includes identification of the containerized application to be rebuilt and the required image file or file template, and the new node rebuilds the application locally from that image file or template.
Likewise, while the containerized application is running, the central management and control node 101 may automatically monitor the running state of the hosting edge computing node 102b; upon finding that the original hosting node has failed, it may, to keep the application serviceable, rebuild the failed node's containerized applications on a new edge computing node 102b by sending it management and control data indicating a rebuild, from whose image file or file template the new node rebuilds the applications locally.
Hot migration of containerized applications: while a containerized application is running, needs such as resource consolidation or an upgrade of the edge computing node may require migrating the application from its original edge computing node to a new one. To keep the application serviceable, hot migration can be used. The edge service demand side may submit service demand information requesting a hot migration, which includes an indication of hot migration and, optionally, information identifying the new edge computing node 102b, such as its IP address; the new node may also be chosen autonomously by the central management and control node 101 based on node load and remaining resources, which is not limited here. The central management and control node 101 then sends management and control data indicating hot migration to both the original and the new edge computing node 102b; this data includes the new node's information (e.g., IP address), identification of the containerized application to be migrated, and the required image file or file template. Upon receiving it, the two nodes establish a communication connection and begin migrating the containerized application — chiefly, synchronizing the application's state. During this period, the new node first creates the containerized application locally from the image file or template in the management and control data, then starts and runs it using the state data synchronized from the original node; once the application runs successfully on the new node, the original node releases the resources it occupied.
Of course, while the containerized application is running, the central management and control node 101 may also automatically monitor resource fragmentation on the edge computing nodes 102b. When it detects that a node is heavily fragmented and needs consolidation, it determines to hot-migrate that node's containerized applications to other edge computing nodes 102b, and sends management and control data indicating hot migration to the original and new nodes; the two nodes then establish a communication connection and migrate the applications, and once an application runs successfully on the new node, the original node releases the resources it occupied.
Upgrade of containerized applications: as service demands change or image versions are updated, the corresponding containerized applications deployed on the edge computing nodes 102b need upgrading. When service demand changes or an image version is upgraded, the edge service demand side may query, through the central management and control node 101, the running state of the application to be upgraded — for example its service requests and their response status — to judge whether and when it is suitable to upgrade and what method to use, then generate an upgrade policy for it and send an upgrade notification carrying the application's identifier and the policy to the central management and control node 101. The central management and control node 101 receives the notification, determines the identification of the application to be upgraded, such as its ID and name, and its upgrade policy, generates management and control data indicating the upgrade containing this information, and delivers it to the corresponding edge computing node 102b in the edge cluster 102 where the application resides; upon receiving it, the edge computing node 102b upgrades the identified containerized application according to the upgrade policy. The upgrade policy may include the upgrade time, the upgrade method, and so on.
Of course, an upgrade may also be initiated by the central management and control node 101 itself. For example, it may monitor the version of each containerized application's image and, upon finding a new version, determine that the corresponding application needs upgrading; or it may monitor the running state and life cycle of each application and, upon finding problems such as vulnerabilities, instability, incomplete functionality, or excessive CPU or memory consumption, determine that the affected applications need upgrading, generate management and control data indicating the upgrade (containing the application's ID, name, and upgrade policy), and deliver it to the corresponding edge computing node 102b, which upgrades the identified application according to the policy.
Here, upgrading an application on an edge computing node 102b chiefly means: shutting down the application, updating it from the new image version, and restarting it once the update completes.
In this embodiment, when the central management and control node 101 manages containerized applications through the edge computing nodes 102b of an edge cluster 102, it must stay network-connected to those nodes. In practice, the connection between the central management and control node 101 and an edge computing node 102b may break. So that the central management and control node 101 can sense this, the edge computing nodes 102b maintain a heartbeat connection with it: each node periodically reports heartbeat messages, and if the central management and control node 101 receives no heartbeat from a node within a certain time, it regards the connection to that node as broken. In that case, optionally, the central management and control node 101 stops scheduling that node, and the node and its deployed containerized applications keep their current state — for example, when a new containerized application needs to be deployed in the edge cluster 102 or an application needs hot migration, the central management and control node 101 selects only among the edge computing nodes 102b still connected to it and does not consider disconnected ones. Alternatively, since an edge cluster 102 usually contains multiple edge computing nodes 102b that are typically network-connected to one another, after the connection to one node breaks, the central management and control node 101 may manage the containerized applications on the disconnected node indirectly through its communication connections to the other edge computing nodes 102b.
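The heartbeat-timeout detection described above can be sketched as follows. This is a minimal illustration under assumptions of the example's own making: the class name `HeartbeatMonitor`, explicit timestamps instead of wall-clock reads, and the timeout value are all illustrative, not from the patent.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time reported by each edge computing node
    and lists the nodes whose heartbeat has exceeded the timeout, which the
    management and control node treats as disconnected."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen: dict[str, float] = {}

    def report(self, node_id: str, t: float) -> None:
        # Called whenever a heartbeat message arrives from node_id.
        self.last_seen[node_id] = t

    def disconnected(self, now: float) -> list[str]:
        # Nodes not heard from within timeout_s are considered disconnected.
        return [n for n, t in sorted(self.last_seen.items())
                if now - t > self.timeout_s]
```

With a 10-second timeout, a node last heard from 15 seconds ago is reported as disconnected while one heard from 7 seconds ago is not.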
Based on the above, in optional embodiments of the present application, whether the network connection between the central management and control node 101 and an edge cluster 102 is broken is considered at the granularity of the edge cluster 102; if it is broken, the central management and control node 101 cannot manage the containerized applications in that edge cluster 102. Further optionally, whether the connection is broken can be judged from information about the edge computing nodes 102b in the edge cluster 102 that remain in communication with the central management and control node 101.
In an optional embodiment, whether the connection between the central management and control node 101 and the edge cluster 102 is broken may be judged from the number, or the proportion, of edge computing nodes 102b in the edge cluster 102 that remain in communication with the central management and control node 101. For example, if that number is greater than or equal to a set count threshold, or that proportion is greater than or equal to a set ratio threshold, the connection between the central management and control node 101 and the edge cluster 102 is deemed maintained; conversely, if the number is below the count threshold, or the proportion below the ratio threshold, the connection is deemed broken. The values of the count and ratio thresholds are not limited by the embodiments of the present application. Taking the ratio threshold as an example, it may be 1, requiring the central management and control node 101 to maintain communication with all edge computing nodes of the edge cluster 102 before the connection is deemed maintained; or it may be 0.8, requiring communication with no fewer than 80% of the cluster's edge computing nodes.
In another optional embodiment, whether the connection between the central management and control node 101 and the edge cluster 102 is broken may be judged from whether the edge computing nodes 102b still in communication with the central management and control node 101 include designated edge computing nodes. For example, if the central management and control node 101 maintains communication with a designated edge computing node in the edge cluster 102, the connection is deemed maintained; conversely, if it maintains communication with none of the designated edge computing nodes, the connection is deemed broken. Here, a broken connection between the central management and control node 101 and the edge cluster 102 chiefly means the edge cluster 102 has escaped, or has essentially escaped, the control of the central management and control node 101, and may cover any circumstance that causes the connection to break.
In the embodiments of the present application, besides being unable to manage the containerized applications in an edge cluster 102 when its network connection to that cluster is broken, the central management and control node 101 may be unable to manage them for other reasons. For example, the edge service demand side may configure the central management and control node 101 to not manage the cluster's containerized applications during a set period; or the central management and control node 101 may be unable to manage them while it is being upgraded. In view of this, a management and control condition corresponding to the edge cluster 102 may be preconfigured, for example: the central management and control node 101 must maintain communication with the edge cluster 102, must not be in an upgrade period, and must not be inside a no-control period set by the edge service demand side. The central management and control node 101 can then perform the various management and control operations on the cluster's containerized applications while it satisfies the condition.
While the central management and control node 101 does not satisfy the management and control condition for an edge cluster 102, the absence of management may degrade the serviceability of the containerized applications, possibly to the point of failing to provide service — for example, when an application's hosting edge computing node 102b fails, the application cannot continue serving because it was not promptly rebuilt on another node. To guarantee serviceability, in the edge cloud system 100 of this embodiment an edge management and control node 102a is added to each edge cluster 102; while the central management and control node 101 does not satisfy the condition for that node's edge cluster 102, the edge management and control node 102a takes charge of managing the cluster's containerized applications. Further, after the central management and control node 101 satisfies the condition again, it may reclaim control and resume managing the cluster's containerized applications. This dual, center-plus-edge management greatly improves the edge autonomy of the cloud-edge convergence architecture and the service capability of edge containerized applications.
Further, in this embodiment, after the central management and control node 101 satisfies the management and control condition for the edge cluster 102 again, it resumes managing the cluster's containerized applications based on its own management and control state of those applications from before the condition ceased to be satisfied, rather than depending on the edge management and control node 102a's state. Note that when the central management and control node 101 determines that the condition is no longer satisfied, it may also record its management and control state of the cluster's containerized applications at that moment, denoted the first management and control state, so that after the condition is satisfied again it can resume from that prior state without relying on the edge management and control node 102a's state. The central management and control node 101 and the edge management and control node 102a are thus loosely coupled and manage independently, making edge autonomy more flexible. Here, the management and control state of the containerized applications in an edge cluster 102, whether by the edge management and control node 102a or the central management and control node 101, means the state the applications reach after being managed according to the corresponding management and control data, and may include, without limitation: which containerized applications the cluster contains, which edge computing nodes they run on, their resource specifications, creation time, and whether they are shut down.
In this embodiment, while the edge management and control node 102a manages its edge cluster 102 — that is, while the central management and control node 101 does not satisfy the condition for that cluster — the edge management and control node 102a can likewise perform at least one of upgrade, migration, resource scale-up, resource scale-down, shutdown, restart, and release on the cluster's containerized applications. Moreover, whichever operation it performs, it carries it out through the cluster's edge computing nodes 102b. Specifically, the edge management and control node 102a may, according to service demand information submitted by the edge service demand side and/or the monitored running state of the containerized applications, generate second management and control data needed to manage the cluster's containerized applications, and deliver it to the corresponding edge computing nodes 102b, which then perform at least one of deployment, upgrade, migration, resource scale-up, resource scale-down, shutdown, restart, and release on the applications to be managed. The second management and control data is the data an edge computing node 102b needs to manage the containerized applications in its edge cluster 102, including the type of management operation and the related actions and parameters. As with the first management and control data, its content, the indicated operation type, and the operation performed by the edge computing node 102b all vary with the service demand information and/or the applications' running state.
It should be noted that while the edge management and control node 102a manages the containerized applications in its edge cluster 102, the edge service demand side may interact with the edge management and control node 102a: it may query the applications' running state through it, and may initiate management operations on the applications toward the edge computing nodes 102b. In other words, the edge management and control node 102a's management operations may be initiated by the edge service demand side; alternatively, the edge management and control node 102a may automatically monitor the running state of the applications and/or their hosting edge computing nodes in its cluster and initiate operations based on the results. In either case, the processes by which the edge management and control node 102a performs the various management operations are the same as or similar to those of the central management and control node 101 described above and are not repeated here. The chief difference is that the edge management and control node 102a, upon determining that the central management and control node 101 does not satisfy the management and control condition for the edge cluster, may obtain the management and control state that the central management and control node 101 had for the cluster's containerized applications before the condition ceased to be satisfied, and continue managing the applications starting from that state. For ease of distinction and description, that state of the central management and control node 101 is denoted the first management and control state.
When the edge management and control node 102a manages its cluster's containerized applications through the cluster's edge computing nodes 102b, it must be network-connected to them. In practice, the connection between the edge management and control node 102a and an edge computing node 102b may break. So that the edge management and control node 102a can sense this, the edge computing nodes 102b maintain a heartbeat connection with it: each node periodically reports heartbeat messages, and if the edge management and control node 102a receives no heartbeat from a node within a certain time, it regards the connection to that node as broken.
Further optionally, whether during the central management and control node 101's control or the edge management and control node 102a's control, when no heartbeat is received from an edge computing node 102b within a certain time, the edge management and control node 102a may also report, over its network connection to the central management and control node 101, information about the failed edge computing node 102b — at least its identification — to help the central management and control node 101 judge whether its connection to that node's edge cluster 102 is broken. Note that in this embodiment, the same edge computing node 102b may simultaneously establish network connections with both its cluster's edge management and control node 102a and the central management and control node 101; at a given moment it may stay connected to only one of them with the other connection broken, or both connections may be broken. Accordingly, when judging whether its connection to a given edge computing node 102b is broken, the central management and control node 101 may check whether it has received the node's heartbeat within a certain time; if it has not, and it has also received the edge management and control node 102a's report that the node has failed, it can determine that its connection to the node is broken. This two-step judgment helps improve the accuracy of the determination.
Further, while the central management and control node 101 is managing the containerized applications in the edge cluster 102, if it receives no heartbeat from a given edge computing node 102b within a set time and has also received the edge management and control node 102a's report that the node has failed, it can determine that the node has actually failed rather than merely being disconnected by a network fault. To ensure that the containerized applications on the failed node can continue serving, the central management and control node 101 may rebuild them on other edge computing nodes in the edge cluster 102: specifically, it may select other edge computing nodes and deliver to them management and control data for rebuilding the containerized applications, containing the applications' identification and the required image files or file templates, instructing those nodes to rebuild the applications.
Whether it is the central management and control node 101 or the cluster's own edge management and control node 102a managing the containerized applications in an edge cluster 102, the management is carried out through the cluster's edge computing nodes 102b. For an edge computing node 102b, while performing management operations according to the first management and control data delivered by the central management and control node 101, it may also cache the first management and control data locally; in addition, it may synchronize the locally cached first management and control data to the edge management and control node 102a of its edge cluster 102. Thus, when the edge management and control node 102a determines that the central management and control node 101 does not satisfy the management and control condition for the edge cluster 102, it can determine, from the first management and control data synchronized by the edge computing nodes 102b, the first management and control state that the central management and control node 101 had for the cluster's containerized applications before the condition ceased to be satisfied, and continue managing the applications starting from that first management and control state.
This embodiment does not limit when the edge computing node 102b synchronizes the first management and control data to the edge management and control node 102a; synchronization may occur at any time before the edge management and control node 102a needs it. In an optional embodiment, the edge computing node 102b synchronizes the first management and control data to its cluster's edge management and control node 102a upon detecting that its network connection to the central management and control node 101 is broken. On this basis, the edge management and control node 102a can monitor information about the edge computing nodes 102b in its cluster that synchronize first management and control data to it, and determine from that information that the central management and control node 101 no longer satisfies the management and control condition for the cluster. For example, if the information is the number of such nodes, and the edge management and control node 102a finds that the number of nodes synchronizing data to it is greater than or equal to a set count threshold, or that their proportion is greater than or equal to a set ratio threshold, this means the number (or proportion) of the cluster's edge computing nodes still connected to the central management and control node 101 is below the corresponding threshold, so the connection between the central management and control node 101 and the cluster is deemed broken — a case of not satisfying the management and control condition. As another example, if the information shows that the synchronizing nodes include all designated edge computing nodes, this means none of the cluster's designated edge computing nodes remains connected to the central management and control node 101, so the connection is likewise deemed broken and the condition not satisfied.
In the above embodiment, the edge management and control node 102a determines whether the edge computing nodes 102b are disconnected from the central management and control node 101 based on the number of nodes reporting first management and control data to it. Alternatively, the edge management and control node 102a may proactively send query messages to the edge computing nodes 102b asking whether they remain network-connected to the central management and control node 101; each edge computing node 102b returns a response message indicating whether its connection to the central management and control node 101 is maintained or broken. From these responses, the edge management and control node 102a counts the number of edge computing nodes 102b disconnected from the central management and control node 101; when that number is greater than or equal to a set count threshold, it determines that the central management and control node is disconnected from the edge cluster 102 and does not satisfy the management and control condition for it.
Similarly, for an edge computing node 102b, while performing management operations according to the second management and control data delivered by the edge management and control node 102a, it may also cache the second management and control data locally; in addition, it may synchronize the locally cached second management and control data to the central management and control node 101 — optionally, when the central management and control node 101 satisfies the management and control condition for the edge cluster 102 again. Thus, when the central management and control node 101 determines that it satisfies the condition again, it can determine, from the second management and control data synchronized by the edge computing nodes 102b, the second management and control state that the edge management and control node 102a established for the cluster's containerized applications while the condition was not satisfied, and judge whether this second management and control state is consistent with the first management and control state it had for the cluster's applications before the condition ceased to be satisfied. If the second state is inconsistent with the first, the cluster's containerized applications must be rolled back from the second state to the first, and management resumes from the first state. Conversely, if the two states are consistent, management simply continues from the first (equivalently, the second) state.
Further, the central management and control node 101 may compare the second management and control state with the first; the comparison identifies the containerized applications in the edge cluster 102 to be rolled back, namely those that differ between the second and first management and control states. It then judges whether the edge computing node 102b hosting each application to be rolled back has failed; if not, it performs the rollback, i.e., rolls the application back from the second state to the first. Depending on the management situation, the applications to be rolled back and their hosting edge computing nodes will differ, as the following examples illustrate:
Optionally, an application to be rolled back may be a newly added containerized application in the edge cluster 102, i.e., one absent from the first management and control state but present in the second. For example, during the edge management and control node 102a's control, if an application was rebuilt because an edge computing node failed, the application rebuilt on another node is a newly added application of the cluster; likewise, applications added because the service demand increased their number are newly added applications. In this case, rolling the application back from the second state to the first specifically means deleting the newly added application. Here, the hosting edge computing node 102b of the application to be rolled back is the node currently hosting the newly added application, and the central management and control node 101 deletes it from that node provided the node has not failed (is normal).
Optionally, an application to be rolled back may be an original containerized application deleted from the edge cluster 102, i.e., one present in the first management and control state but absent from the second. For example, during the edge management and control node 102a's control, the service demand may have reduced the number of containerized applications, requiring some to be deleted; relative to the first state, these are deleted original applications. Or, if an application was rebuilt because an edge computing node failed, the containerized application on the failed node may also be regarded as a deleted original application. In this case, the rollback specifically means rebuilding the deleted original application on the edge computing node from which it was deleted. Here, the hosting edge computing node 102b of the application to be rolled back is the node that originally hosted the deleted application, and the central management and control node 101 rebuilds it on that node provided the node has not failed (is normal).
Optionally, an application to be rolled back may be an original containerized application whose resource specification changed, i.e., one present in the edge cluster 102 in both the first and second management and control states but with different resource specifications. In this case, the rollback specifically means restoring the application's resource specification to what it was in the first management and control state. Here, the hosting edge computing node 102b of the application to be rolled back is the node that has hosted it throughout.
Further optionally, if the edge computing node 102b hosting an application to be rolled back has failed, the rollback of that application may be skipped, or adaptive handling may be applied according to the situation. For example, during the edge management and control node 102a's control, applications were rebuilt because edge computing node A failed; suppose node A's applications were rebuilt on edge computing node B, with the same number of applications and the same resource configurations. During rollback, the central management and control node 101 must on the one hand delete the rebuilt applications on node B, and on the other hand rebuild the original applications on node A; if node A is still in a failed state, the central management and control node 101 may select another normal edge computing node in the edge cluster 102 and rebuild on it the applications originally on node A.
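The three rollback categories above (newly added, deleted, and resource-changed applications) amount to a diff between the first and second management and control states. The following is a minimal sketch under assumptions of this example's own making: the state representation (`app_id -> {"node", "cpu"}`), the action tuples, and the policy of simply skipping applications on failed nodes are illustrative.

```python
def plan_rollback(first_state: dict, second_state: dict,
                  failed_nodes: set) -> list:
    """Diff the pre-disconnection state (first) against the edge-managed
    state (second) and emit rollback actions, skipping applications whose
    hosting node has failed. States map app_id -> {"node": ..., "cpu": ...}."""
    actions = []
    # Newly added while the edge node was in control: delete them.
    for app in sorted(set(second_state) - set(first_state)):
        if second_state[app]["node"] not in failed_nodes:
            actions.append(("delete", app, second_state[app]["node"]))
    # Deleted while the edge node was in control: rebuild on the old node.
    for app in sorted(set(first_state) - set(second_state)):
        if first_state[app]["node"] not in failed_nodes:
            actions.append(("rebuild", app, first_state[app]["node"]))
    # Resource specification changed: restore the original specification.
    for app in sorted(set(first_state) & set(second_state)):
        old, new = first_state[app], second_state[app]
        if new["cpu"] != old["cpu"] and new["node"] not in failed_nodes:
            actions.append(("resize", app, old["cpu"]))
    return actions
```

In the scale-up example below (2 cores grown to 4 during edge control, plus an application added on another node), the plan would delete the added application and resize the original one back to 2 cores; if the added application's node has failed, only the resize remains.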
Taking, as an example, the edge management and control node 102a scaling up the resources of a containerized application in its cluster during its control, the rollback performed after the central management and control node 101 resumes control is illustrated as follows:
Suppose that while the central management and control node 101 was managing the edge cluster 102, a containerized application A occupying 2 CPU cores was deployed in the cluster, application A being an online-education service. For a period, the public network between the central management and control node 101 and the edge cluster 102 fails, so the central management and control node 101 cannot manage the cluster, and the cluster's edge management and control node 102a takes charge of managing its containerized applications. During this period the school holidays arrive, the online-education application's user count surges, application A's service requests multiply and its load becomes heavy, so the edge management and control node 102a scales up application A, raising its allocated CPU cores from 2 to 4. Some time later the network fault clears; the central management and control node 101 resumes managing the edge cluster 102 and finds that application A now occupies 4 CPU cores, differing from its resource specification before the fault. It therefore first scales application A down, restoring its CPU cores from 4 to 2, and then continues managing it from that state. During this management it finds application A heavily loaded with large response delays, failing the service demand side's current requirements, so per those requirements it performs a scale-up, expanding application A's CPU cores from 2 to 4, 5, or more through the cluster's edge computing nodes 102b. Note that after rolling the cluster's containerized applications back to the management and control state it held before the condition ceased to be satisfied (the first management and control state above), the central management and control node 101 may, starting from that first state and in combination with the edge service demand side's current service demand, resume managing the cluster's containerized applications: if the current demand requires a hot migration, it hot-migrates them from the first state onward; if a rebuild, it rebuilds them from the first state onward; if an upgrade, it upgrades them from the first state onward.
As for other management performed on the cluster's containerized applications by the edge management and control node 102a during its control, the central management and control node 101 performs similar rollback handling when it resumes managing the edge cluster 102, which is not detailed further here.
It should be noted that, besides the automatic switchover described above, which depends on whether the central management and control node satisfies the management and control condition for the edge cluster, the edge service demand side may, upon determining that the central management and control node cannot manage the target edge cluster, issue a management switchover instruction to the edge management and control node, indicating that management authority switches from the central management and control node to the edge management and control node. On this basis, the edge management and control node may also determine that the central management and control node does not satisfy the management and control condition for the edge cluster upon receiving such an instruction from the edge service demand side. Beyond that, the edge service demand side may switch flexibly between the central and edge management and control nodes according to application needs: for example, even when the central management and control node satisfies the condition, if a special need requires that the cluster's containerized applications be managed by the cluster's edge management and control node instead, the edge service demand side may send a management switchover instruction to the edge management and control node, instructing it to manage the containerized applications in its edge cluster.
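The overall decision of which node holds control — combining the condition components named earlier (network connectivity, upgrade period, demand-side no-control window) with the manual switchover instruction — can be sketched as follows. The function names and boolean inputs are assumptions of this example; the patent does not prescribe a concrete signature.

```python
def control_condition_met(network_ok: bool, upgrading: bool,
                          in_blackout: bool) -> bool:
    """The management and control condition described above: the central
    node holds control only if its connection to the cluster is maintained,
    it is not being upgraded, and it is not inside a no-control window
    configured by the edge service demand side."""
    return network_ok and not upgrading and not in_blackout

def active_master(network_ok: bool, upgrading: bool, in_blackout: bool,
                  manual_switch: bool = False) -> str:
    # A manual switchover instruction from the edge service demand side
    # hands control to the edge node even when the condition still holds.
    if manual_switch or not control_condition_met(network_ok, upgrading,
                                                  in_blackout):
        return "edge"
    return "central"
```

So a network fault, an upgrade window, a blackout window, or an explicit switchover instruction each independently hand control to the edge management and control node.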
In the embodiments of the present application, management and control data and management and control state are synchronized between the central management and control node 101 and the edge management and control node 102a by way of the edge computing nodes 102b of the edge cluster 102, forming the interaction loop shown in FIG. 1b. As shown in FIG. 1b, during the central management and control node 101's control, the first management and control data is provided to the edge computing nodes 102b, which cache it locally and synchronize it to the edge management and control node 102a; during the edge management and control node 102a's control, the second management and control data is provided to the edge computing nodes 102b, which cache it locally and synchronize it to the central management and control node 101. Dual center-plus-edge management combined with caching improves the edge autonomy of the cloud-edge convergence architecture and the service capability of edge containerized applications, enables rapid recovery even when the cloud-edge network is abnormal while keeping edge containerized applications serving without interruption, and this synchronization approach also guarantees the consistency of the synchronized data. Further, after the central management and control node satisfies the condition again, it resumes managing the cluster's containerized applications based on its own pre-disconnection management and control state rather than the edge management and control node's state, so the two management and control nodes are loosely coupled and manage independently, making edge autonomy more flexible.
In this embodiment, the central management and control node 101 and the edge management and control node 102a cooperate to perform the various management operations on the containerized applications in the edge cluster 102 and ensure their serviceability, but the specific technology they use is not limited. In an optional embodiment of the present application, as shown in FIG. 1a, Kubernetes (K8s) may be used: Kubernetes master components are deployed on the central management and control node 101 and the edge management and control node 102a respectively — denoted the central master component and the edge master component — and Kubernetes worker components are deployed on the edge computing nodes 102b of the edge cluster 102. The two master components cooperate with the worker components to perform the various management operations on the cluster's containerized applications, such as deployment, upgrade, migration, resource scale-up, resource scale-down, shutdown, restart, and release. In embodiments using K8s, containerized applications may be organized and managed as Pods, the smallest schedulable atomic unit. While the central master component satisfies the management and control condition for the edge cluster, it manages the cluster's Pods through the worker components; while it does not, the cluster's edge master component manages the Pods through the worker components. With the central and edge master components cooperating, the edge side can still perform management operations on edge applications — migration, scale-up/scale-down, upgrade, and so on — when the cloud-edge network is broken, keeping edge applications serving without interruption. A broken cloud-edge network chiefly means the edge side has escaped cloud control. In addition, with K8s the edge service demand side can perform native K8s operations and maintenance at the edge, such as querying the running state and log data of edge clusters and edge computing nodes, or logging into Pods to perform various maintenance operations; moreover, K8s's orchestration and scheduling capabilities can be fully leveraged to improve the production and operations efficiency of edge computing scenarios.
It should be noted that besides managing the containerized applications in the edge clusters 102, the central management and control node 101 can also manage the edge clusters 102 in aspects such as resource scheduling, operations and maintenance, networking, and security, so that edge services are handled in the edge clusters 102. In terms of deployment, the central management and control node 101 may be deployed in one or more cloud computing data centers, in one or more traditional data centers, or in one or more of the edge clusters 102 it manages, which this embodiment does not limit. In the edge cloud system 100 of this embodiment, tasks such as network forwarding, storage, computing, and/or intelligent data analysis can be handled in the edge clusters 102; since the edge clusters are closer to terminals, this reduces response latency, relieves pressure on the central cloud or traditional cloud computing platform, and lowers bandwidth cost.
FIG. 2a is a schematic flowchart of an edge management and control method provided by an exemplary embodiment of the present application. The method applies to the edge cloud system shown in FIG. 1a, and as shown in FIG. 2a it comprises:
21a. While the central management and control node satisfies the management and control condition for a target edge cluster, the central management and control node manages and controls the containerized applications in the target edge cluster; the target edge cluster is any edge cluster in the edge cloud system.
22a. While the central management and control node does not satisfy the management and control condition for the target edge cluster, the edge management and control node in the target edge cluster manages and controls the containerized applications in the target edge cluster so that they continue to provide services.
23a. After the central management and control node satisfies the management and control condition for the target edge cluster again, it resumes managing and controlling the containerized applications in the target edge cluster based on its management and control state of those applications from before the condition ceased to be satisfied.
FIG. 2b is a schematic flowchart of another edge management and control method provided by an exemplary embodiment of the present application, described from the perspective of the edge management and control node. As shown in FIG. 2b, the method comprises:
21b. The edge management and control node determines that the central management and control node in the edge cloud system does not satisfy the management and control condition for a target edge cluster; the target edge cluster is the edge cluster in the edge cloud system to which the edge management and control node belongs.
22b. While the central management and control node does not satisfy the management and control condition for the target edge cluster, managing and controlling the containerized applications in the target edge cluster so that they continue to provide services.
In an optional embodiment, one implementation of managing the containerized applications in the target edge cluster while the central management and control node does not satisfy the condition comprises: upon determining that the central management and control node does not satisfy the management and control condition for the target edge cluster, obtaining the first management and control state that the central management and control node had for the cluster's containerized applications before the condition ceased to be satisfied; and managing the cluster's containerized applications starting from the first management and control state.
Further optionally, one implementation of obtaining the first management and control state comprises: determining, from the first management and control data synchronized by the edge computing nodes in the target edge cluster, the first management and control state that the central management and control node had for the cluster's containerized applications before the condition ceased to be satisfied; wherein the first management and control data was delivered by the central management and control node to the edge computing nodes, before the condition ceased to be satisfied, for them to manage the cluster's containerized applications.
In an optional embodiment, the method of this embodiment further comprises: while managing the containerized applications in the target edge cluster, generating, according to service demand information submitted by the edge service demand side and/or the running state of the containerized applications, second management and control data needed to manage the cluster's containerized applications; and delivering the second management and control data to the edge computing nodes in the target edge cluster for them to perform the management operations on the containerized applications.
Further optionally, one implementation of determining that the central management and control node in the edge cloud system does not satisfy the management and control condition for the target edge cluster comprises: monitoring information about the edge computing nodes in the target edge cluster that synchronize the first management and control data to the edge management and control node, and determining from that information that the central management and control node does not satisfy the condition; wherein an edge computing node synchronizes the first management and control data to the edge management and control node when its network connection to the central management and control node is broken. Alternatively,
another implementation comprises: sending query messages to the edge computing nodes in the target edge cluster to ask whether they remain network-connected to the central management and control node; counting, from the response messages returned by the edge computing nodes, the number of edge computing nodes disconnected from the central management and control node; and, when that number is greater than or equal to a set count threshold, determining that the central management and control node does not satisfy the condition. Alternatively,
yet another implementation comprises: determining, upon receiving a management switchover instruction sent by the edge service demand side, that the central management and control node does not satisfy the management and control condition for the target edge cluster; wherein the management switchover instruction is sent when the edge service demand side does not need the central management and control node to manage the target edge cluster, or upon determining that the central management and control node cannot manage the target edge cluster.
FIG. 2c is a schematic flowchart of yet another edge management and control method provided by an exemplary embodiment of the present application, described from the perspective of the central management and control node. As shown in FIG. 2c, the method comprises:
21c. While not satisfying the management and control condition for a target edge cluster in the edge cloud system, the central management and control node determines that the condition for the target edge cluster is satisfied again; the target edge cluster is any edge cluster in the edge cloud system.
22c. Resuming management and control of the containerized applications in the target edge cluster based on the management and control state of those applications from before the condition ceased to be satisfied; wherein, while the central management and control node does not satisfy the condition, the edge management and control node in the target edge cluster manages and controls the cluster's containerized applications.
In an optional embodiment, the method further comprises: before the central management and control node ceases to satisfy the management and control condition for the target edge cluster, generating, according to service demand information submitted by the edge service demand side and/or the running state of the containerized applications, first management and control data needed to manage the cluster's containerized applications; and delivering the first management and control data to the edge computing nodes in the target edge cluster for them to perform, according to it, the management operations on the containerized applications.
In an optional embodiment, one implementation of resuming management based on the pre-disconnection management and control state comprises: obtaining the second management and control state established by the edge management and control node for the cluster's containerized applications; if the second management and control state is inconsistent with the first management and control state that the central management and control node had before the condition ceased to be satisfied, rolling the cluster's containerized applications back from the second state to the first; and resuming management of the cluster's containerized applications starting from the first state.
Further optionally, one implementation of rolling the cluster's containerized applications back from the second state to the first comprises: identifying the containerized applications in the target edge cluster to be rolled back, namely those that differ between the second and first management and control states; judging whether the edge computing node hosting each such application has failed; and, if not, rolling the application back from the second state to the first.
Further optionally, rolling an application back from the second state to the first comprises at least one of the following cases:
if the application to be rolled back is a newly added containerized application of the target edge cluster, deleting the newly added application;
if the application to be rolled back is an original containerized application deleted from the target edge cluster, rebuilding the deleted application on the edge computing node from which it was deleted;
if the application to be rolled back is an original containerized application of the target edge cluster whose resource specification changed, restoring its resource specification to what it was in the first management and control state.
Further optionally, one implementation of obtaining the second management and control state comprises: determining, from the second management and control data synchronized by the edge computing nodes in the target edge cluster, the second management and control state established by the edge management and control node for the cluster's containerized applications; wherein the second management and control data was delivered by the edge management and control node to the edge computing nodes, during its control of the target edge cluster, for them to manage the cluster's containerized applications.
In an optional embodiment, one implementation of determining, while the condition is not satisfied, that the condition for the target edge cluster is satisfied again comprises: while the condition is not satisfied, counting information about the edge computing nodes in the target edge cluster that remain network-connected to the central management and control node; and determining from that information that the central management and control node satisfies the condition again.
Further optionally, one implementation of determining from that information that the condition is satisfied again comprises: if the number of edge computing nodes maintaining a heartbeat connection with the central management and control node is greater than or equal to a set count threshold, or their proportion is greater than or equal to a set ratio threshold, determining that the central management and control node satisfies the management and control condition for the target edge cluster again.
Further optionally, the method of this embodiment further comprises: determining the number or proportion of edge computing nodes maintaining a network connection with the central management and control node from the number of heartbeat messages received from the edge computing nodes and from the information on failed edge computing nodes reported by the edge management and control node; wherein the edge management and control node, when it cannot receive an edge computing node's heartbeat messages, reports to the central management and control node that the node has failed.
Further optionally, the method of this embodiment further comprises: if information on a failed edge computing node reported by the edge management and control node is received and no heartbeat message from the failed node is received within a set time, rebuilding the failed node's containerized applications on other edge computing nodes in the target edge cluster.
In the above method embodiments, the cloud-native container technology is adopted in the edge cloud system to realize a cloud-edge convergence architecture in which dual, center-plus-edge management greatly improves edge autonomy and the service capability of edge containerized applications. Further, after the central management and control node satisfies the condition again, it resumes managing the cluster's containerized applications based on its own pre-disconnection management and control state rather than depending on the edge management and control node's state, so the two management and control nodes are loosely coupled and manage independently, making edge autonomy more flexible.
It should be noted that the steps of the methods provided by the above embodiments may all be executed by the same device, or by different devices. For example, among steps 21a to 23a, steps 21a and 23a may be executed by the central management and control node and step 22a by the edge management and control node; and so on.
In addition, some of the flows described in the above embodiments and drawings include multiple operations appearing in a particular order, but it should be clearly understood that these operations may be executed out of the order in which they appear herein or in parallel. Sequence numbers such as 21a and 22a merely distinguish the different operations and do not themselves imply any execution order. These flows may also include more or fewer operations, which may be executed sequentially or in parallel. Note that descriptions such as "first" and "second" herein distinguish different messages, devices, modules, and the like; they do not imply an order, nor do they require that the "first" and "second" be of different types.
FIG. 3a is a schematic structural diagram of an edge management and control apparatus provided by an exemplary embodiment of the present application. The apparatus may be applied in the edge management and control node of the above system, and as shown in FIG. 3a it comprises: a determination module 31a and a management and control module 32a.
The determination module 31a is configured to determine that the central management and control node in the edge cloud system does not satisfy the management and control condition for a target edge cluster, the target edge cluster being the edge cluster in the edge cloud system to which the edge management and control node belongs.
The management and control module 32a is configured to, while the central management and control node does not satisfy the management and control condition for the target edge cluster, manage and control the containerized applications in the target edge cluster so that they continue to provide services.
In an optional embodiment, the management and control module 32a is specifically configured to: upon determining that the central management and control node does not satisfy the management and control condition for the target edge cluster, obtain the first management and control state that the central management and control node had for the cluster's containerized applications before the condition ceased to be satisfied, and manage the cluster's containerized applications starting from the first management and control state.
In an optional embodiment, when obtaining the first management and control state, the management and control module 32a is configured to: determine it from the first management and control data synchronized by the edge computing nodes in the target edge cluster; wherein the first management and control data was delivered by the central management and control node to the edge computing nodes, before the condition ceased to be satisfied, for them to manage the cluster's containerized applications.
In an optional embodiment, as shown in FIG. 3a, the apparatus further comprises: a generation module 33a and a sending module 34a.
The generation module 33a is configured to, while the containerized applications in the target edge cluster are being managed, generate, according to service demand information submitted by the edge service demand side and/or the running state of the containerized applications, second management and control data needed to manage the cluster's containerized applications. The sending module 34a is configured to deliver the second management and control data to the edge computing nodes in the target edge cluster for them to perform the management operations on the containerized applications.
In an optional embodiment, the determination module 31a is specifically configured to: monitor information about the edge computing nodes in the target edge cluster that synchronize the first management and control data to the edge management and control node, and determine from that information that the central management and control node does not satisfy the condition; wherein an edge computing node synchronizes the first management and control data to the edge management and control node when its network connection to the central management and control node is broken.
In an optional embodiment, the determination module 31a is specifically configured to: send query messages to the edge computing nodes in the target edge cluster to ask whether they remain network-connected to the central management and control node; count, from the returned response messages, the number of edge computing nodes disconnected from the central management and control node; and, when that number is greater than or equal to a set count threshold, determine that the central management and control node does not satisfy the condition.
In an optional embodiment, the determination module 31a is specifically configured to: determine, upon receiving a management switchover instruction sent by the edge service demand side, that the central management and control node does not satisfy the condition; wherein the instruction is sent when the edge service demand side does not need the central management and control node to manage the target edge cluster, or upon determining that it cannot do so.
The internal functions and structure of the edge management and control apparatus have been described above. As shown in FIG. 3b, in practice the apparatus may be implemented as an edge management and control node comprising: a memory 31b, a processor 32b, and a communication component 33b.
The memory 31b is configured to store a computer program, and may be configured to store various other data to support operations on the edge management and control node. Examples of such data include instructions for any application or method operated on the edge management and control node, contact data, phone book data, messages, pictures, videos, and the like.
The processor 32b, coupled to the memory 31b, executes the computer program in the memory 31b to: determine that the central management and control node in the edge cloud system does not satisfy the management and control condition for a target edge cluster; and, while the condition is not satisfied, manage and control the containerized applications in the target edge cluster so that they continue to provide services; wherein the target edge cluster is the edge cluster in the edge cloud system to which the edge management and control node belongs.
In an optional embodiment, when managing the containerized applications in the target edge cluster, the processor 32b is specifically configured to: upon determining that the central management and control node does not satisfy the condition, obtain the first management and control state that the central management and control node had for the cluster's containerized applications before the condition ceased to be satisfied, and manage the cluster's containerized applications starting from that first state.
In an optional embodiment, when obtaining the first management and control state, the processor 32b is specifically configured to: determine it from the first management and control data synchronized by the edge computing nodes in the target edge cluster; wherein the first management and control data was delivered by the central management and control node to the edge computing nodes, before the condition ceased to be satisfied, for them to manage the cluster's containerized applications.
In an optional embodiment, the processor 32b is further configured to: while managing the cluster's containerized applications, generate, according to service demand information submitted by the edge service demand side and/or the running state of the containerized applications, second management and control data needed to manage them; and deliver the second management and control data, through the communication component 33b, to the edge computing nodes in the target edge cluster for them to perform the management operations on the containerized applications.
In an optional embodiment, when determining that the central management and control node does not satisfy the condition, the processor 32b is specifically configured to: monitor information about the edge computing nodes in the target edge cluster that synchronize the first management and control data to the edge management and control node, and determine from that information that the condition is not satisfied; wherein an edge computing node synchronizes the first management and control data when its network connection to the central management and control node is broken.
In an optional embodiment, when determining that the central management and control node does not satisfy the condition, the processor 32b is specifically configured to: send query messages to the edge computing nodes in the target edge cluster to ask whether they remain network-connected to the central management and control node; count, from the returned response messages, the number of edge computing nodes disconnected from the central management and control node; and, when that number is greater than or equal to a set count threshold, determine that the condition is not satisfied.
In an optional embodiment, when determining that the central management and control node does not satisfy the condition, the processor 32b is specifically configured to: determine, upon receiving a management switchover instruction sent by the edge service demand side, that the condition is not satisfied; wherein the instruction is sent when the edge service demand side does not need the central management and control node to manage the target edge cluster, or upon determining that it cannot do so.
Further, as shown in FIG. 3b, the edge management and control node also includes other components such as a display 34b, a power component 35b, and an audio component 36b. FIG. 3b shows only some components schematically; this does not mean the edge management and control node includes only the components shown in FIG. 3b. In addition, some components shown in FIG. 3b, such as those in dashed boxes, are optional rather than mandatory, depending on the device form of the edge management and control node.
Correspondingly, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, enables the processor to implement the steps executable by the edge management and control node in the above method embodiments.
Correspondingly, an embodiment of the present application further provides a computer program product comprising a computer program/instructions which, when executed by a processor, enable the processor to implement the steps executable by the edge management and control node in the above method embodiments.
FIG. 4a is a schematic structural diagram of another edge management and control apparatus provided by an embodiment of the present application. The apparatus may be applied in the central management and control node of the above system, and as shown in FIG. 4a it comprises: a determination module 41a and a management and control module 42a.
The determination module 41a is configured to, while the central management and control node does not satisfy the management and control condition for a target edge cluster in the edge cloud system, determine that the condition for the target edge cluster is satisfied again.
The management and control module 42a is configured to resume managing and controlling the containerized applications in the target edge cluster based on the management and control state that the central management and control node had for them before the condition ceased to be satisfied; wherein the target edge cluster is any edge cluster in the edge cloud system, and while the central management and control node does not satisfy the condition, the edge management and control node in the target edge cluster manages and controls the cluster's containerized applications.
In an optional embodiment, the management and control module 42a is specifically configured to: obtain the second management and control state established by the edge management and control node for the cluster's containerized applications; if the second state is inconsistent with the first management and control state that the central management and control node had before the condition ceased to be satisfied, roll the cluster's containerized applications back from the second state to the first; and resume managing them starting from the first state.
In an optional embodiment, when obtaining the second management and control state, the management and control module 42a is specifically configured to: determine it from the second management and control data synchronized by the edge computing nodes in the target edge cluster; wherein the second management and control data was delivered by the edge management and control node to the edge computing nodes, during its control of the cluster, for them to manage the cluster's containerized applications.
In an optional embodiment, when rolling the cluster's containerized applications back from the second state to the first, the management and control module 42a is specifically configured to: identify the applications to be rolled back, namely those that differ between the second and first management and control states; judge whether the edge computing node hosting each such application has failed; and, if not, roll the application back from the second state to the first.
Further optionally, when rolling an application back from the second state to the first, the management and control module 42a is specifically configured to: if the application is a newly added containerized application of the target edge cluster, delete it; if it is an original containerized application deleted from the cluster, rebuild it on the edge computing node from which it was deleted; and if it is an original containerized application whose resource specification changed, restore its resource specification to what it was in the first management and control state.
In an optional embodiment, the determination module 41a is specifically configured to: while the central management and control node does not satisfy the condition for the target edge cluster, count information about the edge computing nodes in the cluster that remain network-connected to the central management and control node; and determine from that information that the condition is satisfied again.
Further optionally, when determining that the condition is satisfied again, the determination module 41a is specifically configured to: if the number of edge computing nodes maintaining a heartbeat connection with the central management and control node is greater than or equal to a set count threshold, or their proportion is greater than or equal to a set ratio threshold, determine that the central management and control node satisfies the condition for the target edge cluster again.
Further optionally, the determination module 41a is also configured to: determine the number or proportion of edge computing nodes maintaining a network connection with the central management and control node from the number of heartbeat messages received from the edge computing nodes and from the information on failed edge computing nodes reported by the edge management and control node; wherein the edge management and control node, when it cannot receive an edge computing node's heartbeat messages, reports the node's failure to the central management and control node.
Further optionally, the management and control module 42a is also configured to: upon receiving information on a failed edge computing node reported by the edge management and control node and receiving no heartbeat message from the failed node within a set time, rebuild the failed node's containerized applications on other edge computing nodes in the target edge cluster.
The internal functions and structure of the edge management and control apparatus have been described above. As shown in FIG. 4b, in practice the apparatus may be implemented as a central management and control node comprising: a memory 41b, a processor 42b, and a communication component 43b.
The memory 41b is configured to store a computer program, and may be configured to store various other data to support operations on the central management and control node. Examples of such data include instructions for any application or method operated on the central management and control node, contact data, phone book data, messages, pictures, videos, and the like.
The processor 42b, coupled to the memory 41b, executes the computer program in the memory 41b to: while the central management and control node does not satisfy the management and control condition for a target edge cluster in the edge cloud system, determine that the condition is satisfied again; and resume managing and controlling the cluster's containerized applications based on the management and control state the central management and control node had for them before the condition ceased to be satisfied; wherein the target edge cluster is any edge cluster in the edge cloud system, and while the central management and control node does not satisfy the condition, the edge management and control node in the target edge cluster manages the cluster's containerized applications.
In an optional embodiment, when resuming management of the cluster's containerized applications, the processor 42b is specifically configured to: obtain the second management and control state established by the edge management and control node for the cluster's containerized applications; if the second state is inconsistent with the first management and control state that the central management and control node had before the condition ceased to be satisfied, roll the cluster's containerized applications back from the second state to the first; and resume managing them starting from the first state.
Further optionally, when obtaining the second management and control state, the processor 42b is specifically configured to: determine it from the second management and control data synchronized by the edge computing nodes in the target edge cluster; wherein the second management and control data was delivered by the edge management and control node to the edge computing nodes, during its control of the cluster, for them to manage the cluster's containerized applications.
In an optional embodiment, when rolling the cluster's containerized applications back from the second state to the first, the processor 42b is specifically configured to: identify the applications to be rolled back, namely those that differ between the second and first management and control states; judge whether the edge computing node hosting each such application has failed; and, if not, roll the application back from the second state to the first.
Further optionally, when rolling an application back from the second state to the first, the processor 42b is specifically configured to: if the application is a newly added containerized application of the target edge cluster, delete it; if it is an original containerized application deleted from the cluster, rebuild it on the edge computing node from which it was deleted; and if it is an original containerized application whose resource specification changed, restore its resource specification to what it was in the first management and control state.
Further optionally, when determining that the condition for the target edge cluster is satisfied again, the processor 42b is specifically configured to: while the condition is not satisfied, count information about the edge computing nodes in the cluster that remain network-connected to the central management and control node; and determine from that information that the condition is satisfied again.
Further optionally, when determining from that information that the condition is satisfied again, the processor 42b is specifically configured to: if the number of edge computing nodes maintaining a heartbeat connection with the central management and control node is greater than or equal to a set count threshold, or their proportion is greater than or equal to a set ratio threshold, determine that the condition is satisfied again.
Further optionally, the processor 42b is also configured to: determine the number or proportion of edge computing nodes maintaining a network connection with the central management and control node from the number of heartbeat messages received from the edge computing nodes and from the information on failed edge computing nodes reported by the edge management and control node; wherein the edge management and control node, when it cannot receive an edge computing node's heartbeat messages, reports the node's failure to the central management and control node.
Further optionally, the processor 42b is also configured to: upon receiving information on a failed edge computing node reported by the edge management and control node and receiving no heartbeat message from the failed node within a set time, rebuild the failed node's containerized applications on other edge computing nodes in the target edge cluster.
Further, as shown in FIG. 4b, the central management and control node also includes other components such as a display 44b, a power component 45b, and an audio component 46b. FIG. 4b shows only some components schematically; this does not mean the central management and control node includes only the components shown in FIG. 4b. In addition, some components shown in FIG. 4b, such as those in dashed boxes, are optional rather than mandatory, depending on the device form of the central management and control node.
Correspondingly, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, enables the processor to implement the steps executable by the central management and control node in the above method embodiments.
Correspondingly, an embodiment of the present application further provides a computer program product comprising a computer program/instructions which, when executed by a processor, enable the processor to implement the steps executable by the central management and control node in the above method embodiments.
The memory in the above embodiments may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disc.
The communication component in the above embodiments is configured to facilitate wired or wireless communication between the device containing it and other devices. The device may access a wireless network based on a communication standard, such as WiFi, a mobile communication network such as 2G, 3G, 4G/LTE, or 5G, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component further includes a near field communication (NFC) module to facilitate short-range communication; for example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The display in the above embodiments includes a screen, which may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, it may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with it.
The power component in the above embodiments supplies power to the various components of the device containing it. The power component may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device.
The audio component in the above embodiments may be configured to output and/or input audio signals. For example, the audio component includes a microphone (MIC) configured to receive external audio signals when the device containing it is in an operating mode such as a call mode, recording mode, or voice recognition mode. The received audio signals may be further stored in the memory or transmitted via the communication component. In some embodiments, the audio component further includes a speaker for outputting audio signals.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the methods, devices (systems), and computer program products according to its embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations thereof, may be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
The memory may include non-permanent memory among computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. Information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes that element.
The above are merely embodiments of the present application and are not intended to limit it. Those skilled in the art may make various changes and variations to the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.

Claims (30)

  1. An edge cloud system, characterized by comprising: a central management and control node, and at least one edge cluster network-connected to the central management and control node, each edge cluster comprising an edge management and control node and edge computing nodes on which containerized applications can be deployed;
    the edge management and control node being configured to manage and control the containerized applications in a target edge cluster to which the edge management and control node belongs while the central management and control node does not satisfy a management and control condition for the target edge cluster, so that the containerized applications continue to provide services;
    the central management and control node being configured to, after the management and control condition for the target edge cluster is satisfied again, resume managing and controlling the containerized applications in the target edge cluster based on its management and control state of those applications from before the condition ceased to be satisfied.
  2. The system according to claim 1, characterized in that the edge management and control node is specifically configured to:
    upon determining that the central management and control node does not satisfy the management and control condition for the target edge cluster, obtain a first management and control state that the central management and control node had for the containerized applications in the target edge cluster before the condition ceased to be satisfied, and manage and control the containerized applications in the target edge cluster starting from the first management and control state.
  3. 根据权利要求2所述的系统,其特征在于,所述中心管控节点还用于:
    在不满足对所述目标边缘集群的管控条件之前,根据边缘服务需求方提交的服务需求信息和/或容器化应用的运行状态,生成对所述目标边缘集群中的容器化应用进行管控所需的第一管控数据;
    将所述第一管控数据下发给所述目标边缘集群中的边缘计算节点,以供所述边缘计算节点根据所述第一管控数据执行针对容器化应用的管控操作。
  4. 根据权利要求3所述的系统,其特征在于,所述边缘计算节点还用于:将所述第一管控数据缓存于本地,并同步给所述边缘管控节点;
    所述边缘管控节点具体用于:在确定所述中心管控节点不满足对所述目标边缘集群的管控条件时,根据所述边缘计算节点同步的所述第一管控数据,确定所述中心管控节点在不满足管控条件之前对所述目标边缘集群中容器化应用的第一管控状态。
  5. The system according to claim 1, wherein the central control node is specifically configured to:
    after the control condition for the target edge cluster is satisfied again, obtain a second control state of the containerized applications in the target edge cluster as managed by the edge control node; and
    if the second control state is inconsistent with a first control state of the containerized applications in the target edge cluster from before the central control node ceased to satisfy the control condition, roll the containerized applications in the target edge cluster back from the second control state to the first control state, and resume control of the containerized applications in the target edge cluster starting from the first control state.
  6. The system according to claim 5, wherein the edge control node, while controlling the containerized applications in the target edge cluster, is specifically configured to:
    generate, according to service requirement information submitted by an edge service demander and/or running states of the containerized applications, second control data required for controlling the containerized applications in the target edge cluster; and
    deliver the second control data to the edge computing nodes in the target edge cluster, so that the edge computing nodes perform control operations on the containerized applications according to the second control data.
  7. The system according to claim 6, wherein the edge computing nodes in the target edge cluster are further configured to cache the second control data locally and synchronize it to the central control node after the central control node satisfies the control condition again; and
    the central control node is specifically configured to: after the control condition for the target edge cluster is satisfied again, determine, according to the second control data synchronized by the edge computing nodes, the second control state of the containerized applications in the target edge cluster as managed by the edge control node.
  8. The system according to claim 5, wherein the central control node is further configured to:
    if the second control state is consistent with the first control state, continue controlling the containerized applications in the target edge cluster starting from the second control state.
  9. The system according to claim 5, wherein the central control node, when rolling the containerized applications in the target edge cluster back from the second control state to the first control state, is specifically configured to:
    identify the containerized applications to be rolled back in the target edge cluster, a containerized application to be rolled back being one that differs between the second control state and the first control state of the target edge cluster; and
    determine whether the edge computing node hosting a containerized application to be rolled back has failed, and if it has not failed, roll that containerized application back from the second control state to the first control state.
  10. The system according to claim 9, wherein the central control node, when rolling a containerized application to be rolled back from the second control state to the first control state, is specifically configured to:
    if the containerized application to be rolled back is a newly added containerized application in the target edge cluster, delete the newly added containerized application;
    if the containerized application to be rolled back is an original containerized application that was deleted from the target edge cluster, rebuild the deleted original containerized application on the edge computing node from which it was deleted; and
    if the containerized application to be rolled back is an original containerized application whose resource specification changed in the target edge cluster, restore its resource specification to the one it had in the first control state.
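The rollback behavior recited in claims 9 and 10 can be illustrated with a minimal sketch. This is not part of the claimed subject matter: the function names, the dict-based state shape (app name mapped to hosting node and resource spec), and the `node_failed` callback are all hypothetical. Apps present only in the second state are deleted, apps present only in the first state are rebuilt on their original node, and apps whose resource specification diverged are restored, skipping apps whose hosting node has failed.

```python
def rollback(first_state, second_state, node_failed):
    """Plan the actions that roll containerized apps back from the second
    (edge-managed) control state to the first (centrally-managed) one.

    Each state maps app name -> {"node": hosting node, "spec": resource spec};
    node_failed(node) reports whether that edge computing node has failed.
    """
    actions = []
    # Apps present only in the second state were newly added: delete them.
    for app in second_state.keys() - first_state.keys():
        if not node_failed(second_state[app]["node"]):
            actions.append(("delete", app))
    # Apps present only in the first state were deleted under edge control:
    # rebuild them on the node they were removed from.
    for app in first_state.keys() - second_state.keys():
        if not node_failed(first_state[app]["node"]):
            actions.append(("rebuild", app, first_state[app]["node"]))
    # Apps whose resource specification changed: restore the first-state spec.
    for app in first_state.keys() & second_state.keys():
        if first_state[app]["spec"] != second_state[app]["spec"]:
            if not node_failed(second_state[app]["node"]):
                actions.append(("restore_spec", app, first_state[app]["spec"]))
    return actions
```

In practice the two control states would be reconstructed from the first and second control data cached on the edge computing nodes, as described in claims 4 and 7.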
  11. The system according to claim 3 or 6, wherein the control operations on the containerized applications include at least one of: rebuilding, upgrading, migrating, resource scale-up, resource scale-down, shutdown, restart, and release of a containerized application.
  12. The system according to any one of claims 2-10, wherein the edge control node is further configured to:
    monitor information about the edge computing nodes in the target edge cluster that synchronize the first control data to it, and determine, according to the information, that the central control node does not satisfy the control condition for the target edge cluster, the edge computing nodes synchronizing the first control data to the edge control node when their network connection to the central control node is broken;
    or
    send inquiry messages to the edge computing nodes in the target edge cluster asking whether they maintain network connections with the central control node, count, from the response messages returned by the edge computing nodes, the number of edge computing nodes disconnected from the central control node, and determine that the central control node does not satisfy the control condition for the target edge cluster when the number is greater than or equal to a set number threshold;
    or
    determine, upon receiving a control switchover instruction sent by the edge service demander, that the central control node does not satisfy the control condition for the target edge cluster, the control switchover instruction being sent when the edge service demander no longer needs the central control node to control the target edge cluster or determines that the central control node is unable to control the target edge cluster.
  13. The system according to claim 12, wherein the edge control node is specifically configured to:
    determine that the central control node does not satisfy the control condition for the target edge cluster if the number of edge computing nodes synchronizing the first control data to it is greater than or equal to a set number threshold, or the proportion of such nodes is greater than or equal to a set ratio threshold.
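The threshold test of claim 13 reduces to a small predicate. The sketch below is illustrative only; the parameter names and the default threshold values are assumptions, not values specified in the claims:

```python
def central_control_unavailable(synced_nodes, total_nodes,
                                count_threshold=3, ratio_threshold=0.5):
    """Return True when the central control node should be deemed to no
    longer satisfy the control condition: the number of edge computing
    nodes that synchronized the first control data to the edge control
    node reaches the number threshold, or their proportion of the cluster
    reaches the ratio threshold."""
    count = len(synced_nodes)
    return count >= count_threshold or count / total_nodes >= ratio_threshold
```

Either branch of the disjunction suffices, mirroring the "number threshold or ratio threshold" alternative in the claim.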
  14. The system according to any one of claims 1-10, wherein the central control node is further configured to:
    collect information about the edge computing nodes in the target edge cluster that maintain network connections with it, and determine, according to the information, that the central control node does not satisfy the control condition for the target edge cluster.
  15. The system according to claim 14, wherein the central control node is specifically configured to:
    determine that it does not satisfy the control condition for the target edge cluster if the number of edge computing nodes maintaining heartbeat connections with it is less than a set number threshold, or the proportion of such nodes is less than a set ratio threshold.
  16. The system according to claim 15, wherein the central control node is specifically configured to:
    determine the number or proportion of edge computing nodes maintaining network connections with it according to the number of heartbeat packets received from the edge computing nodes and the information on failed edge computing nodes reported by the edge control node,
    the edge control node reporting an edge computing node as failed to the central control node when it receives no heartbeat packets from that edge computing node.
  17. The system according to claim 16, wherein the central control node is further configured to:
    if it receives the information on a failed edge computing node reported by the edge control node and receives no heartbeat packets from the failed edge computing node within a set time, rebuild the containerized applications of the failed edge computing node on other edge computing nodes in the target edge cluster.
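The failure handling of claim 17 can be sketched as follows. The grace period, the `apps_on` and `schedule` callbacks, and the timestamp map are hypothetical placeholders, not part of the claims: a rebuild is planned only when no heartbeat has arrived from the reported node within the set time.

```python
import time

def handle_failure_report(failed_node, last_heartbeat, apps_on, schedule,
                          grace_seconds=60, now=None):
    """Plan rebuilds for a node the edge control node reported as failed.

    last_heartbeat maps node -> timestamp of its latest heartbeat packet;
    apps_on(node) lists the containerized apps hosted on that node;
    schedule(app) picks another edge computing node in the same cluster.
    """
    now = time.time() if now is None else now
    if now - last_heartbeat.get(failed_node, 0) < grace_seconds:
        return []  # a heartbeat arrived within the set time; no rebuild
    # No heartbeat within the grace period: rebuild the failed node's apps
    # on other edge computing nodes in the target edge cluster.
    return [(app, schedule(app)) for app in apps_on(failed_node)]
```

Waiting out the grace period before rebuilding avoids duplicating applications when the report was caused by a transient network blip rather than a real node failure.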
  18. An edge control method, applicable to an edge control node, the method comprising:
    determining that a central control node in an edge cloud system does not satisfy a control condition for a target edge cluster; and
    controlling the containerized applications in the target edge cluster while the central control node does not satisfy the control condition for the target edge cluster, so that the containerized applications continue to provide services;
    wherein the target edge cluster is the edge cluster in the edge cloud system to which the edge control node belongs.
  19. The method according to claim 18, wherein controlling the containerized applications in the target edge cluster while the central control node does not satisfy the control condition for the target edge cluster comprises:
    upon determining that the central control node does not satisfy the control condition for the target edge cluster, obtaining a first control state of the containerized applications in the target edge cluster from before the central control node ceased to satisfy the control condition; and
    controlling the containerized applications in the target edge cluster starting from the first control state.
  20. The method according to claim 19, wherein obtaining the first control state of the containerized applications in the target edge cluster from before the central control node ceased to satisfy the control condition comprises:
    determining the first control state according to first control data synchronized by edge computing nodes in the target edge cluster,
    the first control data having been delivered by the central control node to the edge computing nodes, before it ceased to satisfy the control condition, for controlling the containerized applications in the target edge cluster.
  21. The method according to claim 20, further comprising:
    while controlling the containerized applications in the target edge cluster, generating, according to service requirement information submitted by an edge service demander and/or running states of the containerized applications, second control data required for controlling the containerized applications in the target edge cluster; and
    delivering the second control data to the edge computing nodes in the target edge cluster, so that the edge computing nodes perform control operations on the containerized applications according to the second control data.
  22. The method according to claim 20 or 21, wherein determining that the central control node in the edge cloud system does not satisfy the control condition for the target edge cluster comprises:
    monitoring information about the edge computing nodes in the target edge cluster that synchronize the first control data to the edge control node, and determining, according to the information, that the central control node does not satisfy the control condition for the target edge cluster, the edge computing nodes synchronizing the first control data to the edge control node when their network connection to the central control node is broken;
    or
    sending inquiry messages to the edge computing nodes in the target edge cluster asking whether they maintain network connections with the central control node, counting, from the response messages returned by the edge computing nodes, the number of edge computing nodes disconnected from the central control node, and determining that the central control node does not satisfy the control condition for the target edge cluster when the number is greater than or equal to a set number threshold;
    or
    determining, upon receiving a control switchover instruction sent by the edge service demander, that the central control node does not satisfy the control condition for the target edge cluster, the control switchover instruction being sent when the edge service demander no longer needs the central control node to control the target edge cluster or determines that the central control node is unable to control the target edge cluster.
  23. An edge control method, applicable to a central control node, the method comprising:
    determining, during a period in which a control condition for a target edge cluster in an edge cloud system is not satisfied, that the control condition for the target edge cluster is satisfied again; and
    resuming control of the containerized applications in the target edge cluster based on the control state of those applications from before the control condition ceased to be satisfied;
    wherein the target edge cluster is any edge cluster in the edge cloud system, and while the central control node does not satisfy the control condition for the target edge cluster, the containerized applications in the target edge cluster are controlled by the edge control node in the target edge cluster.
  24. The method according to claim 23, wherein resuming control of the containerized applications in the target edge cluster based on the control state of those applications from before the control condition ceased to be satisfied comprises:
    obtaining a second control state of the containerized applications in the target edge cluster as managed by the edge control node;
    if the second control state is inconsistent with a first control state of the containerized applications in the target edge cluster from before the central control node ceased to satisfy the control condition, rolling the containerized applications in the target edge cluster back from the second control state to the first control state; and
    resuming control of the containerized applications in the target edge cluster starting from the first control state.
  25. The method according to claim 24, wherein obtaining the second control state of the containerized applications in the target edge cluster as managed by the edge control node comprises:
    determining the second control state according to second control data synchronized by edge computing nodes in the target edge cluster,
    the second control data having been delivered by the edge control node to the edge computing nodes, while it controlled the target edge cluster, for controlling the containerized applications in the target edge cluster.
  26. The method according to any one of claims 23-25, wherein determining, during the period in which the control condition for the target edge cluster in the edge cloud system is not satisfied, that the control condition for the target edge cluster is satisfied again comprises:
    collecting, during that period, information about the edge computing nodes in the target edge cluster that maintain network connections with the central control node; and
    determining, according to the information, that the central control node satisfies the control condition for the target edge cluster again.
  27. An edge control node, comprising: a memory and a processor;
    the memory being configured to store a computer program;
    the processor, coupled to the memory, being configured to execute the computer program so as to:
    determine that a central control node in an edge cloud system to which the edge control node belongs does not satisfy a control condition for a target edge cluster; and control the containerized applications in the target edge cluster while the central control node does not satisfy the control condition for the target edge cluster, so that the containerized applications continue to provide services;
    wherein the target edge cluster is the edge cluster in the edge cloud system to which the edge control node belongs.
  28. A central control node, comprising: a memory and a processor;
    the memory being configured to store a computer program;
    the processor, coupled to the memory, being configured to execute the computer program so as to:
    determine, during a period in which a control condition for a target edge cluster in an edge cloud system to which the central control node belongs is not satisfied, that the control condition for the target edge cluster is satisfied again; and resume control of the containerized applications in the target edge cluster based on the control state of those applications from before the control condition ceased to be satisfied;
    wherein the target edge cluster is any edge cluster in the edge cloud system, and while the central control node does not satisfy the control condition for the target edge cluster, the containerized applications in the target edge cluster are controlled by the edge control node in the target edge cluster.
  29. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the steps of the method according to any one of claims 18-26.
  30. A computer program product comprising a computer program/instructions which, when executed by a processor, cause the processor to implement the steps of the method according to any one of claims 18-26.
PCT/CN2022/074248 2021-02-01 2022-01-27 Edge cloud system, edge control method, control node, and storage medium WO2022161430A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110139135.6A CN113296903A (zh) 2021-02-01 2021-02-01 Edge cloud system, edge control method, control node, and storage medium
CN202110139135.6 2021-02-01

Publications (1)

Publication Number Publication Date
WO2022161430A1 true WO2022161430A1 (zh) 2022-08-04




Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22745303

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22745303

Country of ref document: EP

Kind code of ref document: A1
