KR102320317B1

KR102320317B1 - Method for selecting predict-based migration candidate and target on cloud edge

Info

Publication number: KR102320317B1
Application number: KR1020190143369A
Authority: KR
Inventors: 안재훈; 김영환
Original assignee: 한국전자기술연구원
Priority date: 2019-11-11
Filing date: 2019-11-11
Publication date: 2021-11-02
Also published as: KR20210056655A

Abstract

A cloud edge management method and a cloud edge device are provided. According to the method, when workload data is received, the failure probability of a pod is calculated using a trained prediction model, and the pod is migrated according to the calculated failure probability. Managing migration candidates in this way allows foreseeable risks to be prevented in advance, so that the immediacy of operational data and service stability can be provided continuously in a cloud edge environment.

Description

{Method for selecting predict-based migration candidate and target on cloud edge}

The present invention relates to a cloud edge management method and a cloud edge device, and more particularly, to a cloud edge management method and a cloud edge device that perform prediction-based migration in a cloud edge environment.

The content described in this section merely provides background information on the present embodiment and does not constitute prior art.

Cloud technology has recently been applied to IT services in a wide range of fields. However, when the cloud data center is physically far away, speed drops and processing time is delayed by bottlenecks.

To compensate for this, edge computing technology has been added to cloud technology to provide more capable cloud edge technology. Edge computing, also known as fog computing, is a scheme in which edges (or fogs, small cloud nodes) located at the network edge or near the user perform the main data analysis and processing, and only the processed results are transmitted to the main cloud.

However, availability support for resources is not possible in a cloud edge environment, and overload control that reflects cloud edge characteristics can only be handled as post-processing through message log analysis.

In addition, while resource analysis has been carried out for large-scale cloud environments, resource analysis for edge environments is generally not performed.

Likewise, workload management and prediction in cloud environments mostly target large-scale clouds, and workload management and prediction for edge environments are generally not carried out.

Accordingly, there is a need for a way to manage and predict workloads in such a cloud edge environment.

The present invention has been devised to solve the above problems. An object of the present invention is to provide a cloud edge management method and a cloud edge device that, when workload data is received, calculate the failure probability of a pod using a trained prediction model and migrate the pod according to the calculated failure probability.

The technical problems to be solved by the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood from the following description by those of ordinary skill in the art to which the present invention belongs.

To achieve the above object, a cloud edge management method performed by a cloud edge management device according to an embodiment of the present invention includes: calculating, when workload data is received, the failure probability of a pod using a trained prediction model; and performing migration of the pod according to the calculated failure probability.

In the migration step, when the calculated failure probability is greater than or equal to a first threshold, the data stored in the pod may be migrated to a virtual machine with more virtual resources.

In the migration step, when the calculated failure probability is less than or equal to a second threshold, the data stored in the pod may be migrated to a virtual machine with fewer virtual resources.

The method may further include training, through deep learning, a prediction model that predicts the failure probability using the workload data.

In the training step, the prediction model may be trained using a Long Short-Term Memory (LSTM) algorithm.

The workload data may include workload data of nodes and pods, workload data of virtual machines, workload data of containers, and workload data of applications.

The workload data may also include CPU usage data and memory usage data.

Meanwhile, a cloud edge management device according to an embodiment of the present invention includes: a communication unit that receives workload data of a pod; and a control unit that, when the workload data is received, calculates the failure probability of the pod using a trained prediction model and performs migration of the pod according to the calculated failure probability.

According to various embodiments of the present invention, a cloud edge management method and a cloud edge device can be provided that, when workload data is received, calculate the failure probability of a pod using a trained prediction model and migrate the pod according to the calculated failure probability. Managing migration candidates in this way allows foreseeable risks to be prevented in advance, so that the immediacy of operational data and service stability can be provided continuously in a cloud edge environment.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the following description by those of ordinary skill in the art to which the present invention belongs.

The accompanying drawings, which are included as part of the detailed description to aid understanding of the present invention, provide embodiments of the present invention and, together with the detailed description, explain its technical features.
FIG. 1 is a diagram illustrating a cloud edge environment according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating the configuration of a cloud edge according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the structure of a cloud edge management device according to an embodiment of the present invention;
FIG. 4 is a flowchart provided to explain a cloud edge management method according to an embodiment of the present invention;
FIG. 5 is a diagram schematically illustrating the process by which a cloud edge management method is performed according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating an example of predicting the failure probability of each pod in a cloud edge environment according to an embodiment of the present invention.

To clarify the features and advantages of the means for solving the problem, the present invention is described in more detail below with reference to specific embodiments shown in the accompanying drawings.

In the following description and the accompanying drawings, detailed descriptions of well-known functions or configurations that may obscure the gist of the present invention are omitted. Note also that the same components are denoted by the same reference numerals throughout the drawings wherever possible.

The terms and words used in the following description and drawings should not be construed as limited to their ordinary or dictionary meanings. They should be interpreted with meanings and concepts consistent with the technical idea of the present invention, on the principle that an inventor may appropriately define the concepts of terms in order to describe the invention in the best way.

Accordingly, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent all of its technical ideas. It should be understood that various equivalents and modifications capable of replacing them may exist at the time of filing.

Terms including ordinal numbers, such as first and second, are used to describe various components and serve only to distinguish one component from another; they do not limit the components. For example, without departing from the scope of the present invention, a second component may be referred to as a first component, and similarly a first component may be referred to as a second component.

When a component is referred to as being "connected" or "coupled" to another component, this means that it is, or can be, logically or physically connected or coupled.

In other words, a component may be directly connected or coupled to another component, but intervening components may also be present, and the connection or coupling may be indirect.

Terms such as "include" or "have" used in this specification specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and do not preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Terms such as "...unit", "...er", and "module" used in the specification denote a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software.

The terms "a" or "an", "one", "the", and similar referents, in the context of describing the invention (especially in the context of the following claims), may be used to cover both the singular and the plural unless otherwise indicated herein or clearly contradicted by context.

The present invention is described in more detail below with reference to the drawings.

FIG. 1 is a diagram illustrating a cloud edge environment according to an embodiment of the present invention.

As shown in FIG. 1, the cloud edge 10 is managed by the cloud edge management device 100. Specifically, the cloud edge management device 100 manages the resource usage status, environment data, workload data, and the like of the cloud edge 10, and accordingly performs management tasks such as migrating the data of pods in the cloud edge.

The cloud edge management device 100 may be implemented as a physically independent device, as part of another device or system, or as software such as a program, framework, or application installed on a smartphone, computer, server, or the like. Each component of the cloud edge management device 100 may likewise be implemented as a physical component or as a software function.

Cloud edge is a technology that provides a more capable cloud by adding edge computing technology to cloud technology. Edge computing, also known as fog computing, is a scheme in which edges (or fogs, nodes) located at the network edge or near the user perform the main data analysis and processing, and only the processed results are transmitted to the main cloud.

As shown in FIG. 1, the cloud edge 10 includes a plurality of nodes, and each node contains at least one pod. The structure of the cloud edge is described in more detail below with reference to FIG. 2.

In the cloud edge 10, an edge refers to a node (a server or the like) located in the network closest to the user within the IT infrastructure. That is, the cloud edge 10 configures the infrastructure so that data or content requested by a user can be delivered and processed as close to the user as possible. For example, a content delivery network is a network service that applies the cloud edge concept, and most global content services, including YouTube and Netflix, deliver content through content delivery networks.

Edge computing in the cloud edge 10 refers to a computing scheme in which data is processed by small-scale facilities at multiple points, rather than a cloud computing scheme in which data is processed centrally. As the Internet of Things (IoT) era begins in earnest, the amount of data collected centrally through various paths grows. In an IoT environment where real-time processing matters, this concentrated data must be processed without delay, but a conventional centralized cloud computing environment cannot return results quickly. To address this, the cloud edge 10 processes data directly at 'nodes', which are edge servers installed at multiple locations, and reports the results to the main cloud.

The cloud edge 10 has lower latency than conventional cloud computing. Because data is processed directly at the nearest terminal or IoT device, it can respond to situations without delay, and because it ensures fast response times it can also be used in fields such as autonomous driving.

The cloud edge 10 can also mitigate cloud security problems to some extent. In a conventional cloud, a problem in the central data center affects every web and mobile service connected to it. In the cloud edge 10, by contrast, each device handles most of the computation, so an attack on any one system does not damage the entire service; in this respect it can be considered safer.

FIG. 2 is a diagram illustrating the configuration of the cloud edge 10 according to an embodiment of the present invention. As shown in FIG. 2, the cloud edge 10 includes a plurality of nodes 200, and each node contains at least one pod 210.

The cloud edge 10 shown in FIG. 2 may be implemented using Kubernetes. In this case, the cloud edge management device 100 functions as the master that manages the entire cluster. Every command calls the API server of the cloud edge management device 100, which is the master, and the nodes 200 perform the required work while communicating with the cloud edge management device 100. Even when issuing a command to a container on a specific node 200 or querying its logs, the command is not sent to the node 200 directly; it is issued to the cloud edge management device 100, which connects to the node 200 and returns the result on its behalf.

A node 200 consists of a single server or a small cloud made up of multiple servers. The node 200 creates the required pods 210 and configures networking and storage while communicating with the cloud edge management device 100.

A pod 210 is where the actual containers are created, and pods can be scaled out to hundreds or thousands. Each pod 210 may be labeled to define its intended use (GPU-specialized, SSD server, and so on). The pod 210 is the smallest deployable unit in Kubernetes and has one or more containers 211 together with storage 213 and network 215 properties. The containers 211 belonging to a pod 210 share the storage 213 and the network 215 and can reach one another via localhost.
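The node, pod, and container hierarchy described above can be modeled roughly as follows. This is a minimal sketch for illustration only: the class names, field names, and the `pods_with_label` helper are assumptions, not part of the patent or of the Kubernetes API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Container:
    name: str


@dataclass
class Pod:
    name: str
    containers: List[Container]      # one or more containers (211)
    storage: str                     # storage (213) shared by the pod's containers
    network: str                     # network (215) shared by the pod's containers
    labels: Dict[str, str] = field(default_factory=dict)  # intended use, e.g. GPU


@dataclass
class Node:
    name: str
    pods: List[Pod] = field(default_factory=list)


def pods_with_label(nodes, key, value):
    """Return the names of all pods whose label `key` equals `value`."""
    return [p.name for n in nodes for p in n.pods if p.labels.get(key) == value]


# Hypothetical example data: one node hosting two labeled pods.
node = Node("node-1", [
    Pod("pod-a", [Container("app"), Container("sidecar")], "vol-a", "10.0.0.5",
        {"purpose": "gpu"}),
    Pod("pod-b", [Container("db")], "vol-b", "10.0.0.6", {"purpose": "ssd"}),
])
gpu_pods = pods_with_label([node], "purpose", "gpu")
```

Keeping storage and network on the pod rather than on each container mirrors the statement that containers in the same pod share both.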

The cloud edge 10 includes a plurality of nodes and pods structured in this way.

The configuration of the cloud edge management device 100 is described in more detail below with reference to FIG. 3. FIG. 3 is a diagram illustrating the structure of the cloud edge management device 100 according to an embodiment of the present invention.

As shown in FIG. 3, the cloud edge management device 100 includes a communication unit 110 and a control unit 120.

The communication unit 110 is communicatively connected to the cloud edge 10 and receives the workload data of the pods of the cloud edge 10. Here, workload data is data on the amount of virtual resources being used in the cloud edge 10, and may include CPU usage data, memory usage data, storage usage data, network load data, and the like. The workload data may include workload data of nodes and pods, workload data of virtual machines, workload data of containers, and workload data of applications.

The communication unit 110 may communicate using various wireless schemes such as Bluetooth, Wi-Fi, near-field communication (NFC), cellular, and Long-Term Evolution (LTE), and may of course also communicate over a wired connection such as a wired LAN.

The control unit 120 controls the overall operation of the cloud edge management device 100. Specifically, when workload data is received, the control unit 120 calculates the failure probability of a pod using the trained prediction model and migrates the pod according to the calculated failure probability. Here, the failure probability is a value indicating the likelihood that a failure will occur in the servers or equipment included in the data center; the higher the value, the more likely a failure is to occur within a given period of time.

The detailed operation of the control unit 120 is described below with reference to FIG. 4. FIG. 4 is a flowchart provided to explain a cloud edge management method according to an embodiment of the present invention.

First, the control unit 120 trains, through deep learning, a prediction model that predicts the failure probability of each pod using per-pod workload data.

The control unit 120 may train the prediction model using various deep learning algorithms; for example, it may use a Long Short-Term Memory (LSTM) algorithm. The control unit 120 may also train the prediction model by applying Inference Pipeline, Policy, and Training Pipeline processes. For example, the control unit 120 feeds the prediction model workload data sets from periods in which no failure occurred and workload data sets from periods in which a failure occurred, and trains the model using a deep learning algorithm.
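To make the LSTM reference concrete, the following sketch runs the forward pass of a single standard LSTM cell over a short workload sequence and maps the final hidden state to a value between 0 and 1. It is an illustration under several assumptions: the patent does not specify the network architecture, the two-feature input (CPU and memory utilization), the hidden size, or the sigmoid readout, and the weights here are random and untrained. Training itself (backpropagation on labeled failure and non-failure windows) would normally be done with a deep learning framework and is not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, W, b):
    """One LSTM time step: returns the next hidden state and cell state.

    W[g] and b[g] hold the weights and biases of gate g, where g is one of
    "i" (input), "f" (forget), "o" (output), "g" (candidate cell value).
    """
    z = x + h  # concatenate the input with the previous hidden state

    def gate(name, activation):
        return [activation(sum(w * v for w, v in zip(row, z)) + bias)
                for row, bias in zip(W[name], b[name])]

    i = gate("i", sigmoid)
    f = gate("f", sigmoid)
    o = gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_next = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_next = [ok * math.tanh(ck) for ok, ck in zip(o, c_next)]
    return h_next, c_next


def failure_probability(sequence, W, b, w_out, b_out):
    """Run the cell over a sequence and squash the last hidden state to (0, 1)."""
    hidden = len(w_out)
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in sequence:
        h, c = lstm_step(x, h, c, W, b)
    return sigmoid(sum(w * v for w, v in zip(w_out, h)) + b_out)


# Hypothetical, untrained parameters: 2 input features, 4 hidden units.
rng = random.Random(0)
n_in, hidden = 2, 4
W = {g: [[rng.uniform(-0.5, 0.5) for _ in range(n_in + hidden)]
         for _ in range(hidden)] for g in "ifog"}
b = {g: [0.0] * hidden for g in "ifog"}
w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]

# Made-up window of (CPU utilization, memory utilization) samples for one pod.
window = [[0.42, 0.35], [0.55, 0.40], [0.91, 0.88]]
p = failure_probability(window, W, b, w_out, 0.0)
```

With untrained weights the output has no predictive meaning; the point is only to show how a time-ordered workload window is reduced to a single per-pod score.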

Once training is complete, the control unit 120 uses the trained prediction model. Specifically, when workload data is received, the control unit 120 calculates the failure probability of each pod using the trained prediction model (S420). The prediction model is a deep learning model that, given workload data for each pod, calculates and outputs the failure probability of each pod.

The control unit 120 then decides, according to the calculated failure probability, whether to migrate the pod, and performs the migration when a preset condition is satisfied (S430). Specifically, when the calculated failure probability is greater than or equal to the first threshold, the control unit 120 may migrate the data stored in the pod to a virtual machine with more virtual resources. When the calculated failure probability is less than or equal to the second threshold, the control unit 120 may migrate the data stored in the pod to a virtual machine with fewer virtual resources.

For example, when the failure probability is at or above a first threshold of 80%, the control unit 120 may migrate the data stored in the pod to a virtual machine with at least twice the virtual resources (CPU, memory, storage, and so on). When the failure probability is at or below a second threshold of 20%, the control unit 120 may migrate the data stored in the pod to a virtual machine with half or less of the virtual resources (CPU, memory, storage, and so on). The first and second thresholds can be set by the user according to policy and can be changed. The control unit 120 may also migrate pods of the cloud edge 10 according to the failure probability in various other ways.
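The example thresholds and resource ratios above could drive target selection along the following lines. This is a sketch under stated assumptions: the `VM` type, the `pick_target` function, and the tie-breaking rule (smallest qualifying machine when scaling up, largest qualifying machine when scaling down) are hypothetical and not taken from the patent, and only CPU and memory are compared for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    cpu_cores: float
    memory_gb: float


def pick_target(current, candidates, probability, first=0.80, second=0.20):
    """Return the VM to migrate to, or None when no migration applies."""
    size = lambda vm: (vm.cpu_cores, vm.memory_gb)
    if probability >= first:
        # Scale up: require at least twice the current resources.
        fits = [vm for vm in candidates
                if vm.cpu_cores >= 2 * current.cpu_cores
                and vm.memory_gb >= 2 * current.memory_gb]
        return min(fits, key=size, default=None)  # assumed: smallest that fits
    if probability <= second:
        # Scale down: require half the current resources or less.
        fits = [vm for vm in candidates
                if vm.cpu_cores <= current.cpu_cores / 2
                and vm.memory_gb <= current.memory_gb / 2]
        return max(fits, key=size, default=None)  # assumed: largest that fits
    return None  # between the thresholds: leave the pod where it is


# Hypothetical inventory of virtual machines.
current = VM("vm-m", 4, 8)
candidates = [VM("vm-s", 2, 4), VM("vm-l", 8, 16), VM("vm-xl", 16, 32)]
up = pick_target(current, candidates, 0.90)
down = pick_target(current, candidates, 0.10)
stay = pick_target(current, candidates, 0.50)
```

Because the thresholds are keyword parameters, a user-defined policy can change them without touching the selection logic, in line with the statement that both values are configurable.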

Through this process, the cloud edge management device 100 learns and predicts the failure probability and migrates the data of pods in the cloud edge accordingly. This minimizes failure management risk even in a cloud edge environment, and managing migration candidates allows foreseeable risks to be prevented in advance, so that the immediacy of operational data and service stability can be provided continuously in a cloud edge environment.

FIG. 5 is a diagram schematically illustrating the process by which a cloud edge management method is performed according to an embodiment of the present invention.

As shown in FIG. 5, the workload data 510 may include workload data of nodes and pods, workload data of virtual machines, workload data of containers, and workload data of applications. The workload data of nodes and pods indicates the CPU usage, memory usage, storage usage, network usage, and the like of the pod and of the node that hosts it. The workload data of containers indicates the CPU usage, memory usage, storage usage, network usage, and the like of each container included in the pod. The workload data of virtual machines indicates the CPU usage, memory usage, storage usage, network usage, and the like of each virtual machine included in the pod. The workload data of applications indicates the CPU usage, memory usage, storage usage, network usage, and the like of each application included in the pod.
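One way to picture the workload data 510 is as a usage record per scope that is flattened into a fixed-order feature vector before being passed to the prediction model. The sketch below assumes one aggregated record per scope and hypothetical names (`Usage`, `SCOPES`, `to_features`); the patent does not define a data format.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Usage:
    cpu: float       # CPU usage
    memory: float    # memory usage
    storage: float   # storage usage
    network: float   # network usage


# Scopes named in the description, in an assumed fixed order.
SCOPES = ("node", "pod", "vm", "container", "application")


def to_features(sample):
    """Flatten a {scope: Usage} mapping into one feature vector."""
    features = []
    for scope in SCOPES:
        features.extend(astuple(sample[scope]))
    return features


# Made-up utilization values (fractions of capacity) for a single pod.
sample = {
    "node": Usage(0.70, 0.65, 0.40, 0.30),
    "pod": Usage(0.55, 0.60, 0.20, 0.25),
    "vm": Usage(0.50, 0.58, 0.22, 0.20),
    "container": Usage(0.45, 0.52, 0.10, 0.15),
    "application": Usage(0.40, 0.50, 0.05, 0.12),
}
vector = to_features(sample)
```

A fixed scope order keeps every sample the same length, which is what a sequence model expects at each time step.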

The workload data 510 is input to the cloud edge device 100. The cloud edge device 100 may then train the predictive model using a Long Short-Term Memory (LSTM) algorithm and, more specifically, may train the predictive model by applying Inference Pipeline, Policy, and Training Pipeline processes.
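An LSTM consumes sequences, so the workload samples would typically be cut into fixed-length windows before training. The specification does not describe this preprocessing; the sketch below is one common way to do it, under the assumption that each window of consecutive samples is paired with a 0/1 failure label observed at the step immediately after the window. The function name `make_windows`, the two-feature samples, and the label convention are all hypothetical.

```python
from typing import List, Sequence, Tuple

# (window of feature vectors, failure label at the following time step)
Window = Tuple[List[List[float]], int]


def make_windows(
    features: Sequence[List[float]],
    failed: Sequence[int],
    window: int,
) -> List[Window]:
    """Slice a workload time series into (history, next-step label) pairs."""
    if len(features) != len(failed):
        raise ValueError("features and failed must have the same length")
    if window < 1:
        raise ValueError("window must be at least 1")
    samples: List[Window] = []
    for start in range(len(features) - window):
        history = [list(step) for step in features[start:start + window]]
        samples.append((history, failed[start + window]))
    return samples


# Made-up series: [cpu %, memory %] per time step, with a failure at the end.
features = [[10.0, 20.0], [30.0, 35.0], [60.0, 55.0], [85.0, 80.0], [97.0, 95.0]]
failed = [0, 0, 0, 0, 1]
samples = make_windows(features, failed, window=3)
print(len(samples))
```

Each resulting pair could then be fed to an LSTM implementation of choice; the model itself is deliberately left out here because the specification does not fix its architecture.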

The cloud edge device 100 may also display the probability of failure as per-pod gauges 520 and 530.

In this way, the cloud edge device 100 can present the probability of failure to the user more intuitively.
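The specification does not say how the gauges 520 and 530 are drawn. As a rough illustration only, a per-pod gauge could be rendered as a text bar whose filled portion is proportional to the predicted probability of failure; the function name `render_gauge`, the 20-character width, and the output layout below are assumptions.

```python
def render_gauge(pod: str, failure_pct: int, width: int = 20) -> str:
    """Render one pod's probability of failure (0-100) as a text gauge."""
    if not 0 <= failure_pct <= 100:
        raise ValueError("failure_pct must be between 0 and 100")
    filled = failure_pct * width // 100          # number of filled cells
    bar = "#" * filled + "-" * (width - filled)  # filled part, then empty part
    return f"{pod:<6}[{bar}] {failure_pct:>3}%"


for pod, pct in [("Pod11", 90), ("Pod13", 10)]:
    print(render_gauge(pod, pct))
```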

FIG. 6 is a diagram illustrating an example of predicting the probability of failure for each pod in a cloud edge environment, according to an embodiment of the present invention.

FIG. 6 shows a cloud edge 10 environment that includes two nodes (Node1 and Node2). Node1 includes four pods (Pod11, Pod12, Pod13, Pod14), and Node2 includes three pods (Pod21, Pod22, Pod23).

In FIG. 6, the probability of failure is 90% for Pod11, 55% for Pod12, 10% for Pod13, 75% for Pod14, 50% for Pod21, 80% for Pod22, and 15% for Pod23. That is, when the first threshold is 80% and the second threshold is 20%, Pod11 and Pod22 become targets for migration to a larger virtual machine, and Pod13 and Pod23 become targets for migration to a smaller virtual machine.

Accordingly, the cloud edge device 100 migrates Pod11 and Pod22 to virtual machines with more virtual resources, and migrates Pod13 and Pod23 to virtual machines with fewer virtual resources.

In this way, the cloud edge device 100 migrates pods that are likely to fail because of insufficient resources to virtual machines with more virtual resources, and migrates pods that are unlikely to fail because they have ample resources to virtual machines with fewer virtual resources. As a result, the virtual resources of the cloud edge are used efficiently, and the probability of failure is minimized on a predictive basis.
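The threshold rule applied to the FIG. 6 example can be sketched as follows. The probabilities and the 80% and 20% thresholds are the ones given above; the function name `decide`, the action labels, and the use of a strict "less than" comparison for the second threshold are assumptions (the example values do not distinguish "less than" from "less than or equal to").

```python
from typing import Dict, List

FIRST_THRESHOLD = 80   # percent; at or above this, move to a larger VM
SECOND_THRESHOLD = 20  # percent; below this, move to a smaller VM (assumed strict)


def decide(failure_pct: int) -> str:
    """Map a predicted probability of failure to a migration action."""
    if failure_pct >= FIRST_THRESHOLD:
        return "larger_vm"
    if failure_pct < SECOND_THRESHOLD:
        return "smaller_vm"
    return "stay"


# Predicted probabilities of failure from the FIG. 6 example, in percent.
failure_pct: Dict[str, int] = {
    "Pod11": 90, "Pod12": 55, "Pod13": 10, "Pod14": 75,
    "Pod21": 50, "Pod22": 80, "Pod23": 15,
}

plan = {pod: decide(pct) for pod, pct in failure_pct.items()}
to_larger: List[str] = [pod for pod, action in plan.items() if action == "larger_vm"]
to_smaller: List[str] = [pod for pod, action in plan.items() if action == "smaller_vm"]

print("larger VM:", to_larger)
print("smaller VM:", to_smaller)
```

With these inputs the sketch selects Pod11 and Pod22 for a larger virtual machine and Pod13 and Pod23 for a smaller one, matching the example; the remaining pods are left where they are.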

Meanwhile, the technical idea of the present invention may of course also be applied to a computer-readable recording medium containing a computer program that performs the functions and method of the device according to the present embodiment. The technical ideas according to various embodiments of the present invention may also be implemented in the form of computer-readable programming language code recorded on a computer-readable recording medium. The computer-readable recording medium may be any data storage device that can be read by a computer and can store data. For example, the computer-readable recording medium may be a ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical disk, hard disk drive, flash memory, solid state disk (SSD), or the like. In addition, computer-readable code or a program stored in the computer-readable recording medium may be transmitted over a network connecting computers.

Although this specification and the drawings describe exemplary device configurations, implementations of the functional operations and subject matter described herein may be realized in other types of digital electronic circuitry, in computer software, firmware, or hardware including the structures disclosed herein and their structural equivalents, or in a combination of one or more of these.

Accordingly, although the present invention has been described in detail with reference to the above examples, a person of ordinary skill in the art to which the present invention pertains may make alterations, changes, and modifications to these examples without departing from the scope of the present invention.

In addition, although preferred embodiments of the present invention have been illustrated and described above, the present invention is not limited to the specific embodiments described. Various modifications may of course be made by a person of ordinary skill in the art to which the invention pertains without departing from the gist of the present invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or outlook of the present invention.

10: cloud edge
100: cloud edge management device
110: communication unit
120: control unit

Claims

1. A cloud edge management method performed by a cloud edge management device, the method comprising:
calculating, when workload data is received, a probability of failure of a pod using a trained predictive model; and
performing migration for the pod according to the calculated probability of failure,
wherein the performing of the migration comprises:
migrating the data stored in the pod to a virtual machine with more virtual resources when the calculated probability of failure is greater than or equal to a first threshold, and migrating the data stored in the pod to a virtual machine with fewer virtual resources when the calculated probability of failure is less than a second threshold, and
wherein the calculated probability of failure is displayed as a gauge for each pod.

(Deleted)

4. The method according to claim 1,
further comprising training, through deep learning, a predictive model that predicts the probability of failure using workload data,
wherein the calculating comprises
calculating the probability of failure of the pod using the predictive model trained through the deep learning.

5. The method according to claim 4,
wherein the training comprises
training the predictive model using a Long Short-Term Memory (LSTM) algorithm.

6. The method according to claim 1,
wherein the workload data
comprises workload data of nodes and pods, workload data of virtual machines, workload data of containers, and workload data of applications.

7. The method according to claim 6,
wherein the workload data
comprises CPU usage data and memory usage data.

A cloud edge management device comprising:
a communication unit that receives workload data of a pod; and
a control unit that, when the workload data is received, calculates a probability of failure of the pod using a trained predictive model and performs migration for the pod according to the calculated probability of failure,
wherein the control unit
migrates the data stored in the pod to a virtual machine with more virtual resources when the calculated probability of failure is greater than or equal to a first threshold, and migrates the data stored in the pod to a virtual machine with fewer virtual resources when the calculated probability of failure is less than a second threshold, and
displays the calculated probability of failure as a gauge for each pod.