KR20230008321A

KR20230008321A - Method and system for optimal partial offloading in wireless edge computing network

Info

Publication number: KR20230008321A
Application number: KR1020210088829A
Authority: KR
Inventors: 조성래; 이충현; 슈메이에 라코우 데메케
Original assignee: 중앙대학교 산학협력단
Priority date: 2021-07-07
Filing date: 2021-07-07
Publication date: 2023-01-16
Also published as: KR102589554B1

Abstract

The present invention relates to a method for optimal partial offloading in a wireless edge computing-based network comprising a second base station located within the coverage of a first base station and a server connected to the first base station. The method comprises the steps of: calculating offloading data for offloading a task generated by at least one user terminal connected to the second base station; calculating the latency and energy consumption required to offload the offloading data; and processing the offloading data based on the calculated latency and energy consumption. The method can thereby schedule task offloading and resource allocation between the wireless backhaul and the wireless fronthaul in a mobile edge computing-based network.

Description

Optimal partial offloading method and system for offloading in wireless edge computing based network

The present invention relates to an optimal partial offloading method, and a system for offloading, in a wireless edge computing-based network, and more particularly to such a method and system for minimizing the overhead calculated in terms of latency and energy consumption.

The explosion of mobile data is expected to impose critical requirements, such as high reliability and low latency, for achieving optimal performance, and to introduce further challenges to various aspects of future wireless networks.

The sharp increase in intelligent mobile devices (e.g., smart watches, smartphones, and virtual reality glasses) has led to the emergence and growing popularity of various computation-intensive applications such as interactive online gaming, natural language processing, augmented reality, and facial recognition.

However, mobile devices have limited computational resources, such as battery energy, memory size, and processing speed, which limits their ability to achieve an acceptable quality of experience (QoE) and quality of service (QoS).

Therefore, wireless networks (5G and beyond) are needed to support both timely wireless access and the provisioning of computation offloading for user terminals, and small-cell communication is widely known as one of the 5G network technologies for meeting these wireless access requirements.

Related prior art includes Korean Patent Application Publication No. 10-2020-0017589 (title of invention: Cloud server for offloading tasks of mobile nodes in a wireless communication system and operating method thereof; publication date: February 19, 2020).

Accordingly, an object of an embodiment of the present invention is to provide an optimal partial offloading method, and a system for offloading, in a wireless edge computing-based network, for scheduling task offloading and resource allocation between the wireless backhaul and the wireless fronthaul in a mobile edge computing-based network.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the description below.

To achieve the above object, a partial offloading method according to an embodiment of the present invention, performed in a wireless edge computing-based network including at least one second base station located within the coverage of a first base station and a server connected to the first base station, includes: calculating offloading data for offloading a task generated by at least one user terminal connected to the second base station; calculating the latency and energy consumption required to offload the offloading data; and processing the offloading data based on the calculated latency and energy consumption.

In one embodiment of the present invention, the first base station and the second base station may interwork through a wireless backhaul, the second base station and the at least one user terminal may interwork through a wireless fronthaul, and the wireless backhaul and the wireless fronthaul may share a frequency bandwidth.

In one embodiment of the present invention, the calculating may include calculating the offloading data for offloading the task based on a variable related to the offloading rate, the variable being set according to the channel state between the first base station and the second base station and the channel state between the second base station and the user terminal.

In one embodiment of the present invention, the calculating may include, when the variable related to the offloading rate has a value between 0 and 1, calculating first data, which is the part of the task to be offloaded, and second data, which is the remainder of the task to be processed in the user terminal.

In one embodiment of the present invention, the computing may include computing the latency based on Equation 1 below, which represents the latency required to process the first data and the second data, and computing the energy consumption based on Equation 2 below, which represents the energy consumption required to process the first data and the second data.

[Equation 1]

[Equation 2]

Here, $y^{t}_{ij}$ denotes the total latency required to offload the offloading data, $y^{t}_{ij,l}$ the latency required to process the second data, $y^{t}_{ij,r}$ the latency required to process the first data, $\varepsilon^{t}_{ij}$ the total energy consumption required to offload the offloading data, $e^{t}_{ij,l}$ the energy consumption required to process the second data, and $e^{t}_{ij,r}$ the energy consumption required to process the first data.

The optimal partial offloading method in a wireless edge computing-based network according to an embodiment of the present invention may further include optimizing the offloading by applying the calculated latency and energy consumption to a pre-trained learning algorithm, wherein the learning algorithm applies reinforcement learning to a Markov decision process composed of a state space, an action space, and a utility function of the offloading environment.

In one embodiment of the present invention, the state space includes a set of the channel state between the first base station and the second base station, the channel state between the second base station and the user terminal, the interference between the cells covered by the first base station and the second base station, and the state of the buffer in the server in which offloaded tasks are stored; the action space includes a set of the variable related to the offloading rate and a variable related to the offloading bandwidth allocation; and the utility function may be expressed by Equation 3 below.

[Equation 3]

Here, $X^{t}$ denotes the state space, $a^{t}$ the action space, $\varepsilon^{t}_{ij}$ the total energy consumption required to offload the offloading data, $y^{t}_{ij}$ the total latency required to offload the offloading data, $\lambda$ a weight variable related to task drops, $\Gamma^{t}$ a weight variable related to the latency and energy consumption of the user terminal, and $r^{t}_{jm}$ the data rate of the wireless backhaul linking the first base station and the second base station.
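As an illustration of the state and action spaces listed above, the following is a minimal Python sketch of containers an agent could use. All class and field names are hypothetical, the use of one scalar per quantity is a simplifying assumption, and the restriction of both action variables to the interval [0, 1] is assumed here for illustration. The utility function of Equation 3 is not reproduced, so it is deliberately left out.

```python
from __future__ import annotations

from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class OffloadState:
    """One observation of the offloading environment (hypothetical field names)."""
    backhaul_channel_state: float    # first base station <-> second base station
    fronthaul_channel_state: float   # second base station <-> user terminal
    inter_cell_interference: float   # interference between the covered cells
    server_buffer_state: float       # occupancy of the server's task buffer


@dataclass(frozen=True)
class OffloadAction:
    """One action: offloading-rate variable and bandwidth-allocation variable."""
    offloading_rate: float       # assumed to lie in [0, 1]
    bandwidth_allocation: float  # assumed to lie in [0, 1]

    def __post_init__(self) -> None:
        # Reject values outside the assumed [0, 1] range.
        for name in ("offloading_rate", "bandwidth_allocation"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")


def to_vector(obj: OffloadState | OffloadAction) -> list[float]:
    """Flatten a state or action into a plain list, e.g. as network input."""
    return [float(v) for v in astuple(obj)]
```

Keeping the state and action as small immutable records makes it straightforward to flatten them into the vectors that an actor or critic network would consume.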

In one embodiment of the present invention, the reinforcement learning may use an actor-critic based Deep Deterministic Policy Gradient (DDPG).

A system for optimal partial offloading in a wireless edge computing-based network according to an embodiment of the present invention may include a first base station, a second base station located within the coverage of the first base station, at least one user terminal connected to the second base station, and a server for processing tasks offloaded from the user terminal.

According to an embodiment of the present invention, task offloading and resource allocation between the wireless backhaul and the wireless fronthaul can be scheduled in a mobile edge computing-based network.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

FIG. 1 is a conceptual diagram illustrating a system for partial offloading in a wireless edge computing-based network according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a partial offloading method in a wireless edge computing-based network according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating a framework of DDPG learning according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating execution code for processing partial offloading to which DDPG learning is applied, according to an embodiment of the present invention.
FIG. 5A is a graph comparing the overhead due to latency according to an embodiment of the present invention with that of comparative examples.
FIG. 5B is a graph comparing the overhead due to energy consumption according to an embodiment of the present invention with that of comparative examples.
FIG. 6 is a graph comparing the average task drop rate of user terminals according to an embodiment of the present invention with that of comparative examples.
FIG. 7 is a graph comparing the total system overhead versus the number of user terminals according to an embodiment of the present invention with that of comparative examples.
FIG. 8 is a graph comparing the total system overhead versus task size according to an embodiment of the present invention with that of comparative examples.

Since the present invention may be modified in various ways and may have various embodiments, specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. Like reference numerals are used for like elements throughout the description of the drawings.

When an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to that element, but it should be understood that intervening elements may be present. In contrast, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that no intervening elements are present.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, elements, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and, unless explicitly defined in this application, should not be interpreted in an idealized or overly formal sense.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a conceptual diagram illustrating a system for partial offloading in a wireless edge computing-based network according to an embodiment of the present invention.

Before the description, it is noted that the network constituting the system of the present invention is a fifth-generation (5G) mobile edge computing (MEC) network, which provides cloud computing capabilities and an IT service environment at the edge of the network.

In such a 5G edge computing network, task allocation must be optimized so that the tasks requested by many user terminals can be processed quickly in real time, and task offloading techniques that minimize latency and energy consumption are therefore becoming important.

Accordingly, the present invention proposes an offloading technique that is performed partially in order to optimize task allocation.

Referring to FIG. 1, a system for partial offloading in a wireless edge computing-based network according to an embodiment of the present invention includes a first base station, a second base station, a user terminal, and a server.

The first base station may be an electronic device that provides wireless terminals with access to the network in 5G cellular communication. The first base station may manage a coverage cell of a predetermined size, which may overlap with at least one small cell, that is, a coverage cell of a smaller size. For example, the first base station may be a gNodeB (gNB) serving as a central base station.

The second base station is a base station located within the coverage of the first base station and may manage a small cell. For example, the second base station may be a small cell base station (SBS).

The user terminal may be connected to the second base station. One or more user terminals may be connected to the second base station, and each user terminal may communicate wirelessly through the coverage band of the small cell managed by the second base station to which it is connected. For example, the user terminal may be a user equipment (UE).

In one embodiment, the first base station and the second base station may interwork through a wireless backhaul link, and the second base station and the at least one user terminal may interwork through a wireless fronthaul link.

In this case, the wireless backhaul link and the wireless fronthaul link may share a frequency bandwidth. In this embodiment, a bandwidth sharing coefficient $k^{t}_{jm}$ (where $0 \le k^{t}_{jm} \le 1$) may be applied. Specifically, $k^{t}_{jm}$ may denote the share of the total bandwidth allocated to the wireless backhaul link between the first base station and the second base station, and $(1 - k^{t}_{jm})$ may denote the share of the total bandwidth allocated to the wireless fronthaul link between the second base station and the user terminal.
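The bandwidth sharing described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the coefficient is interpreted as a fraction of one shared band; the function name and the parameter `total_bandwidth_hz` are hypothetical and do not come from the specification.

```python
from __future__ import annotations


def split_bandwidth(total_bandwidth_hz: float, k: float) -> tuple[float, float]:
    """Split a shared band between backhaul and fronthaul.

    k is the bandwidth sharing coefficient (0 <= k <= 1): the backhaul link
    receives the fraction k, and the fronthaul link receives the rest.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError(f"k must lie in [0, 1], got {k}")
    backhaul_hz = k * total_bandwidth_hz
    fronthaul_hz = (1.0 - k) * total_bandwidth_hz
    return backhaul_hz, fronthaul_hz


if __name__ == "__main__":
    backhaul, fronthaul = split_bandwidth(20e6, 0.25)
    print(f"backhaul: {backhaul:.0f} Hz, fronthaul: {fronthaul:.0f} Hz")
```

Because the two shares always sum to the total, raising the backhaul share necessarily lowers the fronthaul share, which is why the coefficient is a natural candidate for joint optimization.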

The server is a central server connected to the first base station and may communicate with the first base station, with a second base station capable of interworking with the first base station, and with a user terminal capable of interworking with the first base station or the second base station. For example, the server may be a mobile edge computing (MEC) server.

The server may receive, via the second base station, a task offloaded from the user terminal and process it. To this end, the server may be provided with a buffer for storing tasks offloaded from user terminals.

Hereinafter, a partial offloading method performed on the system having the above structure will be described with reference to FIG. 2.

FIG. 2 is a flowchart illustrating a partial offloading method in a wireless edge computing-based network according to an embodiment of the present invention.

For reference, the partial offloading method described below is described as being performed by the user terminal, but it is not limited thereto and may be performed by a separate device capable of communicating with the user terminal. In addition, for convenience of explanation, the description assumes that the server is the only target to which tasks are offloaded.

Referring to FIG. 2, in step S100, offloading data for offloading a task generated in at least one user terminal connected to the second base station may be calculated.

In the user terminal, a task whose size, expressed in bits, is [symbol] may be generated.

Then, in step S100, the offloading data for offloading the task may be calculated based on a variable related to the offloading rate, which is set according to the channel state between the first base station and the second base station and the channel state between the second base station and the user terminal. That is, the offloading data for offloading the task generated in the user terminal to the server may be calculated according to a preset variable related to the offloading rate that takes into account the channel states of the wireless backhaul link and the wireless fronthaul link.

In this case, only part of the task generated in the user terminal may be offloaded, with the remainder processed in the user terminal. In other words, the task may be partially offloaded according to the variable related to the offloading rate, which depends on each channel state.

To this end, in step S100, when the variable related to the offloading rate has a value between 0 and 1, first data, which is the part of the task to be offloaded, and second data, which is the remainder of the task to be processed in the user terminal, may be calculated. Accordingly, the first data is offloaded while the second data is processed in the user terminal. For reference, the first data and the second data may be denoted by [symbol] and [symbol], respectively, and the sum of the sizes of the two may be equal to the size of the generated task.

Of course, in addition to the data for partial offloading, the offloading data may further include data for offloading the entire generated task (third data) and data for processing the entire generated task in the user terminal (fourth data).

In this regard, in step S100, when the variable related to the offloading rate has a value of 1, third data for offloading the entire generated task may be calculated, and when the variable has a value of 0, fourth data for processing the entire generated task in the user terminal may be calculated. Accordingly, the third data is offloaded to the server, whereas the fourth data is processed in the user terminal. The third data and the fourth data have the same task size and are distinguished by whether or not they are offloaded.
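The three cases of step S100 (no offloading, partial offloading, full offloading) can be sketched as follows. This is a minimal illustration that assumes the variable related to the offloading rate acts as a simple proportion of the task size; the specification states only that the variable lies between 0 and 1 and that the two parts sum to the task size. The names `TaskSplit`, `split_task`, and `alpha` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSplit:
    offloaded_bits: float  # first data (third data when the whole task is offloaded)
    local_bits: float      # second data (fourth data when nothing is offloaded)
    mode: str              # "local", "partial", or "full"


def split_task(task_bits: float, alpha: float) -> TaskSplit:
    """Split a task according to the offloading-rate variable alpha in [0, 1].

    Assumption: the offloaded part is alpha * task_bits. The remainder is
    computed by subtraction so that the two parts always sum to task_bits.
    """
    if task_bits < 0:
        raise ValueError("task_bits must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError(f"alpha must lie in [0, 1], got {alpha}")
    offloaded = alpha * task_bits
    local = task_bits - offloaded
    if alpha == 0.0:
        mode = "local"    # whole task processed in the user terminal
    elif alpha == 1.0:
        mode = "full"     # whole task offloaded to the server
    else:
        mode = "partial"  # first data offloaded, second data kept locally
    return TaskSplit(offloaded_bits=offloaded, local_bits=local, mode=mode)


if __name__ == "__main__":
    print(split_task(1000.0, 0.25))
```

Computing the local part by subtraction, rather than as `(1 - alpha) * task_bits`, keeps the stated invariant (the parts sum to the task size) free of rounding drift.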

Next, in step S200, the latency and energy consumption required to offload the offloading data may be computed.

The latency is a total latency and may include the latency incurred when the task is offloaded and the latency incurred when the task is processed in the user terminal. That is, the latency may include the latency required to process the first data and the latency required to process the second data.

In this regard, in step S200, the total latency may be computed based on Equation 1 below.

[Equation 1]

Here, $Y^{t}_{ij}$ denotes the total latency required to offload the offloading data, $y^{t}_{ij,l}$ the latency required to process the second data, and $y^{t}_{ij,r}$ the latency required to process the first data.

In one embodiment, the latency required to process the first data may include four components: the transmission time of the data offloaded from the user terminal to the second base station ($y^{t}_{ij}$), the transmission time of the data offloaded from the second base station to the first base station ($y^{t}_{jm,i}$), the queuing delay incurred at the server ($y^{t}_{ij,q}$), and the time for the server to process the offloaded data ($y^{t}_{ij,r}$).

When computing the latency, the data rate of the wireless fronthaul link preferably satisfies the condition of being less than or equal to the data rate of the wireless backhaul link.
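The four latency components and the rate condition above can be illustrated with a short Python sketch. It is an assumption-laden illustration and not the patent's Equations 1a to 1e, which are not reproduced in this text: it assumes the components simply add, that each transmission time is bits divided by link rate, and that the server processing time is CPU cycles divided by CPU frequency. All parameter names are hypothetical, and the preferred rate condition is treated here as a hard check.

```python
def remote_latency_s(
    offloaded_bits: float,
    fronthaul_rate_bps: float,  # user terminal -> second base station
    backhaul_rate_bps: float,   # second base station -> first base station
    queue_delay_s: float,       # queuing delay at the server
    cycles_per_bit: float,      # assumed processing density of the task
    server_cpu_hz: float,       # assumed server CPU frequency
) -> float:
    """Latency to process the offloaded (first) data, as a sum of four parts."""
    if fronthaul_rate_bps <= 0 or backhaul_rate_bps <= 0 or server_cpu_hz <= 0:
        raise ValueError("rates and CPU frequency must be positive")
    # Condition from the text: fronthaul rate should not exceed backhaul rate.
    if fronthaul_rate_bps > backhaul_rate_bps:
        raise ValueError("fronthaul rate must not exceed backhaul rate")
    t_fronthaul = offloaded_bits / fronthaul_rate_bps
    t_backhaul = offloaded_bits / backhaul_rate_bps
    t_execute = offloaded_bits * cycles_per_bit / server_cpu_hz
    return t_fronthaul + t_backhaul + queue_delay_s + t_execute


if __name__ == "__main__":
    latency = remote_latency_s(1e6, 1e7, 2e7, 0.01, 100.0, 1e10)
    print(f"remote latency: {latency:.3f} s")
```

With these example numbers the four parts are 0.1 s, 0.05 s, 0.01 s, and 0.01 s, so the fronthaul hop dominates; this is the kind of imbalance that adjusting the bandwidth split between the two links is meant to address.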

The latency $y^{t}_{ij,r}$ required to process the first data may be expressed by the following equations.

[Equation 1a]

[Equation 1b]

[Equation 1c]

[Equation 1d]

[Equation 1e]

In one embodiment, the latency $y^{t}_{ij,l}$ required to process the second data may be expressed by the following equation.

[Equation 1b]

The energy consumption is a total energy consumption and may include the energy consumed when the task is offloaded and the energy consumed when the task is processed in the user terminal. That is, the energy consumption may include the energy consumption required to process the first data and the energy consumption required to process the second data.

In this regard, in step S200, the total energy consumption may be computed based on Equation 2 below.

[Equation 2]

Here, $\varepsilon^{t}_{ij}$ denotes the total energy consumption required to offload the offloading data, $e^{t}_{ij,l}$ the energy consumption required to process the second data, and $e^{t}_{ij,r}$ the energy consumption required to process the first data.

In one embodiment, the energy consumption $e^{t}_{ij,r}$ required to process the first data may be expressed by the following equation.

[Equation 2a]

In one embodiment, the energy consumption $e^{t}_{ij,l}$ required to process the second data may be expressed by the following equation.

[Equation 2b]

For reference, the parameters used in the above equations can be summarized as follows.

Next, in step S300, the offloading data may be processed based on the computed latency and energy consumption.

In other words, the offloading mode of the task generated in the user terminal is determined by the variable related to the offloading rate. When partial offloading is possible, part of the task is offloaded and the remainder is processed in the user terminal, based on the latency and energy consumption computed for the offloaded portion and for the non-offloaded portion (the portion processed in the user terminal).

Meanwhile, after step S200, a step of optimizing the offloading bandwidth by applying the computed latency and energy consumption to a pre-trained learning algorithm may be performed.

That is, the offloading bandwidth can be optimized by applying the learning algorithm described below, so that the waiting time and energy consumption calculated during partial offloading are minimized and the overhead incurred on the MEC network is reduced.

Hereinafter, the learning algorithm applied to partial offloading will be described with reference to FIGS. 3 and 4.

FIG. 3 is a diagram showing the DDPG learning framework according to an embodiment of the present invention, and FIG. 4 is a diagram showing the execution code for processing partial offloading that incorporates DDPG learning according to an embodiment of the present invention.

The learning algorithm applied to the partial offloading of the present invention may apply reinforcement learning to a Markov Decision Process (MDP) composed of a state space, an action space, and a utility function of the offloading environment.

In reinforcement learning, the most advantageous action is selected in a given state, a policy is used to determine the optimal action, and a reward is given according to the action. A Markov decision process can be used to obtain these actions, policies, or rewards. In a Markov decision process, value is evaluated around actions, and its main objective is to find the decision rule with the maximum reward, that is, the decision rule for which the sum of the values of the actions taken is largest.

The Markov decision process may consist of a state space, an action space, a utility function serving as the reward, and a policy. Applied to the partial offloading method of the present invention, these are as follows.

In an embodiment, the state space may include a set of: the channel state between the first base station and the second base station, the channel state between the second base station and the user terminal, the interference between cells covered by the first base station and the second base station, and the state of the buffer in the server where offloaded tasks are stored.

In an embodiment, the action space may include a set of variables related to the offloading rate and variables related to offloading bandwidth allocation.
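The state and action spaces listed above could be represented as simple containers. The following is a hedged sketch: all class and field names are hypothetical, and treating bandwidth allocations as fractions of a shared band that sum to at most 1 is an assumption not stated in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OffloadState:
    """One observation of the offloading environment (field names are illustrative)."""
    backhaul_channel: List[float]         # first base station <-> second base station
    fronthaul_channel: List[float]        # second base station <-> user terminal
    inter_cell_interference: List[float]  # interference between covered cells
    server_buffer: List[float]            # buffer state of offloaded tasks in the server


@dataclass
class OffloadAction:
    """One decision of the agent (field names are illustrative)."""
    offloading_rate: List[float]       # assumed per-terminal fraction in [0, 1]
    bandwidth_allocation: List[float]  # assumed fractions of the shared band

    def is_valid(self) -> bool:
        # Rates must be fractions; allocations must be non-negative
        # and must not exceed the whole band (assumption).
        rates_ok = all(0.0 <= r <= 1.0 for r in self.offloading_rate)
        bw_ok = (all(b >= 0.0 for b in self.bandwidth_allocation)
                 and sum(self.bandwidth_allocation) <= 1.0)
        return rates_ok and bw_ok
```

Keeping validation on the action object makes it easy to reject or clip actor outputs before they are applied to the environment.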

In an embodiment, the utility function may be expressed based on Equation 3 below.

[Equation 3]

In an embodiment, the policy may be expressed based on Equation 4 below, using the state space, action space, and utility function described above.

[Equation 4]

Reinforcement learning may be applied to the Markov decision process defined above. Here, the reinforcement learning may use actor-critic based DDPG (Deep Deterministic Policy Gradient).

DDPG is a well-known reinforcement learning technique, so its principles are omitted here. Its application to the partial offloading method of the present invention is described below.

As shown in FIG. 3, the primary networks (Actor PN and Critic PN), the target networks (Actor TN and Critic TN), and the replay buffer are the three main components of the reinforcement learning used for offloading in the present invention. To compute the critic's action-value function, function approximation with a deep neural network (DNN) is used for both the primary and target networks.

A parameter vector θ is used to approximate the Q-function, and θ is updated using the transition experiences stored in the replay buffer. A replay buffer of limited size is used to store the experience tuple e^t = (X^t, a^t, U^t, X^t+1), and the experience pool is denoted by R. This decorrelates the samples during the learning process and lets the agent learn efficiently from diverse experiences. Following the experience replay technique, a mini-batch of R0 experiences is randomly sampled from the experience pool R at each time slot C to update the network parameters of the primary and target networks.
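A bounded replay buffer with uniform mini-batch sampling can be sketched as below. The class and method names are hypothetical, and evicting the oldest experience when the buffer is full is an assumption; the source only states that the buffer has limited size and that R0 experiences are sampled at random.

```python
import random
from collections import deque


class ReplayBuffer:
    """Experience pool R with a fixed capacity (oldest entries evicted first)."""

    def __init__(self, capacity: int):
        self.pool = deque(maxlen=capacity)

    def store(self, state, action, utility, next_state):
        # Experience tuple e^t = (X^t, a^t, U^t, X^{t+1})
        self.pool.append((state, action, utility, next_state))

    def sample(self, batch_size: int):
        # Uniform random mini-batch; never asks for more than is stored.
        k = min(batch_size, len(self.pool))
        return random.sample(list(self.pool), k)
```

Sampling uniformly from the pool, instead of replaying transitions in order, is what breaks the temporal correlation between consecutive samples.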

The critic network in deep reinforcement learning evaluates the policy specified by the actor based on the Q-value function. The primary critic network, Critic PN, is trained and updated by reducing the loss. As shown in FIG. 3, the loss function is composed of the target value of the target network utility and the critic value of Critic PN. Accordingly, the parameters of Critic PN may be updated by reducing the loss over mini-batch experiences sampled using a gradient descent algorithm.

In the loss, the target value function y^t is the sum of the experienced utility and the discounted future utility, where τ is the discount factor, and the target Q-value is obtained from Critic TN. The Critic PN parameter θ is updated by performing a mini-batch gradient descent algorithm on the loss L(θ) with respect to θ.
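The target value and critic loss can be illustrated numerically. This sketch assumes the standard DDPG form (target = utility + discount × target Q-value, loss = mean squared error over the mini-batch); the patent's exact equation is not reproduced in this text, and the function names are hypothetical.

```python
from typing import List


def critic_targets(utilities: List[float],
                   next_target_q: List[float],
                   tau: float) -> List[float]:
    """y^t = U^t + tau * Q_target(X^{t+1}, a^{t+1}), per sampled experience.

    tau is the discount factor; next_target_q holds the target Q-values
    assumed to come from Critic TN.
    """
    return [u + tau * q for u, q in zip(utilities, next_target_q)]


def critic_loss(targets: List[float], q_values: List[float]) -> float:
    """Mean squared error between targets and Critic PN outputs (assumed form of L(theta))."""
    n = len(targets)
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / n
```

In a real implementation the loss would be differentiated with respect to θ by an automatic differentiation framework. Plain lists are used here only to show the arithmetic.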

Actor PN takes the state X of a transition as input and generates an action determined by the neural network parameter ω. The critic value for the action taken is given by Critic PN based on the current state. Actor PN, which derives the policy, is trained with a gradient ascent algorithm based on the critic's evaluation of the current policy.

The policy parameters are then updated in the direction that improves performance. Specifically, Actor PN uses the sampled policy gradient to update the parameter ω so as to maximize J, where η_ω denotes the learning rate of the actor network.
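For orientation, the actor update in the standard DDPG formulation is written below. This is the textbook form, not the patent's own equation, which is not reproduced in this text. The symbol μ(X | ω) for the actor's deterministic policy is an assumed notation; θ, ω, η_ω, J, and R0 follow the surrounding description.

```latex
\omega \leftarrow \omega + \eta_{\omega}\, \nabla_{\omega} J,
\qquad
\nabla_{\omega} J \approx \frac{1}{R_0} \sum_{t}
  \nabla_{a} Q\!\left(X^{t}, a \mid \theta\right)\Big|_{a=\mu(X^{t} \mid \omega)}\,
  \nabla_{\omega}\, \mu\!\left(X^{t} \mid \omega\right)
```

The plus sign reflects gradient ascent: ω moves in the direction that increases J, with the gradient estimated over the R0 sampled experiences.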

As shown in FIG. 3, the target critic and actor networks may be defined as time-delayed copies of the primary critic and actor networks, respectively, and are used to compute the target values. The parameters of Actor TN and Critic TN in the target networks are updated through soft updates rather than by directly copying the primary-network parameters.
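A soft update blends each target parameter with its primary counterpart instead of overwriting it. The sketch below assumes the usual DDPG rule with a mixing coefficient, here given the hypothetical name `kappa`; the patent's own update equations and coefficient symbol are not reproduced in this text.

```python
from typing import List


def soft_update(target_params: List[float],
                primary_params: List[float],
                kappa: float) -> List[float]:
    """Return new target parameters: kappa * primary + (1 - kappa) * target.

    kappa (hypothetical name) is assumed to be a small value in (0, 1];
    kappa = 1 reduces to directly copying the primary parameters.
    """
    return [kappa * p + (1.0 - kappa) * t
            for t, p in zip(target_params, primary_params)]
```

The same function would be applied once to the Critic TN parameters and once to the Actor TN parameters. A small `kappa` makes the target networks trail the primary networks slowly, which is what makes them time-delayed copies.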

The execution code for processing the reinforcement-learning-based partial offloading described above is shown in FIG. 4.

Hereinafter, evaluation results for the partial offloading method are described by comparing task processing under the present invention with comparative examples. The comparative examples are: processing the entire task at the user terminal (UE execution), offloading the entire task to the server (MEC execution), and offloading tasks at random (Random offloading).

FIG. 5A is a graph comparing the overhead due to waiting time according to an embodiment of the present invention with that of the comparative examples, and FIG. 5B is a graph comparing the overhead due to energy consumption according to an embodiment of the present invention with that of the comparative examples.

As shown in FIGS. 5A and 5B, the overhead trend is constant for both the present invention and the comparative examples, regardless of the number of experiments.

Meanwhile, the overhead due to waiting time is largest for UE execution, followed by MEC execution, Random offloading, and DDPG-PTORA. The overhead due to energy consumption is largest for UE execution, followed by MEC execution, while Random offloading and DDPG-PTORA are nearly equal. UE execution incurs the highest overhead among the task processing methods because the user terminal has limited computing power.

FIG. 6 is a graph comparing the average task drop rate of user terminals according to an embodiment of the present invention with that of the comparative examples.

As shown in FIG. 6, DDPG-PTORA, the partial offloading method of the present invention, has the lowest task drop rate compared with MEC execution and Random offloading. Specifically, the total amounts of dropped tasks for DDPG-PTORA, Random offloading, and MEC execution are 22.7964 MB, 91.8372 MB, and 229.5930 MB, respectively, so DDPG-PTORA drops about 75% less than Random offloading and about 90% less than MEC execution.
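The quoted percentages follow directly from the reported drop totals, as this short check shows. The figures are taken from the text above; the dictionary and function names are illustrative only.

```python
# Total dropped task volume in MB, as reported for FIG. 6.
DROPS_MB = {
    "DDPG-PTORA": 22.7964,
    "Random offloading": 91.8372,
    "MEC execution": 229.5930,
}


def reduction_percent(ours: float, baseline: float) -> float:
    """Relative reduction of `ours` with respect to `baseline`, in percent."""
    return (1.0 - ours / baseline) * 100.0


vs_random = reduction_percent(DROPS_MB["DDPG-PTORA"], DROPS_MB["Random offloading"])
vs_mec = reduction_percent(DROPS_MB["DDPG-PTORA"], DROPS_MB["MEC execution"])
print(f"vs Random offloading: {vs_random:.1f}%")
print(f"vs MEC execution:     {vs_mec:.1f}%")
```

Both values round to the 75% and 90% stated in the text.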

FIG. 7 is a graph comparing the total system overhead versus the number of user terminals according to an embodiment of the present invention with that of the comparative examples.

As shown in FIG. 7, the total system overhead increases for both the present invention and the comparative examples as the number of user terminals increases.

Meanwhile, the total system overhead is largest for UE execution, followed by Random offloading, MEC execution, and DDPG-PTORA. However, when the number of user terminals reaches 90 or more, the total system overhead of MEC execution rises above that of Random offloading. This is because, as the number of user terminals keeps growing, the queuing delay at the server grows and the bandwidth allocated to each user terminal shrinks, which increases the offloading time.

FIG. 8 is a graph comparing the total system overhead versus task size according to an embodiment of the present invention with that of the comparative examples.

The experiment of FIG. 8 was performed assuming 50 user terminals and 5 second base stations.

As shown in FIG. 8, the total system overhead increases for both the present invention and the comparative examples as the task size increases.

Meanwhile, the total system overhead is largest for UE execution, followed by Random offloading, MEC execution, and DDPG-PTORA.

Accordingly, according to an embodiment of the present invention, task offloading and resource allocation between the wireless backhaul and the wireless fronthaul can be scheduled in a mobile edge computing-based network.

The above description merely illustrates the technical idea of the present invention, and a person of ordinary skill in the art to which the present invention pertains may make various modifications and variations without departing from its essential characteristics. Accordingly, the embodiments disclosed in the present invention are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present invention should be construed according to the following claims, and all technical ideas within their equivalent scope should be construed as falling within the scope of the present invention.

Claims

A partial offloading method in a wireless edge computing-based network comprising at least one second base station located within the coverage of a first base station and a server connected to the first base station, the method comprising:
calculating offloading data for offloading a task generated from at least one user terminal connected to the second base station;
calculating the waiting time and energy consumption required to offload the offloading data; and
processing the offloading data based on the calculated waiting time and energy consumption.

The partial offloading method of claim 1, wherein:
the first base station and the second base station interwork through a wireless backhaul, and the second base station and the at least one user terminal interwork through a wireless fronthaul; and
the wireless backhaul and the wireless fronthaul share a frequency bandwidth.

The partial offloading method of claim 1, wherein the calculating of the offloading data comprises:
calculating the offloading data for offloading the task based on a variable related to an offloading rate, the variable being set based on a channel state between the first base station and the second base station and a channel state between the second base station and the user terminal.

The partial offloading method of claim 3, wherein the calculating of the offloading data comprises:
when the variable related to the offloading rate has a value between 0 and 1, calculating first data, which is the part of the task that is offloaded, and second data, which is the remainder of the task and is processed by the user terminal.

The partial offloading method of claim 1, wherein the calculating of the waiting time and energy consumption comprises:
calculating the waiting time based on Equation 1 below, which expresses the waiting time required to process the first data and the second data, and calculating the energy consumption based on Equation 2 below, which expresses the energy consumption required to process the first data and the second data.
[Equation 1]

[Equation 2]

Here, y^t_ij is the total waiting time required to offload the offloading data, y^t_ij,l is the waiting time required to process the second data, y^t_ij,r is the waiting time required to process the first data, ε^t_ij is the total energy consumption required to offload the offloading data, e^t_ij,l is the energy consumption required to process the second data, and e^t_ij,r is the energy consumption required to process the first data.

The partial offloading method of claim 1, further comprising optimizing the offloading by applying the calculated waiting time and energy consumption to a pre-trained learning algorithm,
wherein the learning algorithm applies reinforcement learning to a Markov Decision Process composed of a state space, an action space, and a utility function of an offloading environment.

The partial offloading method of claim 6, wherein:
the state space includes a set of a channel state between the first base station and the second base station, a channel state between the second base station and the user terminal, interference between cells covered by the first base station and the second base station, and a state of a buffer in the server in which offloaded tasks are stored;
the action space includes a set of variables related to the offloading rate and variables related to offloading bandwidth allocation; and
the utility function is expressed based on Equation 3 below.
[Equation 3]

Here, X^t is the state space, a^t is the action space, ε^t_ij is the total energy consumption required to offload the offloading data, y^t_ij is the total waiting time required to offload the offloading data, λ is a weight variable related to task drop, Γ^t is a weight variable related to the waiting time and energy consumption of the user terminal, and r^t_jm is the data rate of the wireless backhaul between the first base station and the second base station.

The partial offloading method of claim 7, wherein the reinforcement learning uses an actor-critic based Deep Deterministic Policy Gradient (DDPG).

A system for partial offloading in a wireless edge computing-based network, the system comprising:
a first base station;
a second base station located within the coverage of the first base station;
at least one user terminal connected to the second base station; and
a server for processing tasks offloaded from the user terminal.