KR102542901B1

KR102542901B1 - Method and Apparatus of Beamforming Vector Design in Over-the-Air Computation for Real-Time Federated Learning

Info

Publication number: KR102542901B1
Application number: KR1020210184713A
Authority: KR
Inventors: 박대영; 김민식
Original assignee: 인하대학교 산학협력단
Priority date: 2021-12-22
Filing date: 2021-12-22
Publication date: 2023-06-14

Abstract

A method and apparatus for designing a beamforming vector in an over-the-air (OTA) computation technique for real-time federated learning are provided. The proposed method comprises: a step in which a cloud server selects a plurality of local devices to participate in federated learning, transmits a global model to them, and obtains channel information for each local device; a step in which each local device obtains the global model from the cloud server, trains its local model based on the local data it holds, and transmits the trained local model to the cloud server; a step in which the cloud server designs a beamforming vector for aggregating the signals received from the local devices and determines whether an objective function has converged; and a step of updating, when the objective function has converged, the global model in the cloud server with the weighted average of the trained local models received from the local devices, determining whether the global model has converged, and repeating the update until the global model converges. According to the present invention, errors can be reduced even when a large number of devices participate in learning to improve the performance of the artificial intelligence model.

Description

Method and Apparatus for Beamforming Vector Design in Over-the-Air Computation for Real-Time Federated Learning

The present invention relates to a method and apparatus for designing a beamforming vector in an over-the-air (OTA) computation technique for real-time federated learning.

With advances in science and technology, many Internet-of-Things (IoT) devices have appeared, and the data collected from these devices is increasing rapidly. The field of artificial intelligence is working to build more accurate and realistic models from this large and varied data. At present, artificial intelligence models are mainly trained with a centralized method in which a cloud server collects all the data and trains on it, but this approach has several problems. First, transmitting the data held by IoT devices to the server consumes a large amount of wireless communication resources. In addition, the users of each device are reluctant to have their personal information directly exposed, and data such as medical information cannot legally be obtained.

To solve these problems, a distributed method called federated learning has recently emerged [1]. In federated learning, each device first trains its local model with its own local data and then transmits the trained model to the cloud server. The cloud server updates its global model with the weighted average of the models received from the devices. Finally, the cloud server sends the updated global model to each device, and this process is repeated until the model converges. Because federated learning proceeds without transmitting the data itself, it avoids the problems associated with personal and medical information, and because only information about the artificial intelligence model is transmitted, communication traffic can be reduced. However, many devices must participate in learning to improve the performance of federated learning, and with conventional wireless communication techniques the amount of radio resources required grows in proportion to the number of participating devices. AirComp (over-the-air computation) [2], which exploits the superposition principle of the multiple access channel (MAC), can solve this problem of conventional wireless communication techniques and is therefore widely applied in the field of federated learning.

In federated learning, the performance of the artificial intelligence model improves as more devices participate in training, but more participating devices also cause more aggregation errors, which degrade the model. It is therefore better to select as many devices as possible within a limited error than to select all devices and update the model with a large error [3]. However, the problem of selecting many devices within a limited error is difficult to solve because its variables are coupled and it has non-convex constraints. Existing methods solve this optimization problem with algorithms based on semidefinite relaxation (SDR) using the matrix lifting technique [3][4]. However, a solution obtained by SDR is not an optimal solution of the original problem unless its rank is 1, so these methods do not work well when there are many devices. In addition, their computational complexity is too high for real-time federated learning.

[1] H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. Arcas, "Communication-efficient learning of deep networks from decentralized data," in Proc. Int. Conf. Artif. Intell. Stat. (AISTATS), vol. 54, 2017, pp. 1273-1282.
[2] B. Nazer and M. Gastpar, "Computation over multiple-access channels," IEEE Trans. Inf. Theory, vol. 53, no. 10, pp. 3498-3516, Oct. 2007.
[3] K. Yang, T. Jiang, Y. Shi, and Z. Ding, "Federated learning via over-the-air computation," IEEE Trans. Wireless Commun., vol. 19, no. 3, pp. 2022-2035, Mar. 2020.
[4] Z. Wang, J. Qiu, Y. Zhou, Y. Shi, L. Fu, W. Chen, and K. B. Letaief, "Federated learning via intelligent reflecting surface," to appear in IEEE Trans. Wireless Commun., 2021.
[5] Y. Shi, J. Cheng, J. Zhang, B. Bai, W. Chen, and K. B. Letaief, "Smoothed-minimization for green cloud-RAN with device admission control," IEEE J. Sel. Areas Commun., vol. 34, no. 4, pp. 1022-1036, Apr. 2016.
[6] S. Boyd, Convex Optimization II, Lecture Note. Available at http://www.stanford.edu/class/ee364b/.
[7] M. Grant and S. Boyd, CVX: Matlab software for disciplined convex programming, http://cvxr.com/cvx, Sep. 2013.
[8] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.

The technical problem to be solved by the present invention is to provide a method and apparatus that reduce errors while allowing many devices to participate in learning, so as to increase the performance of the artificial intelligence model in federated learning. The invention aims to provide a beamforming vector design method and apparatus in an OTA computation technique for real-time federated learning in which the number of selected devices increases linearly as the total number of devices increases, so that operation remains stable even with many devices and higher performance is achieved as more devices are selected.

In one aspect, the beamforming vector design method in an OTA computation technique for real-time federated learning proposed in the present invention includes: a step in which a cloud server selects a plurality of local devices to participate in federated learning, transmits a global model, and acquires channel information for each local device; a step in which each local device obtains the global model from the cloud server, trains its local model based on the local data it holds, and transmits the trained local model to the cloud server; a step in which the cloud server designs a beamforming vector for aggregating the signals received from the local devices and determines whether an objective function converges; and a step of, when the objective function converges, updating the global model in the cloud server with the weighted average of the trained local models received from the local devices, determining whether the global model converges, and repeating the update of the global model until the global model converges.

In the step in which the cloud server designs the beamforming vector for aggregating the signals received from the local devices and determines whether the objective function converges, the received signal is multiplied by the beamforming vector to obtain the aggregated information of the signals received from the local devices. Because this aggregated information must equal the aggregated information of the local models in federated learning, the mean squared error (MSE) of the aggregation is minimized. To minimize the MSE, the aggregation error is obtained using a value obtained by zero-forcing and a value obtained from the transmit power constraint of each local device.

In the step in which the cloud server designs the beamforming vector for aggregating the signals received from the local devices and determines whether the objective function converges, if the objective function does not converge, the process in which the cloud server designs the beamforming vector for aggregating the signals received from the local devices is repeated.

In the step of, when the objective function converges, updating the global model in the cloud server with the weighted average of the trained local models received from the local devices, determining whether the global model converges, and repeating the update until the global model converges, an auxiliary variable is used to maximize the number of local devices participating in federated learning within a predetermined constraint on the aggregation error, the constraint is converted into a convex function in order to minimize the number of local devices that violate the predetermined constraint on the aggregation error, and the update of the global model is repeated until the global model converges.

In one aspect, the cloud server for beamforming vector design in an OTA computation technique for real-time federated learning proposed in the present invention includes: a channel information acquisition unit that selects a plurality of local devices to participate in federated learning, transmits a global model, and acquires channel information for each local device; a beamforming vector design unit that designs a beamforming vector for aggregating the signals received from the local devices; a determination unit that determines whether the objective function of the beamforming vector converges and, after the global model update unit updates the global model, determines whether the global model converges so that the update of the global model is repeated until the global model converges; and a global model update unit that receives the trained local models, each local device having obtained the global model from the cloud server and trained its local model based on the local data it holds, and that, when the determination unit determines that the objective function has converged, updates the global model in the cloud server with the weighted average of the trained local models received from the local devices.

The beamforming vector design method and apparatus in an OTA computation technique for real-time federated learning according to embodiments of the present invention can reduce errors while allowing many devices to participate in learning, so as to increase the performance of the artificial intelligence model in federated learning. Because the number of selected devices increases linearly as the total number of devices increases, the method operates stably even with many devices, and higher performance can be achieved as more devices are selected.

FIG. 1 is a flowchart for explaining a beamforming vector design method in an OTA computation technique for real-time federated learning according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining the configuration of a cloud server for beamforming vector design in an OTA computation technique for real-time federated learning according to an embodiment of the present invention.
FIG. 3 is a graph of the number of selected devices versus the required MSE according to an embodiment of the present invention.
FIG. 4 is a graph of the number of selected devices versus the total number of devices according to an embodiment of the present invention.
FIG. 5 is a graph of the learning performance of the artificial intelligence model versus communication rounds according to an embodiment of the present invention.
FIG. 6 shows the execution time of each algorithm according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a flowchart for explaining a beamforming vector design method in an OTA computation technique for real-time federated learning according to an embodiment of the present invention.

The proposed beamforming vector design method in an OTA computation technique for real-time federated learning includes: a step (110) in which a cloud server (i.e., a base station) selects a plurality of local devices to participate in federated learning, transmits a global model, and acquires channel information for each local device; a step (140) in which each local device obtains the global model from the cloud server, trains its local model based on the local data it holds, and transmits the trained local model to the cloud server; a step (120) in which the cloud server designs a beamforming vector for aggregating the signals received from the local devices and determines whether an objective function converges; and a step (130) of, when the objective function converges, updating the global model in the cloud server with the weighted average of the trained local models received from the local devices, determining whether the global model converges, and repeating the update of the global model until the global model converges.

In step 110, the cloud server selects a plurality of local devices to participate in federated learning, transmits the global model, and acquires channel information for each local device.

In the Federated Averaging (FedAvg) method [1] of federated learning, the cloud server (i.e., the base station) first selects the local devices that will participate in learning (111), transmits the global model to each local device (112), and acquires channel information (113).

Thereafter, in step 140, the updated local models are collected from the local devices, and the global model at the base station is updated with the weighted average of the collected local models. The global model aggregated from the devices can therefore be expressed as Equation (1):

Equation (1)

Here, the weight of each device is proportional to the number of data samples held by that device among all the devices, and the devices selected to participate in federated learning form a subset of the set of all devices. Each local model is normalized to unit variance, and the corresponding element of the local model is transmitted in each time slot. For simplicity of explanation, the time index is omitted.
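The weighted-average update described for Equation (1) can be illustrated with a minimal sketch. The function name `fedavg_aggregate`, the dictionary layout, and the sample values are assumptions made for illustration; only the rule itself (weights proportional to each selected device's data count) comes from the text.

```python
from typing import Dict, List


def fedavg_aggregate(local_models: Dict[int, List[float]],
                     num_samples: Dict[int, int],
                     selected: List[int]) -> List[float]:
    """Weighted average of the selected devices' local models.

    Each weight is proportional to the number of data samples
    held by the corresponding device (hypothetical data layout).
    """
    total = sum(num_samples[k] for k in selected)
    dim = len(local_models[selected[0]])
    return [
        sum(num_samples[k] * local_models[k][d] for k in selected) / total
        for d in range(dim)
    ]


# Three devices exist, but only devices 0 and 1 are selected.
local_models = {0: [1.0, 2.0], 1: [3.0, 4.0], 2: [10.0, 10.0]}
num_samples = {0: 100, 1: 300, 2: 50}
global_model = fedavg_aggregate(local_models, num_samples, selected=[0, 1])
print(global_model)
```

Device 2 is left out of the average entirely, which mirrors the role of the selected subset in the description above.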

In step 120, the cloud server designs a beamforming vector for aggregating the signals received from the local devices (121) and determines whether the objective function converges (122).

According to an embodiment of the present invention, the received signal is multiplied by the beamforming vector to obtain the aggregated information of the signals received from the local devices. Because this aggregated information must equal the aggregated information of the local models in federated learning, the mean squared error (MSE) of the aggregation is minimized. To minimize the MSE, the aggregation error is obtained using a value obtained by zero-forcing and a value obtained from the transmit power constraint of each local device.

In the OTA computation technique, each local device transmits its signal within its limited power, using additional power for interference cancellation, and the cloud server (i.e., the base station), which is equipped with multiple antennas, designs a beamforming vector to obtain the aggregated information of the received signals. The signal received at the cloud server is given by Equation (2):

Equation (2)

The terms of Equation (2) are the channel information between each device and the cloud server, a scalar variable for interference cancellation at each device, and Gaussian noise with the specified mean and variance. To obtain the aggregated information of the signals transmitted from the devices, the cloud server multiplies the received signal by the beamforming vector:

Equation (3)

Here, a scalar variable is used for normalization. The aggregated information of the signals transmitted from the devices must equal the aggregated information of the local models in FedAvg of federated learning, so the mean squared error (MSE) of the aggregation must be minimized:

Equation (4)
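The receive-side aggregation can be sketched as a small simulation. Because the equation images are not reproduced here, the signal model below is an assumption that follows the formulation of reference [3]: the received vector is the sum of channel times transmit scalar times symbol plus noise, and the estimate is the beamformed signal divided by the square root of a normalization scalar. All names (`inner`, `zero_forcing`, `ota_estimate`, `eta`, `p_max`) are hypothetical.

```python
import math
import random


def inner(a, b):
    """Return a^H b for complex vectors stored as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def zero_forcing(m, channels, weights, p_max):
    """Choose eta and transmit scalars b_k so that m^H h_k b_k = sqrt(eta) * w_k
    while every device respects the power limit |b_k|^2 <= p_max."""
    eta = min(p_max * abs(inner(m, h)) ** 2 / w ** 2
              for h, w in zip(channels, weights))
    b = [math.sqrt(eta) * w / inner(m, h) for h, w in zip(channels, weights)]
    return eta, b


def ota_estimate(m, channels, b, symbols, eta, noise):
    """Superimpose all transmissions on the channel, then apply the receive beamformer."""
    y = [sum(h[ant] * bk * s for h, bk, s in zip(channels, b, symbols)) + noise[ant]
         for ant in range(len(m))]
    return inner(m, y) / math.sqrt(eta)


def cgauss(rng, scale=1.0):
    """Draw one circularly symmetric complex Gaussian sample."""
    return complex(rng.gauss(0, 1), rng.gauss(0, 1)) * scale / math.sqrt(2)


rng = random.Random(0)
n_antennas, n_devices = 4, 3
channels = [[cgauss(rng) for _ in range(n_antennas)] for _ in range(n_devices)]
m = [cgauss(rng) for _ in range(n_antennas)]      # arbitrary, not optimized
weights = [1.0, 2.0, 3.0]
symbols = [0.5, -1.0, 0.2]

eta, b = zero_forcing(m, channels, weights, p_max=1.0)
target = sum(w * s for w, s in zip(weights, symbols))
noiseless = ota_estimate(m, channels, b, symbols, eta, [0j] * n_antennas)
noisy = ota_estimate(m, channels, b, symbols, eta,
                     [cgauss(rng, 0.1) for _ in range(n_antennas)])
print(target, noiseless, noisy)
```

Without noise the estimate reproduces the weighted sum exactly, so under this model the aggregation error comes only from the beamformed noise term.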

To minimize the MSE, the value obtained by zero-forcing and the value obtained from the transmit power constraint of each device are applied, which gives the following aggregation error:

Equation (5)

Here, the target MSE is the value that the aggregation error must not exceed. Because the MSE is unaffected when the beamforming vector is multiplied by a nonzero constant, an additional condition on the scale of the beamforming vector can be obtained.
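The scale invariance mentioned above can be checked numerically. The closed form used here is an assumption taken from the formulation in reference [3] rather than from the missing Equation (5): the error equals the noise variance times the squared norm of the beamforming vector divided by a normalization scalar, which is itself the minimum over devices of the power limit times the squared beamformed channel gain divided by the squared weight. The names `aggregation_mse`, `p_max`, and `sigma2` are hypothetical.

```python
def inner(a, b):
    """Return a^H b for complex vectors stored as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def aggregation_mse(m, channels, weights, p_max, sigma2):
    """Assumed aggregation error: sigma2 * ||m||^2 / eta, where
    eta = min_k p_max * |m^H h_k|^2 / w_k^2 (formulation of reference [3])."""
    norm_sq = sum(abs(x) ** 2 for x in m)
    eta = min(p_max * abs(inner(m, h)) ** 2 / w ** 2
              for h, w in zip(channels, weights))
    return sigma2 * norm_sq / eta


m = [1 + 0j, 0.5j]
channels = [[1 + 1j, 2 + 0j], [0.5 + 0j, 1 - 1j]]
weights = [1.0, 1.0]

mse = aggregation_mse(m, channels, weights, p_max=1.0, sigma2=0.1)
# Multiply the beamformer by an arbitrary nonzero complex constant.
mse_scaled = aggregation_mse([3j * x for x in m], channels, weights,
                             p_max=1.0, sigma2=0.1)
print(mse, mse_scaled)
```

Both the numerator and the normalization scalar grow with the squared magnitude of the constant, so the ratio is unchanged. This is why the scale of the beamforming vector can be fixed by a separate condition without loss of generality.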

If the objective function does not converge, step 120 is repeated.

When the objective function converges, in step 130, the global model in the cloud server is updated with the weighted average of the trained local models received from the local devices (131), and whether the global model converges is determined so that the update of the global model is repeated until the global model converges (132).

According to an embodiment of the present invention, an auxiliary variable is used to maximize the number of local devices participating in federated learning within a predetermined constraint on the aggregation error. In order to minimize the number of local devices that violate the predetermined constraint on the aggregation error, the constraint is converted into a convex function, and the update of the global model is repeated until the global model converges.

To prevent performance degradation in federated learning, the aggregation error must be smaller than the target MSE. The problem of maximizing the number of devices participating in federated learning within this constraint can be expressed as:

Equation (6)

In Equation (6), the objective is the cardinality of the set of selected devices, that is, the number of devices selected to participate in learning. By introducing a new auxiliary variable, the problem of maximizing the number of devices participating in learning can be transformed as follows [5]:

Equation (7)

Here, the objective is the ℓ0-norm of the auxiliary vector, that is, the number of its non-zero elements. If the first constraint of Equation (6) is satisfied for a device, the corresponding auxiliary variable only needs to exceed a negative number, so it becomes 0 to satisfy the non-negativity constraint. Otherwise, the auxiliary variable must exceed a positive number, so the ℓ0-norm increases. In other words, whereas Equation (6) maximizes the number of devices that satisfy the required MSE condition, Equation (7) minimizes the number of devices that violate the required MSE condition, and both yield the same result. Because the ℓ0-norm in Equation (7) is a non-convex function, it is relaxed to the convex ℓ1-norm [6]:

Equation (8)
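The difference between counting violating devices and summing the amounts of violation can be seen in a small sketch. The constraint form used here, the squared norm of the beamforming vector minus a constant times the squared beamformed channel gain, is an assumption based on reference [3]; `gamma`, `violations`, and the sample channels are hypothetical.

```python
def inner(a, b):
    """Return a^H b for vectors stored as lists (real or complex entries)."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def violations(m, channels, gamma):
    """Smallest non-negative auxiliary values x_k satisfying the assumed constraint
    ||m||^2 - gamma * |m^H h_k|^2 <= x_k for every device k."""
    norm_sq = sum(abs(x) ** 2 for x in m)
    return [max(0.0, norm_sq - gamma * abs(inner(m, h)) ** 2) for h in channels]


m = [1.0, 0.0]
channels = [[2.0, 0.0], [0.5, 0.0], [0.0, 1.0]]

x = violations(m, channels, gamma=1.0)
l0_norm = sum(1 for v in x if v != 0)   # number of violating devices (non-convex count)
l1_norm = sum(abs(v) for v in x)        # total amount of violation (convex surrogate)
print(x, l0_norm, l1_norm)
```

The first device meets the constraint and contributes nothing to either norm, while the other two are counted once each by the ℓ0-norm and by their violation amounts in the ℓ1-norm.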

In Equation (8), the objective is the ℓ1-norm of the auxiliary vector, that is, the sum of the absolute values of its elements. As in Equation (7), the auxiliary variable becomes 0 when the first constraint is satisfied, but when the constraint is violated the value of the objective function increases by the amount of the violation. Moreover, for the auxiliary variable to be no smaller than the violation of the MSE condition while the objective function is minimized, the first constraint must hold with equality. Equation (8) is therefore the problem of minimizing the amount by which the constraints are violated, and its optimal solution satisfies:

Equation (9)

The first constraint in Equation (8) is still non-convex and therefore difficult to handle. The present invention thus reformulates the problem as a sequence of convex problems based on the majorization-minimization (MM) technique. A term in the first constraint is a convex function of the optimization variable, so it is bounded below by its Taylor series expanded up to the first-order term:

Equation (10)
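The first-order lower bound can be verified numerically. The sketch assumes the convex term has the form of a squared beamformed channel gain, |m^H h|^2, which is an assumption because the equation itself is not reproduced here. For that form, linearizing around a point m0 gives 2 Re{(m0^H h)(h^H m)} - |m0^H h|^2, and the gap to the true value is |m^H h - m0^H h|^2, which is never negative.

```python
import math
import random


def inner(a, b):
    """Return a^H b for complex vectors stored as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def gain(m, h):
    """Convex quadratic term |m^H h|^2 (assumed form)."""
    return abs(inner(m, h)) ** 2


def linear_lower_bound(m, m0, h):
    """First-order Taylor expansion of |m^H h|^2 around m0."""
    a0 = inner(m0, h)
    return 2.0 * (a0 * inner(h, m)).real - abs(a0) ** 2


def cgauss(rng):
    """Draw one circularly symmetric complex Gaussian sample."""
    return complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)


rng = random.Random(1)
h = [cgauss(rng) for _ in range(4)]
m0 = [cgauss(rng) for _ in range(4)]

# Smallest gap between the true term and its linearization over random trial points.
min_gap = min(
    gain(m, h) - linear_lower_bound(m, m0, h)
    for m in ([cgauss(rng) for _ in range(4)] for _ in range(200))
)
gap_at_m0 = gain(m0, h) - linear_lower_bound(m0, m0, h)
print(min_gap, gap_at_m0)
```

The bound is tight at the expansion point, which is the property an MM iteration relies on: each surrogate problem touches the original constraint at the current iterate.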

Here, the expansion point is the value obtained at a given iteration of the iterative algorithm. Applying the majorization-minimization (MM) technique, the present invention solves the original problem by repeatedly solving the following convex problem:

Equation (11)

Equation (11) is a convex optimization problem and can be solved with the interior-point method [6] using the CVX toolbox [7]. In addition, the solutions obtained by repeatedly solving Equation (11) converge to a solution that satisfies the Karush-Kuhn-Tucker (KKT) conditions of Equation (8). Table 1 shows the algorithm that repeatedly solves the problem until the beamforming vector converges.

<Table 1>

To reduce the algorithmic complexity, the present invention solves Equation (11) based on the projected subgradient method [6]. The Lagrangian of Equation (11) can be expressed as:

Equation (12)

Here, the multipliers are the dual variables for the respective constraints. The solution is obtained by maximizing Equation (12) with respect to the primal variables and minimizing it with respect to the dual variables. Table 2 shows the low-complexity algorithm obtained by simplifying the KKT conditions of Equation (11) and updating the dual variables with the subgradient method.

<Table 2>

In Table 2, the step size is used for the dual variable update, and the proposed iterative algorithms run until they converge to within a specified tolerance.
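The general pattern of a dual update with a step size and a convergence tolerance can be sketched on a toy problem. This is not the algorithm of Table 2, whose contents are not reproduced here. It is a generic projected dual subgradient iteration for the hypothetical problem of minimizing x^2 subject to 1 - x <= 0, and the names `step` and `tol` are illustrative.

```python
def dual_subgradient(step=0.5, tol=1e-9, max_iter=1000):
    """Toy problem: minimize x^2 subject to 1 - x <= 0.

    Lagrangian: x^2 + lam * (1 - x), with dual variable lam >= 0.
    """
    lam = 0.0
    for iteration in range(max_iter):
        x = lam / 2.0                              # closed-form primal update for fixed lam
        g = 1.0 - x                                # constraint value, a subgradient for lam
        new_lam = max(0.0, lam + step * g)         # step, then project onto lam >= 0
        if abs(new_lam - lam) <= tol:              # stop once the dual update stalls
            return x, new_lam, iteration
        lam = new_lam
    return lam / 2.0, lam, max_iter


x_opt, lam_opt, n_iter = dual_subgradient()
print(x_opt, lam_opt, n_iter)
```

Each iteration needs only a closed-form primal update and one scalar dual step, which illustrates why this family of methods is cheaper per iteration than a general-purpose interior-point solver.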

In step 140, each local device obtains the global model from the cloud server, trains its local model based on the local data it holds, and then transmits the trained local model to the cloud server.

In federated learning according to an embodiment of the present invention, each local device first obtains the global model from the cloud server (141). Each local device then trains the local model on its own local data and updates the local model (142). The trained local model is then transmitted to the cloud server (143).

The cloud server updates its global model with the weighted average of the models received from the devices. Finally, the cloud server transmits the updated global model to each local device, and this process is repeated until the global model converges.
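The weighted-average update of the global model can be sketched as follows. The text does not state how the weights are chosen; the sketch assumes the common FedAvg convention of weighting each local model by the number of local training samples. The function name `fedavg_update` and the flat-vector model representation are illustrative assumptions.

```python
import numpy as np


def fedavg_update(local_models, num_samples):
    """Return the weighted average of local model vectors.

    local_models: list of 1-D arrays, one flattened model per device.
    num_samples:  local dataset size per device (assumed weighting rule).
    """
    weights = np.asarray(num_samples, dtype=float)
    weights = weights / weights.sum()      # normalize so the weights sum to 1
    stacked = np.stack(local_models)       # shape (num_devices, model_dim)
    return weights @ stacked               # shape (model_dim,)


local_models = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
global_model = fedavg_update(local_models, num_samples=[1, 3])
print(global_model)
```

In the over-the-air setting this sum is formed by the wireless channel itself rather than computed from individually decoded models, so any aggregation error perturbs the global model directly.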

FIG. 2 is a diagram illustrating the configuration of a cloud server for beamforming vector design in an OTA computation technique for real-time federated learning according to an embodiment of the present invention.

The proposed cloud server 200 for beamforming vector design in an OTA computation technique for real-time federated learning includes a channel information acquisition unit 210, a beamforming vector design unit 220, a determination unit 230, and a global model update unit 240.

The channel information acquisition unit 210 according to an embodiment of the present invention selects a plurality of local devices 250 to participate in federated learning, transmits the global model to them, and acquires channel information for each local device 250.

In the FedAvg (Federated Averaging) method of federated learning, the cloud server (that is, the base station) first selects the local devices that will participate in learning, transmits the global model to each local device 250, and acquires channel information.

The beamforming vector design unit 220 according to an embodiment of the present invention designs a beamforming vector for aggregating the signals received from each local device 250.

According to an embodiment of the present invention, the received signal is multiplied by the beamforming vector to obtain the aggregate of the signals received from the local devices 250. Because this aggregate must equal the aggregate of the local models in federated learning, the mean squared error (MSE) of the aggregation is minimized. To minimize the MSE, the aggregation error is computed from the value obtained by zero-forcing and the value obtained from the transmit power constraint of each local device 250.
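The aggregation step can be sketched numerically. The equations of this document are not reproduced in the text, so the sketch below assumes a formulation commonly used for over-the-air computation: each device applies a zero-forcing transmit scalar, a common scaling factor `eta` is limited by the device with the weakest effective channel under the power budget `p0`, and the resulting aggregation MSE is `sigma2 * ||m||**2 / eta`. All names (`aircomp_aggregate`, `h`, `s`, `m`, `p0`, `sigma2`, `eta`) are illustrative assumptions.

```python
import numpy as np


def aircomp_aggregate(h, s, m, p0, sigma2, rng):
    """Simulate one over-the-air aggregation with receive beamformer m.

    h: (K, N) complex channels, s: (K,) device symbols, m: (N,) beamformer,
    p0: per-device transmit power limit, sigma2: receiver noise variance.
    """
    g = h @ m.conj()                        # g[i] = m^H h_i (effective channel)
    eta = p0 * np.min(np.abs(g) ** 2)       # scaling allowed by the weakest device
    b = np.sqrt(eta) * g.conj() / np.abs(g) ** 2   # zero-forcing transmit scalars
    noise = np.sqrt(sigma2 / 2) * (
        rng.standard_normal(m.shape) + 1j * rng.standard_normal(m.shape)
    )
    y = (b * s) @ h + noise                 # superposed signal at the N antennas
    estimate = np.vdot(m, y) / np.sqrt(eta) # m^H y, rescaled to estimate sum(s)
    mse = sigma2 * np.linalg.norm(m) ** 2 / eta
    return estimate, mse, b


rng = np.random.default_rng(0)
K, N = 5, 3
h = rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))
s = rng.standard_normal(K)
m = rng.standard_normal(N) + 1j * rng.standard_normal(N)
estimate, mse, b = aircomp_aggregate(h, s, m, p0=1.0, sigma2=0.0, rng=rng)
print(estimate, s.sum(), mse)
```

Under these assumptions the noiseless estimate equals the sum of the device symbols exactly, and every transmit scalar respects the power limit; with noise, the error grows as the weakest effective channel shrinks, which is what ties the beamformer choice to how many devices can be admitted.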

The determination unit 230 according to an embodiment of the present invention determines whether the objective function of the beamforming vector has converged. After the global model update unit 240 updates the global model, the determination unit 230 determines whether the global model has converged and causes the update of the global model to be repeated until it converges.

When the objective function has not converged, the determination unit 230 according to an embodiment of the present invention causes the cloud server to repeat the process of designing the beamforming vector for aggregating the signals received from each local device 250.

The global model update unit 240 according to an embodiment of the present invention receives the trained local models after each local device 250 has obtained the global model from the cloud server and trained its local model on the local data it holds. When the determination unit determines that the objective function has converged, the global model update unit 240 updates the global model in the cloud server with the weighted average of the trained local models received from the local devices 250.

The global model update unit 240 according to an embodiment of the present invention uses auxiliary variables to maximize the number of local devices participating in federated learning within a predetermined constraint on the aggregation error. To minimize the number of local devices that violate the predetermined constraint on the aggregation error, the constraint is converted into a convex function, and the update of the global model is repeated until the global model converges.

FIG. 3 is a graph of the number of selected devices versus the required MSE according to an embodiment of the present invention.

To verify the performance of the proposed technique, the following were compared: the prior-art algorithms l₁+SDR [3] and DC [3]; Oracle-Aircomp, an idealized algorithm that assumes no aggregation error; random beamforming, which selects the beamforming vector at random and serves as a lower bound on performance; and the proposed algorithms, CVX (Algorithm 1) and Subgradient (Algorithm 2).

FIG. 3 plots the number of selected devices against the required MSE for a base station with 6 antennas and 60 devices in total. l₁+SDR selects fewer devices than random beamforming and performs worst. CVX selects the most devices. In the worst case DC selects 10 or more fewer devices than CVX, whereas Subgradient stays within one device of CVX.

FIG. 4 is a graph of the number of selected devices versus the total number of devices according to an embodiment of the present invention.

FIG. 4 plots the number of selected devices against the total number of devices for a base station with 6 antennas and a required MSE of 4 dB. Because Oracle-Aircomp assumes that every device participates in learning without aggregation error, its number of selected devices equals the total number of devices. The performance of DC degrades increasingly as the number of devices grows, and at 90 devices it approaches that of random beamforming, the lower bound. With the proposed algorithms, the number of selected devices grows linearly with the total number of devices, which confirms that they operate stably even when many devices are present.

FIG. 5 is a graph of the learning performance of the artificial intelligence model versus communication rounds according to an embodiment of the present invention.

FIG. 5 shows the learning performance of the artificial intelligence model over communication rounds for a base station with 6 antennas, 60 devices in total, and a required MSE of 4 dB. This experiment used the MNIST data set, and the classification performance of the model was evaluated with a support vector machine (SVM) [8]. One communication round comprises the entire process of one pass of federated learning. In descending order of performance, the algorithms rank as Oracle-Aircomp, CVX, Subgradient, DC, and random beamforming. Taken together with FIG. 4, this shows that selecting more devices yields higher performance. In addition, CVX reaches the ideal Oracle-Aircomp performance at the 25th round.

FIG. 6 shows the execution time of each algorithm according to an embodiment of the present invention.

FIG. 6 plots algorithm execution time against the total number of devices for a base station with 6 antennas and a required MSE of 4 dB. The execution time of DC grows rapidly with the number of devices: it takes 26 seconds at 60 devices and more than 70 seconds at 90 devices. In contrast, the execution time of the proposed algorithms grows linearly with the number of devices; at 60 devices, CVX takes about 2.8 seconds and Subgradient about 0.037 seconds. Subgradient is therefore the most suitable for real-time federated learning.

The devices described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that run on the operating system. A processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described, but those of ordinary skill in the art will recognize that a processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, a processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, or computer storage medium or device in order to be interpreted by a processing device or to provide instructions or data to a processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like.

Although the embodiments have been described above with reference to limited examples and drawings, those of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the following claims.

Claims

A beamforming vector design method comprising:
a step in which a cloud server selects a plurality of local devices to participate in federated learning, transmits a global model, and obtains channel information for each local device;
a step in which each local device obtains the global model from the cloud server, trains a local model in the local device based on local data held by the local device, and then transmits the trained local model to the cloud server;
a step in which the cloud server designs a beamforming vector for aggregating signals received from each local device and determines whether an objective function has converged; and
a step of, when the objective function has converged, updating the global model in the cloud server with the weighted average of the trained local models received from each local device, determining whether the global model has converged, and repeating the update of the global model until the global model converges,
wherein the step in which the cloud server designs the beamforming vector and determines whether the objective function has converged comprises:
multiplying the received signal by the beamforming vector to obtain aggregated information of the signals received from each local device;
minimizing the mean squared error (MSE) of the aggregation, because the aggregated information of the signals received from each local device must equal the aggregated information of the local models in federated learning; and
obtaining, in order to minimize the MSE, an aggregation error using a value obtained by zero-forcing and a value obtained from the transmit power constraint of each local device, and
wherein the step of updating the global model and repeating the update until the global model converges comprises:
using auxiliary variables to maximize the number of local devices participating in federated learning within a predetermined constraint on the aggregation error;
converting, in order to minimize the number of local devices that violate the predetermined constraint on the aggregation error, the constraint into a convex function for stabilization by reformulating the non-convex problem of optimizing the auxiliary variables as successive convex problems based on the majorization-minimization (MM) technique; and
repeating the update of the global model until the global model converges.

(Deleted)

The beamforming vector design method of claim 1,
wherein, in the step in which the cloud server designs the beamforming vector for aggregating signals received from each local device and determines whether the objective function has converged,
when the objective function has not converged, the cloud server repeats the process of designing the beamforming vector for aggregating the signals received from each local device.

(Deleted)

A cloud server for beamforming vector design, comprising:
a channel information acquisition unit that selects a plurality of local devices to participate in federated learning, transmits a global model, and acquires channel information for each local device;
a beamforming vector design unit that designs a beamforming vector for aggregating signals received from each local device;
a determination unit that determines whether an objective function of the beamforming vector has converged and, after a global model update unit updates the global model, determines whether the global model has converged and causes the update of the global model to be repeated until the global model converges; and
the global model update unit, which receives the trained local models after each local device has obtained the global model from the cloud server and trained a local model in the local device based on local data held by the local device, and which, when the objective function has converged according to the determination result of the determination unit, updates the global model in the cloud server with the weighted average of the trained local models received from each local device,
wherein the beamforming vector design unit:
multiplies the received signal by the beamforming vector to obtain aggregated information of the signals received from each local device;
minimizes the mean squared error (MSE) of the aggregation, because the aggregated information of the signals received from each local device must equal the aggregated information of the local models in federated learning; and
obtains, in order to minimize the MSE, an aggregation error using a value obtained by zero-forcing and a value obtained from the transmit power constraint of each local device, and
wherein the global model update unit:
uses auxiliary variables to maximize the number of local devices participating in federated learning within a predetermined constraint on the aggregation error;
converts, in order to minimize the number of local devices that violate the predetermined constraint on the aggregation error, the constraint into a convex function for stabilization by reformulating the non-convex problem of optimizing the auxiliary variables as successive convex problems based on the majorization-minimization (MM) technique; and
repeats the update of the global model until the global model converges.

(Deleted)

The cloud server for beamforming vector design of claim 5,
wherein the determination unit,
when the objective function has not converged, causes the cloud server to repeat the process of designing the beamforming vector for aggregating the signals received from each local device.

(Deleted)