KR102486795B1 - Method and Apparatus for Data Augmentation in Frequency Domain for High Performance Training in Deep Learning

Info

Publication number: KR102486795B1
Application number: KR1020200041502A
Authority: KR
Inventors: 이상철; 남주현
Original assignee: 인하대학교 산학협력단
Priority date: 2020-04-06
Filing date: 2020-04-06
Publication date: 2023-01-10
Also published as: KR20210123824A

Abstract

A method and apparatus for data augmentation in the frequency domain for improving deep learning performance are presented. The proposed method includes performing filtering in the frequency domain by applying a Fourier transform to remove bias in the data, generating a new image using the filtered image, and performing training based on the generated image.

Description

Method and Apparatus for Data Augmentation in Frequency Domain for High Performance Training in Deep Learning

The present invention relates to a method and apparatus for data augmentation in the frequency domain for improving deep learning performance.

When training a multi-class image classification model, data imbalance has long been an important issue for improving classification performance. If the distribution of one class is excessively skewed relative to the others, the trained model shows high accuracy on test data from the over-represented class but low accuracy on the other classes. Various studies on model generalization have been conducted to address this problem. Representative techniques for reducing bias include Dropout, Batch Normalization, and data augmentation.

First, Dropout is a technique that randomly deactivates neurons in a given layer while the model is being trained. This typically yields an ensemble effect and helps the model generalize.

Next, data augmentation applies operations such as enlargement, reduction, rotation, horizontal flipping, and vertical flipping to the input images before the model is trained. This reduces the bias of the data and increases accuracy.

However, some cases remain unresolved even when the above methods are used. To address them, a method is proposed that obtains new images for classes with few samples through filtering preprocessing in the frequency domain. In addition, the filter that best improves classification accuracy on the given data is found through various experiments.

The technical problem to be solved by the present invention is to provide a method and apparatus for creating a new image by applying a Butterworth reject filter to a specific image in the frequency domain. In this way, a method and apparatus are proposed that prevent overfitting by resolving the imbalance of the existing data.

In one aspect, the proposed method for data augmentation in the frequency domain for improving deep learning performance includes performing filtering in the frequency domain by applying a Fourier transform to remove bias in the data, generating a new image using the filtered image, and performing training based on the generated image.

In the step of performing filtering in the frequency domain by applying a Fourier transform to remove bias in the data, preprocessing is performed to obtain a filtered image by pixel-by-pixel multiplication of the Fourier-transformed image with a predetermined filter.

In the step of performing filtering in the frequency domain by applying a Fourier transform to remove bias in the data, a two-dimensional (2D) discrete Fourier transform is applied to the image, the center of the transformed image is shifted, and pixel-by-pixel multiplication is performed between the transformed image and the filter in the frequency domain.

In the step of generating a new image using the filtered image, a 2D inverse discrete Fourier transform is applied to convert the image filtered in the frequency domain back to the spatial domain, and the center of the image converted to the spatial domain is shifted.

In another aspect, the proposed apparatus for data augmentation in the frequency domain for improving deep learning performance includes a filtering unit that performs filtering in the frequency domain by applying a Fourier transform to remove bias in the data, an image generation unit that generates a new image using the filtered image, and a learning unit that performs training based on the generated image.

According to embodiments of the present invention, new images can be obtained for classes with few samples through filtering preprocessing in the frequency domain. This in turn resolves the imbalance of the existing data and prevents overfitting.

FIG. 1 is a flowchart illustrating a method for data augmentation in the frequency domain for improving deep learning performance according to an embodiment of the present invention.
FIG. 2 is a diagram showing a filtered image according to an embodiment of the present invention.
FIG. 3 is a diagram showing the configuration of an apparatus for data augmentation in the frequency domain for improving deep learning performance according to an embodiment of the present invention.
FIG. 4 is a diagram showing skin cancer examples and the data distribution used for an experiment according to an embodiment of the present invention.
FIG. 5 is a diagram showing training results for data to which an augmentation method according to an embodiment of the present invention is applied.
FIG. 6 is a diagram showing the per-class error rate on test data according to an embodiment of the present invention.
FIG. 7 is a diagram comparing training results obtained with an augmentation technique according to an embodiment of the present invention with training results according to the prior art.
FIG. 8 is a diagram comparing training results obtained with an augmentation technique according to another embodiment of the present invention with training results according to the prior art.
FIG. 9 is a diagram showing the data distributions for the data augmentation ratios according to an embodiment of the present invention.
FIG. 10 is a diagram showing the results of applying frequency-domain filtering and data augmentation according to an embodiment of the present invention.

The present invention proposes a method of creating a new image by applying a Butterworth reject filter to a specific image in the frequency domain. This resolves the imbalance of the existing data and can thereby prevent overfitting. On the assumption that medical images are grayscale, each image was converted to grayscale before the Fourier transform and the frequency-domain filtering were applied. A fine-tuned VGG16 was used for the experiments. As a result, applying the two methods together (frequency-domain filtering and conventional data augmentation) reduced the error rate of the smallest class, which had only 95 samples, by about 20%. Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a flowchart illustrating a method for data augmentation in the frequency domain for improving deep learning performance according to an embodiment of the present invention.

The proposed method includes a step 110 of performing filtering in the frequency domain by applying a Fourier transform to remove bias in the data, a step 120 of generating a new image using the filtered image, and a step 130 of performing training based on the generated image.

In step 110, filtering in the frequency domain is performed by applying a Fourier transform to remove bias in the data. Preprocessing is performed to obtain a filtered image by pixel-by-pixel multiplication of the Fourier-transformed image with a predetermined filter. A 2D discrete Fourier transform is applied to the image, the center of the transformed image is shifted, and pixel-by-pixel multiplication is performed between the transformed image and the filter in the frequency domain.

In step 120, a new image is generated using the filtered image. A 2D inverse discrete Fourier transform is applied to convert the image filtered in the frequency domain back to the spatial domain, and the center of the image converted to the spatial domain is shifted.

In step 130, training is performed based on the generated image.

FIG. 2 is a diagram showing a filtered image according to an embodiment of the present invention.

Frequency-domain filtering is a preprocessing step that obtains a filtered image by multiplying a Fourier-transformed image pixel by pixel with a specific filter. The proposed preprocessing procedure is as follows.

First, a 2D discrete Fourier transform is applied to the image 210. The transformed image is then center-shifted. Pixel-by-pixel multiplication is performed between the transformed image 220 and the filter 230 in the frequency domain. A 2D inverse discrete Fourier transform is then applied to return to the spatial domain. Finally, the center of the image 240 returned to the spatial domain is shifted.
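The five steps above can be sketched with NumPy as follows. This is a minimal illustration, not the patent's implementation: the function name `filter_in_frequency_domain` is hypothetical, and the sketch undoes the center shift with `np.fft.ifftshift` before the inverse transform, which is the usual NumPy idiom and is assumed here to be equivalent in effect to re-centering the image after the inverse transform.

```python
import numpy as np


def filter_in_frequency_domain(image: np.ndarray, freq_filter: np.ndarray) -> np.ndarray:
    """Filter a 2D grayscale image by multiplying its centered spectrum with a filter.

    `freq_filter` is assumed to be defined on the centered spectrum
    (zero frequency in the middle) and to have the same shape as `image`.
    """
    if image.ndim != 2 or image.shape != freq_filter.shape:
        raise ValueError("image and freq_filter must be 2D arrays of the same shape")

    spectrum = np.fft.fft2(image)            # step 1: 2D discrete Fourier transform
    centered = np.fft.fftshift(spectrum)     # step 2: move the zero frequency to the center
    filtered = centered * freq_filter        # step 3: pixel-by-pixel multiplication
    unshifted = np.fft.ifftshift(filtered)   # undo the center shift (NumPy convention)
    restored = np.fft.ifft2(unshifted)       # step 4: 2D inverse discrete Fourier transform
    return np.real(restored)                 # drop the negligible imaginary residue
```

With an all-ones filter the function returns the input unchanged, which is a convenient sanity check before plugging in a real filter.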

FIG. 2 shows an image filtered by applying a Butterworth reject filter with width = 32, range = 40, and n = 10.
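The patent does not give the filter formula, so the following is only a sketch under stated assumptions. It uses the textbook Butterworth band-reject form H = 1 / (1 + (D·W / (D² − D0²))^(2n)), and assumes that `range` is the radius D0 of the rejected ring, `width` is its bandwidth W, and `n` is the filter order. The function name and this parameter mapping are hypothetical.

```python
import numpy as np


def butterworth_band_reject(shape, center_radius, width, order):
    """Build a centered Butterworth band-reject filter (textbook form, assumed).

    center_radius: assumed to correspond to the patent's "range" (D0).
    width:         assumed to correspond to the patent's "width" (W).
    order:         assumed to correspond to the patent's "n".
    """
    rows, cols = shape
    v = np.arange(rows, dtype=float) - rows // 2    # vertical offset from the center
    u = np.arange(cols, dtype=float) - cols // 2    # horizontal offset from the center
    dist_sq = v[:, None] ** 2 + u[None, :] ** 2     # squared distance D^2 from the center
    dist = np.sqrt(dist_sq)
    denom = dist_sq - float(center_radius) ** 2     # D^2 - D0^2

    response = np.zeros(shape, dtype=float)         # points exactly on the ring stay at 0
    off_ring = denom != 0
    ratio = dist[off_ring] * width / denom[off_ring]
    response[off_ring] = 1.0 / (1.0 + ratio ** (2 * order))
    return response


# Parameters quoted for FIG. 2, applied to a hypothetical 128 x 128 image.
example_filter = butterworth_band_reject((128, 128), center_radius=40, width=32, order=10)
```

The response is 1 at the DC term, 0 exactly on the ring of radius D0, and approaches 1 again far from the ring; a higher order makes the transition sharper.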

Among the ideal, Gaussian, and Butterworth reject filters, the Butterworth reject filter was applied here. Comparing the original image and the filtered image in FIG. 2, there is little difference visible to the human eye. However, the pixel values changed slightly overall.

FIG. 3 is a diagram showing the configuration of an apparatus for data augmentation in the frequency domain for improving deep learning performance according to an embodiment of the present invention.

The proposed apparatus includes a filtering unit 310, an image generation unit 320, and a learning unit 330.

The filtering unit 310 performs filtering in the frequency domain by applying a Fourier transform to remove bias in the data. Preprocessing is performed to obtain a filtered image by pixel-by-pixel multiplication of the Fourier-transformed image with a predetermined filter. A 2D discrete Fourier transform is applied to the image, the center of the transformed image is shifted, and pixel-by-pixel multiplication is performed between the transformed image and the filter in the frequency domain.

The image generation unit 320 generates a new image using the filtered image. A 2D inverse discrete Fourier transform is applied to convert the image filtered in the frequency domain back to the spatial domain, and the center of the image converted to the spatial domain is shifted.

The learning unit 330 performs training based on the generated image.

FIG. 4 is a diagram showing skin cancer examples and the data distribution used for an experiment according to an embodiment of the present invention.

The experimental procedure is as follows. As fixed experimental settings, training was run for 200 epochs and Adam was chosen as the optimizer. First, classification performance is measured on the original data and on data to which the conventional data augmentation method is applied. Next, using the filtering preprocessing described above, experiments are run with various filters and the filter with the highest classification performance is selected. Then, the filter with the highest classification performance is likewise selected when the conventional data augmentation method and the proposed method are applied together. Finally, the classification performance on data obtained with the selected filters is compared with that of the conventional data augmentation technique.

The data were collected from the big-data site Kaggle and were chosen because the seven types of skin cancer have different sample counts, which makes the data set suitable for this experiment. Examples of the skin cancers and the data distribution are shown in FIG. 4.

FIG. 4(a) shows five sampled images for each of the seven types of skin cancer. FIG. 4(b) shows the number of samples for each type. From left to right: Melanocytic nevi (nv): 5,498; Benign keratosis-like lesions (bkl): 907; Melanoma (mel): 892; Basal cell carcinoma (bcc): 430; Actinic keratoses (akiec): 262; Vascular lesions (vasc): 117; Dermatofibroma (df): 95.
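To make the degree of imbalance concrete, the short script below computes each class's share of the data set and how many times larger the largest class (nv) is. The counts are those listed for FIG. 4(b); the variable names are illustrative only.

```python
# Per-class sample counts as listed for FIG. 4(b).
counts = {
    "nv": 5498,
    "bkl": 907,
    "mel": 892,
    "bcc": 430,
    "akiec": 262,
    "vasc": 117,
    "df": 95,
}

total = sum(counts.values())
largest = max(counts.values())

for name, n in counts.items():
    share = 100.0 * n / total      # percentage of all samples
    ratio = largest / n            # how many times larger the biggest class is
    print(f"{name:>5}: {n:5d} samples, {share:5.1f}% of total, largest class is {ratio:5.1f}x bigger")
```

The nv class alone accounts for roughly two thirds of the 8,201 samples, and it is nearly 58 times larger than the smallest class (df).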

FIG. 5 is a diagram showing training results for data to which an augmentation method according to an embodiment of the present invention is applied.

First, the training results are checked for the unmodified data and for data to which the conventional data augmentation method is applied. For data augmentation, at each training pass the input image was rotated by 0 to 10 degrees, zoomed in or out by up to 10% about its center, or shifted horizontally or vertically by 0.1 times the image size in pixels.
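The augmentation ranges above can be expressed as a small parameter sampler. This sketch only draws the parameters and does not transform images. The names `AugmentParams` and `sample_augment_params` are hypothetical, and drawing all parameters at once (rather than picking one transform per image) and allowing shifts in both directions are assumptions, since the text does not say how the transforms are combined.

```python
import random
from dataclasses import dataclass


@dataclass
class AugmentParams:
    rotation_deg: float   # rotation angle in degrees
    zoom: float           # scale factor about the image center
    shift_x_px: float     # horizontal shift in pixels
    shift_y_px: float     # vertical shift in pixels


def sample_augment_params(width: int, height: int, rng: random.Random) -> AugmentParams:
    """Draw one set of augmentation parameters within the ranges stated in the text."""
    return AugmentParams(
        rotation_deg=rng.uniform(0.0, 10.0),          # 0 to 10 degrees
        zoom=rng.uniform(0.9, 1.1),                   # zoom in or out by up to 10%
        shift_x_px=rng.uniform(-0.1, 0.1) * width,    # up to 0.1 x image width (sign assumed)
        shift_y_px=rng.uniform(-0.1, 0.1) * height,   # up to 0.1 x image height (sign assumed)
    )


params = sample_augment_params(224, 224, random.Random(0))
```

Passing a seeded `random.Random` keeps the sampled parameters reproducible across runs.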

The results are shown in FIG. 5. FIG. 5(a) shows the accuracy for the original data, and FIG. 5(b) shows the change in loss for the original data. Here, the training accuracy is 97%, the validation accuracy is 72%, and the test accuracy is 73.6%.

FIG. 5(c) shows the accuracy for data obtained by applying the proposed augmentation method to the original data, and FIG. 5(d) shows the change in loss for the same data. Here, the training accuracy is 74%, the validation accuracy is 71%, and the test accuracy is 74.2%.

Looking at the accuracy and loss trends for the original data, the training accuracy converges toward 100% and the training loss toward 0. For the validation data, however, the accuracy fails to converge and fluctuates around 72%, and the loss starts increasing again at around epoch 20. In contrast, when the conventional data augmentation method is applied, the training accuracy converges to about 85% and the training loss to around 0.3. Judged by training performance alone, this might appear worse than the original data, but the validation loss shows that the rate at which the loss rises again is lower than for the original data. This indicates that the bias of the data has decreased compared with the original data.

FIG. 6 is a diagram showing the per-class error rate on test data according to an embodiment of the present invention.

Referring to FIG. 6, compared with the original data 610, the error rates for the bcc, df, and nv classes decreased for the data 620 to which the augmentation method was applied. For the bkl, mel, and vasc classes, however, the error rate instead increased or did not change significantly. This suggests that the conventional data augmentation method is incomplete and that a new data augmentation method is needed to complement it.

FIG. 7 is a diagram comparing training results obtained with an augmentation technique according to an embodiment of the present invention with training results according to the prior art.

FIG. 7 shows, from top to bottom, the trend of the validation loss after augmenting the data at a ratio of 4:1 using Butterworth reject filters with n = 2, 5, and 10. For each n, the left three columns show the results of training after filtering the original data with the filtering range set to 20, 30, and 40 and a suitable bandwidth (width) value chosen for each range. The right three columns show the results of training with both the filtering and the conventional data augmentation technique applied. The results are summarized in Table 1 below.

<Table 1>

FIG. 8 is a diagram comparing training results obtained with an augmentation technique according to another embodiment of the present invention with training results according to the prior art.

FIG. 8 shows, from top to bottom, the trend of the validation loss after augmenting the data at a ratio of 2:1 using Butterworth reject filters with n = 2, 5, and 10. For each n, the left three columns show the results of training after filtering the original data with the filtering range set to 20, 30, and 40 and a suitable bandwidth (width) value chosen for each range. The right three columns show the results of training with both the filtering and the conventional data augmentation technique applied. The results are summarized in Table 2 below.

<Table 2>

The following describes how filtering is applied when the data augmentation technique based on frequency-domain filtering is used with augmentation ratios of 4:1 and 2:1.

FIG. 9 is a diagram showing the data distributions for the data augmentation ratios according to an embodiment of the present invention.

FIG. 9(a) shows the distribution when data augmentation is performed at a ratio of 4:1, and FIG. 9(b) shows the distribution when data augmentation is performed at a ratio of 2:1. The results are shown in FIG. 7.
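The text does not define what the 4:1 and 2:1 ratios refer to. One possible reading, used here purely as an assumption, is that each class is filled with filtered images until the largest class is at most `ratio` times larger than it. Under that reading, the number of images to generate per class can be sketched as follows; the function name `filtered_images_needed` is hypothetical and the counts are those listed for FIG. 4(b).

```python
import math


def filtered_images_needed(counts: dict, ratio: float) -> dict:
    """Return how many new images each class needs so that largest:class <= ratio:1.

    This interprets the augmentation ratio as a cap on class imbalance,
    which is an assumption and not stated in the source text.
    """
    largest = max(counts.values())
    target = math.ceil(largest / ratio)     # minimum class size under the cap
    return {name: max(0, target - n) for name, n in counts.items()}


counts = {"nv": 5498, "bkl": 907, "mel": 892, "bcc": 430,
          "akiec": 262, "vasc": 117, "df": 95}

needed_4_to_1 = filtered_images_needed(counts, 4)
needed_2_to_1 = filtered_images_needed(counts, 2)
```

If the ratio instead denotes, for example, the proportion of original to generated images, the arithmetic would differ; the sketch only shows how one such definition turns into per-class counts.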

In all graphs, the validation loss converges more stably as the size of the inner pass region of the filter increases. When a Fourier transform is applied to an image and the result is center-shifted, the image's information is gathered toward the center; the center of the transformed image is called the DC term. Accordingly, the closer to the center of the transformed image, the more image information is present. The inner pass filter is a pass filter that forms a circle around the center of the image, so a larger bandwidth (width) preserves more of the image's information. This is why filters with a larger inner pass region make the validation loss converge better than filters with a smaller one.
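The role of the center can be checked numerically. The sketch below, which uses a random test image and a hypothetical helper `energy_fraction_within`, confirms that after `np.fft.fftshift` the central element of the spectrum equals the sum of all pixel values (the DC term), and measures how much spectral energy falls within a given radius of the center.

```python
import numpy as np


def energy_fraction_within(centered_spectrum: np.ndarray, radius: float) -> float:
    """Fraction of total spectral energy within `radius` of the spectrum center."""
    rows, cols = centered_spectrum.shape
    v = np.arange(rows) - rows // 2
    u = np.arange(cols) - cols // 2
    dist = np.hypot(v[:, None], u[None, :])       # distance of each bin from the center
    power = np.abs(centered_spectrum) ** 2
    return float(power[dist <= radius].sum() / power.sum())


rng = np.random.default_rng(0)
image = rng.random((64, 64))                      # stand-in for a grayscale image

centered = np.fft.fftshift(np.fft.fft2(image))
dc_term = centered[64 // 2, 64 // 2]              # zero-frequency bin after the shift

small_disk = energy_fraction_within(centered, 5)
large_disk = energy_fraction_within(centered, 20)
```

Widening the disk can only add energy, so `large_disk` is never smaller than `small_disk`; for natural images, which are dominated by low frequencies, most of the energy is typically already captured by a small disk.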

For the same reason, the larger the size of the reject filter, the less of the image's information it blocks, so the validation loss converges better as the size of the reject filter increases.

However, the training accuracy is still almost 100%, while the validation accuracy generally lies between 87% and 91%. This means that the bias of the data has still not been removed.

Finally, the result of applying the conventional data augmentation method together with the proposed method is checked for the same augmentation ratios. Compared with applying only the conventional data augmentation method, the training and validation accuracy increased by about 10%, and the loss converges near 0.6 rather than near 0.9.

For the data augmented at a ratio of 2:1 as well, the training and validation accuracy increased by about 10%, and the validation loss converges around 0.3 to 0.4.

FIG. 10 is a diagram showing the results of applying frequency-domain filtering and data augmentation according to an embodiment of the present invention.

FIG. 10(a) shows the result of applying frequency-domain filtering only, and FIG. 10(b) shows the result of applying frequency-domain filtering together with the conventional data augmentation method.

In the present invention, to remove bias in the data, a method of generating new images through filtering in the frequency domain, rather than through conventional data augmentation, was presented, and training with VGG16 on that basis was examined. In particular, the present invention proposes adding new images obtained by frequency-domain filtering before applying the conventional data augmentation method. Although no significant effect was observed for some skin cancer classes, the error rate decreased by about 20% for the class with the fewest samples. This suggests that even for small data sets, better performance can be obtained by training with a different data augmentation method. Training was performed on grayscale images in the present invention, but training on color images is also possible.

이상에서 설명된 장치는 하드웨어 구성요소, 소프트웨어 구성요소, 및/또는 하드웨어 구성요소 및 소프트웨어 구성요소의 조합으로 구현될 수 있다. 예를 들어, 실시예들에서 설명된 장치 및 구성요소는, 예를 들어, 프로세서, 콘트롤러, ALU(arithmetic logic unit), 디지털 신호 프로세서(digital signal processor), 마이크로컴퓨터, FPA(field programmable array), PLU(programmable logic unit), 마이크로프로세서, 또는 명령(instruction)을 실행하고 응답할 수 있는 다른 어떠한 장치와 같이, 하나 이상의 범용 컴퓨터 또는 특수 목적 컴퓨터를 이용하여 구현될 수 있다. 처리 장치는 운영 체제(OS) 및 상기 운영 체제 상에서 수행되는 하나 이상의 소프트웨어 애플리케이션을 수행할 수 있다.　 또한, 처리 장치는 소프트웨어의 실행에 응답하여, 데이터를 접근, 저장, 조작, 처리 및 생성할 수도 있다.　 이해의 편의를 위하여, 처리 장치는 하나가 사용되는 것으로 설명된 경우도 있지만, 해당 기술분야에서 통상의 지식을 가진 자는, 처리 장치가 복수 개의 처리 요소(processing element) 및/또는 복수 유형의 처리 요소를 포함할 수 있음을 알 수 있다.　 예를 들어, 처리 장치는 복수 개의 프로세서 또는 하나의 프로세서 및 하나의 콘트롤러를 포함할 수 있다.　 또한, 병렬 프로세서(parallel processor)와 같은, 다른 처리 구성(processing configuration)도 가능하다.The devices described above may be implemented as hardware components, software components, and/or a combination of hardware components and software components. For example, devices and components described in the embodiments may include, for example, a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), It may be implemented using one or more general purpose or special purpose computers, such as a programmable logic unit (PLU), microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications running on the operating system. A processing device may also access, store, manipulate, process, and generate data in response to execution of software. For convenience of understanding, there are cases in which one processing device is used, but those skilled in the art will understand that the processing device includes a plurality of processing elements and/or a plurality of types of processing elements. It can be seen that it can include. 
For example, a processing device may include a plurality of processors or a processor and a controller. Other processing configurations are also possible, such as parallel processors.

소프트웨어는 컴퓨터 프로그램(computer program), 코드(code), 명령(instruction), 또는 이들 중 하나 이상의 조합을 포함할 수 있으며, 원하는 대로 동작하도록 처리 장치를 구성하거나 독립적으로 또는 결합적으로(collectively) 처리 장치를 명령할 수 있다.　 소프트웨어 및/또는 데이터는, 처리 장치에 의하여 해석되거나 처리 장치에 명령 또는 데이터를 제공하기 위하여, 어떤 유형의 기계, 구성요소(component), 물리적 장치, 가상 장치(virtual equipment), 컴퓨터 저장 매체 또는 장치에 구체화(embody)될 수 있다.　 소프트웨어는 네트워크로 연결된 컴퓨터 시스템 상에 분산되어서, 분산된 방법으로 저장되거나 실행될 수도 있다. 소프트웨어 및 데이터는 하나 이상의 컴퓨터 판독 가능 기록 매체에 저장될 수 있다.Software may include a computer program, code, instructions, or a combination of one or more of the foregoing, which configures a processing device to operate as desired or processes independently or collectively. You can command the device. Software and/or data may be any tangible machine, component, physical device, virtual equipment, computer storage medium or device, intended to be interpreted by or provide instructions or data to a processing device. can be embodied in Software may be distributed on networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer readable media.

The method according to the embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

Although the embodiments have been described above with reference to limited examples and drawings, those of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the following claims.

Claims

A data augmentation method comprising:
performing filtering in the frequency domain by applying a Fourier transform to remove bias in data;
generating a new image using the filtered image; and
performing learning based on the generated image,
wherein the performing of filtering in the frequency domain by applying a Fourier transform to remove bias in data comprises,
in order to prevent overfitting by resolving the data imbalance problem for images, shifting the center of the image to which the 2D Fourier transform has been applied in frequency space, augmenting the data using a Butterworth cut-off filter with a predetermined filtering range and a bandwidth for that range applied to the center-shifted image, and then performing preprocessing to obtain a filtered image through pixel-by-pixel multiplication.
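
As an illustration only, the filtering step recited above could be sketched with NumPy as follows. The band-reject form of the Butterworth filter, the function names, the restriction to a single-channel 2D image, and the parameters `d0` (radius of the band centre), `w` (bandwidth) and `order` are all assumptions made for this sketch; the claim itself only recites a predetermined filtering range and a bandwidth for that range.

```python
import numpy as np


def butterworth_band_reject(shape, d0, w, order=2):
    """Butterworth band-reject mask centred on the zero frequency.

    d0, w and order are hypothetical parameters chosen for illustration.
    """
    rows, cols = shape
    # Distance of every frequency sample from the centre of the shifted spectrum.
    v = np.arange(rows) - rows // 2
    u = np.arange(cols) - cols // 2
    d = np.hypot(*np.meshgrid(v, u, indexing="ij"))

    denom = d**2 - d0**2
    # Where d == d0 the ratio is infinite, which gives a gain of exactly 0.
    ratio = np.full(shape, np.inf)
    np.divide(d * w, denom, out=ratio, where=denom != 0)
    return 1.0 / (1.0 + ratio ** (2 * order))


def filter_in_frequency_domain(image, d0, w, order=2):
    """2D FFT, centre shift, then pixel-by-pixel multiplication with the mask."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    mask = butterworth_band_reject(image.shape, d0, w, order)
    return spectrum * mask
```

Varying `d0` and `w` would yield differently filtered versions of the same input, which is one plausible way to read the augmentation described in the claim.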

(deleted)

The data augmentation method according to claim 1,
wherein the generating of a new image using the filtered image comprises
applying a two-dimensional inverse discrete Fourier transform to convert the image filtered in the frequency domain back to the spatial domain, and shifting the center of the image transformed to the spatial domain.
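
A minimal sketch of the conversion back to the spatial domain is given below, again with NumPy. It assumes the conventional ordering in which the centre shift is undone on the spectrum before the 2D inverse DFT is applied, whereas the claim wording lists the shift after the inverse transform; the function name `to_spatial_domain` is hypothetical.

```python
import numpy as np


def to_spatial_domain(filtered_spectrum):
    """Convert a centre-shifted, filtered spectrum back to a real image.

    Assumption: the shift is undone first (ifftshift), then the 2D inverse
    DFT is applied. Any small imaginary residue is discarded.
    """
    unshifted = np.fft.ifftshift(filtered_spectrum)
    return np.real(np.fft.ifft2(unshifted))


# Round trip without any filtering: the original image is recovered.
rng = np.random.default_rng(0)
image = rng.random((6, 7))
spectrum = np.fft.fftshift(np.fft.fft2(image))
restored = to_spatial_domain(spectrum)
```

With a filtered spectrum as input, the returned array would be the new image that is passed on to the learning step.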

A data augmentation device comprising:
a filtering unit that performs filtering in the frequency domain by applying a Fourier transform to remove bias in data;
an image generator that generates a new image using the filtered image; and
a learning unit that performs learning based on the generated image,
wherein the filtering unit,
in order to prevent overfitting by resolving the data imbalance problem for images, shifts the center of the image to which the 2D Fourier transform has been applied in frequency space, augments the data using a Butterworth cut-off filter with a predetermined filtering range and a bandwidth for that range applied to the center-shifted image, and then performs preprocessing to obtain a filtered image through pixel-by-pixel multiplication.

(deleted)

The data augmentation device according to claim 5,
wherein the image generator
applies a two-dimensional inverse discrete Fourier transform to convert the image filtered in the frequency domain back to the spatial domain, and shifts the center of the image transformed to the spatial domain.