KR102615055B1

KR102615055B1 - Adversarial example restoration system and adversarial example restoration method

Info

Publication number: KR102615055B1
Application number: KR1020220096483A
Authority: KR
Inventors: 정민영; 정승환; 신영길
Original assignee: 숭실대학교 산학협력단; 서울대학교 산학협력단
Priority date: 2022-08-03
Filing date: 2022-08-03
Publication date: 2023-12-19
Also published as: WO2024029669A1

Abstract

An adversarial image restoration system according to one aspect of the disclosed invention may include: an image reception module configured to receive an adversarially perturbed image generated by adding noise to an original image; a noise removal module configured to generate a denoised image by removing noise from the adversarially perturbed image using a first artificial intelligence model; and a Fourier-based machine learning module configured to train the first artificial intelligence model through machine learning, based on a Fourier spectrum output value corresponding to the original image and a Fourier spectrum output value corresponding to the denoised image.

Description

Adversarial image restoration system and adversarial image restoration method {ADVERSARIAL EXAMPLE RESTORATION SYSTEM AND ADVERSARIAL EXAMPLE RESTORATION METHOD}

The present invention relates to an adversarial image restoration system and an adversarial image restoration method capable of restoring an adversarially perturbed image used in an adversarial attack to a normal image.

An adversarially perturbed image is an image generated by adding an adversarial perturbation, imperceptible to humans, to an input image so that a deep neural network (DNN) for image classification misclassifies it as a class other than its original class. It is also called an adversarial example. Methods for defending deep neural networks against such adversarial examples include Gradient Obfuscation, Robust Optimization, and Adversarial Example Detection.

Gradient Obfuscation defends a deep learning model against gradient-based adversarial example generation by training the model so that its gradients are difficult to compute. However, this approach remains vulnerable to adversarial example generation methods.

Robust Optimization trains a deep learning model using adversarial examples alongside the regular training data. It performs well against adversarial attacks, but it lowers classification accuracy on non-adversarial images.

Adversarial Example Detection analyzes non-adversarial and adversarial example images and classifies an input as one or the other. Approaches include classifying RGB input images directly as non-adversarial or adversarial, and classifying them by analyzing statistical properties, for example with principal component analysis. However, images classified as adversarial examples are discarded rather than used, which wastes data.

The present invention aims to provide an adversarial image restoration system and an adversarial image restoration method that, when restoring an adversarial image, remove noise so that the result differs less from the original image than with conventional methods, so that input images can be classified more accurately than with conventional image classification methods.

The present invention also aims to provide an adversarial image restoration system and an adversarial image restoration method that can increase the accuracy of adversarial image detection by performing the detection based on an image restored by removing noise from the adversarial image.

Finally, the present invention aims to provide an adversarial image restoration system and an adversarial image restoration method that, instead of discarding an input image determined to be adversarial, restore it to a denoised image based on the estimated noise so that it can be used to train an image classification deep learning model.

In addition, the Fourier-based machine learning module may include a Fourier transform unit configured to receive an intermediate layer output value, that is, the output of an intermediate layer generated during image classification, from an image classifier configured to classify an input image by a machine learning method that extracts features from the input image through a multi-stage hierarchical structure.

In addition, the image classifier may be configured to generate a first intermediate layer output value while extracting features from the input original image by the machine learning method, and to generate a second intermediate layer output value while extracting features from the input denoised image by the machine learning method. The Fourier transform unit may be configured to: Fourier transform the first intermediate layer output value to generate a first Fourier spectrum output value representing the frequency domain; and Fourier transform the second intermediate layer output value to generate a second Fourier spectrum output value representing the frequency domain.

In addition, the Fourier-based machine learning module may include a machine learning unit configured to train the first artificial intelligence model through machine learning based on the first Fourier spectrum output value and the second Fourier spectrum output value.

In addition, the machine learning unit may be configured to: compute a loss function based on the first Fourier spectrum output value of the original image and the second Fourier spectrum output value of the denoised image corresponding to that original image; and train the first artificial intelligence model so that the loss function decreases as training is repeated.
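The loss computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the loss defined by the patent: the text only says the loss is computed from the two Fourier spectrum output values, so the mean absolute difference of magnitude spectra, the function names fourier_spectrum and spectrum_loss, and the random arrays standing in for intermediate layer outputs are all hypothetical.

```python
import numpy as np


def fourier_spectrum(feature_map: np.ndarray) -> np.ndarray:
    """Magnitude of the 2-D FFT taken over the last two (spatial) axes."""
    return np.abs(np.fft.fft2(feature_map, axes=(-2, -1)))


def spectrum_loss(first_features: np.ndarray, second_features: np.ndarray) -> float:
    """Assumed loss form: mean absolute difference between the two spectra."""
    first_spectrum = fourier_spectrum(first_features)    # from the original image
    second_spectrum = fourier_spectrum(second_features)  # from the denoised image
    return float(np.mean(np.abs(first_spectrum - second_spectrum)))


rng = np.random.default_rng(0)
# Stand-ins for intermediate layer outputs, shaped (channels, height, width).
original_features = rng.normal(size=(4, 8, 8))
leftover_noise = rng.normal(size=(4, 8, 8))

# A denoiser that leaves noise behind yields a positive loss;
# a denoiser that reproduces the original features yields zero loss.
loss_noisy = spectrum_loss(original_features, original_features + leftover_noise)
loss_restored = spectrum_loss(original_features, original_features)
print(loss_noisy, loss_restored)
```

In a real training loop the same quantity would be computed in a framework with automatic differentiation so that the first artificial intelligence model can be updated to reduce it.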

In addition, the system may further include an adversarial example classification module configured to determine whether an image under inspection is an adversarially perturbed image.

In addition, the image reception module may be configured to receive the image under inspection; the noise removal module may be configured to generate a denoised image under inspection from the image under inspection using the pre-trained first artificial intelligence model; and the adversarial example classification module may be configured to determine, using a second artificial intelligence model, whether the image under inspection is an adversarially perturbed image, based on a composite image generated by combining the image under inspection with the denoised image under inspection.

In addition, the adversarial example classification module may: if the image under inspection is determined to be an adversarially perturbed image, pass the denoised image under inspection to the image classifier so that the denoised image is classified; and if the image under inspection is determined to be a normal image, pass the image under inspection itself to the image classifier so that it is classified.
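The inspection and routing flow described above can be sketched as follows. This is an illustrative sketch only: the function classify_with_restoration and the three toy stand-ins are hypothetical, and combining the two images by concatenation is an assumption, since the text does not say how the composite image is formed.

```python
from typing import Callable, List

Image = List[float]  # flattened pixel values; a stand-in for a real image type


def classify_with_restoration(
    image: Image,
    denoise: Callable[[Image], Image],          # role of the first AI model
    is_adversarial: Callable[[Image], bool],    # role of the second AI model
    classify: Callable[[Image], str],           # role of the image classifier
) -> str:
    denoised = denoise(image)
    composite = image + denoised  # assumed composition: simple concatenation
    if is_adversarial(composite):
        return classify(denoised)  # adversarial: classify the restored image
    return classify(image)         # normal: classify the input as received


# Toy stand-ins so the sketch runs; none of these are real models.
def toy_denoise(image: Image) -> Image:
    return [float(round(pixel)) for pixel in image]


def toy_is_adversarial(composite: Image) -> bool:
    half = len(composite) // 2
    pairs = zip(composite[:half], composite[half:])
    return any(abs(before - after) > 0.05 for before, after in pairs)


classified_inputs: List[Image] = []


def toy_classify(image: Image) -> str:
    classified_inputs.append(image)
    return "bright" if sum(image) / len(image) > 0.5 else "dark"


clean_label = classify_with_restoration(
    [0.0, 1.0, 1.0], toy_denoise, toy_is_adversarial, toy_classify)
perturbed_label = classify_with_restoration(
    [0.1, 0.9, 1.0], toy_denoise, toy_is_adversarial, toy_classify)
```

The point of the design is that no input is discarded: an image flagged as adversarial is still classified, just in its denoised form.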

An adversarial image restoration method according to one aspect of the disclosed invention is a method of operating an adversarial image restoration system and may include: receiving an adversarially perturbed image generated by adding noise to an original image; generating a denoised image by removing noise from the adversarially perturbed image using a first artificial intelligence model; and training the first artificial intelligence model through machine learning based on a Fourier spectrum output value corresponding to the original image and a Fourier spectrum output value corresponding to the denoised image. The training of the first artificial intelligence model may include: generating, by an image classifier configured to classify an input image by a machine learning method that extracts features from the input image through a multi-stage hierarchical structure, a first intermediate layer output value while extracting features from the input original image and a second intermediate layer output value while extracting features from the input denoised image; Fourier transforming the first intermediate layer output value to generate a first Fourier spectrum output value representing the frequency domain; Fourier transforming the second intermediate layer output value to generate a second Fourier spectrum output value representing the frequency domain; and training the first artificial intelligence model through machine learning based on the first Fourier spectrum output value and the second Fourier spectrum output value.

In addition, a computer program according to one aspect of the disclosed invention may be stored in a non-transitory computer-readable recording medium so as to execute the adversarial image restoration method.

According to one aspect of the disclosed invention, when an adversarial image is restored, noise is removed so that the result differs less from the original image than with conventional methods, so that input images can be classified more accurately than with conventional image classification methods.

In addition, according to an embodiment of the present invention, the accuracy of adversarial image detection can be increased by performing the detection based on an image restored by removing noise from the adversarial image.

In addition, instead of being discarded, an input image determined to be adversarial can be restored to a denoised image based on the estimated noise and used to train an image classification deep learning model.

FIG. 1 is a configuration diagram of an adversarial image restoration system according to an embodiment.
FIG. 2 is a diagram showing the modules to which the original image and the adversarially perturbed image are delivered, according to an embodiment.
FIG. 3 is a diagram illustrating the process of training an artificial intelligence model and the adversarial example classification process, according to an embodiment.
FIG. 4 is a diagram showing the noise (RGB) corresponding to an adversarial example image and the difference between Fourier spectrum output values (FFT(Layer)), according to an embodiment.
FIG. 5 is a flowchart of an adversarial image restoration method according to an embodiment.
FIG. 6 is a table showing the classification performance of an image classifier on denoised images, according to an embodiment.
FIG. 7 is a table showing the degree of improvement of the adversarial image restoration method according to an embodiment over conventional restoration methods.
FIG. 8 is a diagram showing the classification results of an image classifier on denoised images, according to an embodiment.
FIG. 9 is a graph showing the classification performance of an image classifier on denoised images, according to an embodiment.

Like reference numerals refer to like elements throughout the specification. This specification does not describe every element of the embodiments, and content that is general in the technical field to which the disclosed invention pertains, or that overlaps between embodiments, is omitted. A "unit" or "module" as used in the specification may be implemented in software or hardware. Depending on the embodiment, a plurality of units or modules may be implemented as a single component, or a single unit or module may include a plurality of components.

In addition, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. Terms such as "first" and "second" are used to distinguish one component from another, and the components are not limited by these terms. Singular expressions include the plural unless the context clearly indicates otherwise.

The identification codes attached to the steps are used for convenience of explanation and do not describe the order of the steps. The steps may be performed in an order different from the stated one unless the context clearly specifies a particular order. The operating principle and embodiments of the disclosed invention are described below with reference to the accompanying drawings.

FIG. 1 is a configuration diagram of an adversarial image restoration system according to an embodiment.

Referring to FIG. 1, the adversarial image restoration system 100 according to an embodiment of the present invention may include an image reception module 110, a noise removal module 120, an adversarial example classification module 130, a Fourier-based machine learning module 140, and a memory 150.

The adversarial image restoration system 100 may be a system configured to train a first artificial intelligence model 151 that is used to generate a denoised image 500 by removing noise from a received adversarially perturbed image 400. The adversarial image restoration system 100 according to an embodiment of the present invention may be provided in a separate image restoration device or in a server.

The adversarially perturbed image 400 is an image generated by adding an adversarial perturbation, that is, noise imperceptible to humans, to a normal original image 300 so that the deep neural network of the image classifier 200 that performs image classification misclassifies it as a class other than its original class.

Among techniques for analyzing or detecting images, some exploit the property that neighboring pixels in an image have similar values (for example, steganalysis-based detection). Such techniques perform poorly on an adversarially perturbed image 400, in which noise has been added to the image under detection. Specifically, these techniques treat a pixel as lying on a boundary if it differs strongly from its neighbors in two or more of the eight adjacent directions, so noise placed along the boundary of an object in the image can evade this detection.
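The neighbor-comparison rule described above can be sketched as follows. This is a minimal illustration, not the detector of any particular steganalysis tool: the function name is_boundary, the grayscale list-of-lists image format, and the threshold of 50 are assumptions. Only the rule of "two or more of the eight adjacent directions" comes from the text.

```python
from typing import List


def is_boundary(
    image: List[List[int]],
    row: int,
    col: int,
    threshold: int = 50,       # assumed: what counts as a "large" difference
    min_directions: int = 2,   # from the text: two or more directions
) -> bool:
    """Return True if enough of the 8 neighbours differ strongly from the pixel."""
    height, width = len(image), len(image[0])
    large_differences = 0
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the pixel itself
            n_row, n_col = row + d_row, col + d_col
            # Neighbours outside the image are ignored.
            if 0 <= n_row < height and 0 <= n_col < width:
                if abs(image[row][col] - image[n_row][n_col]) > threshold:
                    large_differences += 1
    return large_differences >= min_directions


# A dark region on the left and a bright region on the right.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

on_edge = is_boundary(image, 1, 1)    # three bright neighbours to the right
in_region = is_boundary(image, 1, 0)  # all neighbours are equally dark
print(on_edge, in_region)
```

Because the decision depends only on local pixel differences, small changes to the pixels along an object boundary can alter the outcome, which is the weakness the passage points to.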

An image modifier can generate an adversarially perturbed image 400 by adding noise (a perturbation) to a normal image. With an image generated this way, a detection technique that exploits the similarity of neighboring pixel values may fail to recognize object boundaries as boundaries. When an adversarially perturbed image 400 is input to the image classifier 200, classification performance may drop even for a high-performing classifier. In addition, a deliberate transfer attack using adversarially perturbed images 400 may be mounted against a specific artificial intelligence model.

An attacker generally has no information about the target artificial intelligence model, such as its network structure or training data set, so a direct attack is generally not possible. That is, a white-box attack is generally not possible. However, once an attacker succeeds in attacking one artificial intelligence model (Model A) with an adversarially perturbed image 400, the attacker can exploit transferability to succeed against another, similar model (Model B) with an adversarially perturbed image 400 as well. In other words, an adversarial example that fools one artificial intelligence model can also fool another.

Therefore, when an image to be classified is an adversarially perturbed image 400, it is desirable to restore it to a normal image before inputting it to the image classifier 200, which uses a detection technique that exploits the similarity of neighboring pixel values.

The image reception module 110 may receive the adversarially perturbed image 400. It may do so by receiving, from an input terminal, adversarially perturbed images 400 that a user entered through that terminal, but it is not limited to this. For example, the image reception module 110 may receive adversarially perturbed images 400 previously stored in the memory 150, or adversarially perturbed images 400 that a communication unit included in the adversarial image restoration system 100 received from a server. The image reception module 110 may pass the received adversarially perturbed image 400 to the noise removal module 120.

The adversarially perturbed image 400 may be an input image that is to be restored to a normal image, or it may be one of a plurality of training adversarial images input to train the first artificial intelligence model 151 and the second artificial intelligence model 152. In the latter case, a training adversarial image may be an adversarially perturbed image 400 generated by adding known noise to a known original image 300.

The noise removal module 120 may be configured to generate a denoised image 500 by removing noise from the adversarially perturbed image 400 using the first artificial intelligence model 151. The denoised image 500 is generated by the adversarial image restoration system 100 from the adversarially perturbed image 400, and it may be passed for classification not only to the image classifier 200 of the present invention but also to any deep learning model that classifies input images by machine learning.

The image classifier 200 may be configured to classify an input image by a machine learning method that extracts features from the input image through a multi-stage hierarchical structure. For example, the image classifier 200 may be a classifier that uses a convolutional neural network (CNN) structure with several stacked convolution layers, or a deep neural network (DNN) structure, to train a deep learning model that classifies images.

FIG. 2 is a diagram showing the modules to which the original image and the adversarially perturbed image are delivered, according to an embodiment.

FIG. 2 shows the paths along which the data or output values derived from the original image 300 and the adversarially perturbed image 400 are delivered to each module. For example, the adversarially perturbed image 400 may be passed to the noise removal module 120.

The image classifier 200 may receive the original image 300 corresponding to the adversarially perturbed image 400 input to the adversarial image restoration system 100. That is, the image classifier 200 may receive the original image 300, which is the training adversarial image without the added noise.

The image classifier 200 may be provided in the Fourier-based machine learning module (FFT module) 140. However, it does not have to be provided there; the image classifier 200 may even be provided separately, outside the adversarial image restoration system 100.

FIG. 3 is a diagram illustrating the process of training an artificial intelligence model and the adversarial example classification process according to an embodiment, and FIG. 4 is a diagram showing the noise (RGB) corresponding to an adversarial example image and the difference between Fourier spectrum output values (FFT(Layer)) according to an embodiment.

Referring to FIG. 3, the Fourier-based machine learning module 140 may be configured to train the first artificial intelligence model 151 through machine learning based on the Fourier spectrum output value corresponding to the original image 300 and the Fourier spectrum output value corresponding to the denoised image 500.

A Fourier spectrum output value may be an output image obtained by Fourier transforming an intermediate layer output value generated while the image classifier 200 classifies a received input image. That is, the Fourier spectrum output value corresponding to the original image 300 may be the Fourier transform of the intermediate layer output value generated while the image classifier 200 classifies the received original image 300. Likewise, the Fourier spectrum output value corresponding to the denoised image 500 may be the Fourier transform of the intermediate layer output value generated while the image classifier 200 classifies the denoised image 500 received from the adversarial image restoration system 100.

The Fourier-based machine learning module (FFT-based consistency module) 140 may include a Fourier transform unit 141 and a machine learning unit 142. The Fourier transform unit 141 may receive from the image classifier 200 an intermediate layer output value, that is, the output of an intermediate layer generated during image classification.

The image classifier 200 may generate a first intermediate layer output value while extracting features from the input original image 300 by the machine learning method. For example, the first intermediate layer output value may be the output of one predetermined intermediate convolution layer, generated while the image classifier 200 classifies the original image 300 using a CNN structure with several stacked convolution layers or a DNN structure.

The image classifier 200 may generate a second intermediate layer output value while extracting features from the input denoised image 500 by machine learning. For example, the second intermediate layer output value may be the output of a predetermined intermediate convolutional layer, generated while the image classifier 200 classifies the denoised image 500 using a CNN or DNN structure in which multiple convolutional layers are stacked.

The Fourier transform unit 141 may Fourier transform the first intermediate layer output value to generate a first Fourier spectrum output value representing the frequency domain. The Fourier transform unit 141 may Fourier transform the second intermediate layer output value to generate a second Fourier spectrum output value representing the frequency domain. That is, the Fourier transform unit 141 may apply a fast Fourier transform (FFT) to the pixels of a received intermediate layer output value to generate a Fourier spectrum output value representing the frequency domain.

Referring to FIG. 4, one can see the difference between the Fourier spectrum output values (FFT(Layer)) generated by the Fourier transform unit 141 for an adversarial image produced by projected gradient descent (PGD), that is, an image generated by adding the noise (RGB) corresponding to the adversarial example. Here, the image labeled FFT shows the difference between the two images in the Fourier domain, the image labeled Layer shows the difference between the first intermediate layer output value and the second intermediate layer output value, and the image labeled FFT(Layer) may show the difference between the first Fourier spectrum output value and the second Fourier spectrum output value.

The first Fourier spectrum output value generated by the Fourier transform unit 141 may be an image representing the intensity of the frequency components that make up the first intermediate layer output value. Likewise, the second Fourier spectrum output value generated by the Fourier transform unit 141 may be an image representing the intensity of the frequency components that make up the second intermediate layer output value.

The first Fourier spectrum output value may be an image in which positions nearer the center correspond to lower frequencies of the first intermediate layer output value and positions nearer the periphery correspond to higher frequencies. Likewise, the second Fourier spectrum output value may be an image in which positions nearer the center correspond to lower frequencies of the second intermediate layer output value and positions nearer the periphery correspond to higher frequencies.
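The construction of such a spectrum image can be illustrated with a minimal sketch. It assumes a single-channel intermediate layer output stored as a list of rows, uses a direct (slow) 2D discrete Fourier transform in place of an FFT routine, and centers the zero-frequency component so that low frequencies sit in the middle of the image. The function names `dft2`, `center_spectrum`, and `centered_magnitude_spectrum` are hypothetical and do not appear in the specification.

```python
import cmath

def dft2(x):
    """Direct 2D discrete Fourier transform of a 2D list x[n][m] (row n, column m)."""
    rows, cols = len(x), len(x[0])
    out = [[0j] * cols for _ in range(rows)]
    for l in range(rows):          # vertical frequency index
        for k in range(cols):      # horizontal frequency index
            total = 0j
            for n in range(rows):
                for m in range(cols):
                    phase = -2j * cmath.pi * (k * m / cols + l * n / rows)
                    total += x[n][m] * cmath.exp(phase)
            out[l][k] = total
    return out

def center_spectrum(spectrum):
    """Cyclically shift the spectrum so the zero-frequency term moves to the center."""
    rows, cols = len(spectrum), len(spectrum[0])
    centered = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            centered[(r + rows // 2) % rows][(c + cols // 2) % cols] = spectrum[r][c]
    return centered

def centered_magnitude_spectrum(feature_map):
    """Magnitude (intensity) of each frequency component, low frequencies at the center."""
    magnitudes = [[abs(v) for v in row] for row in dft2(feature_map)]
    return center_spectrum(magnitudes)

# Toy input (assumed, not from the specification): a constant 4x4 feature map
# contains only a zero-frequency component, which ends up at the image center.
feature_map = [[1.0] * 4 for _ in range(4)]
spectrum = centered_magnitude_spectrum(feature_map)
```

A practical implementation would normally call an FFT library rather than the quadruple loop above; the loop is used here only to keep the sketch dependency-free.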

The operation by which the Fourier transform unit 141 applies a 2D Fourier transform to either the first or the second intermediate layer output value to generate the corresponding first or second Fourier spectrum output value is given by [Equation 1].

[Equation 1]

[Equation 1] shows how each pixel value is computed from pixel coordinates. Here, M may be the horizontal size of the intermediate layer output value, and N may be its vertical size. The pixel value at coordinates (k, l) of the Fourier spectrum output value may be computed according to [Equation 1] from the coordinates (k, l) in the Fourier spectrum output value and the value x(m, n) of each pixel of the intermediate layer output value.
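Assuming that [Equation 1] is the standard two-dimensional discrete Fourier transform, which is consistent with the symbols defined for it (M, N, the output coordinates (k, l), and the input pixel values x(m, n)), it can be written as follows. The symbol X(k, l) for the Fourier spectrum pixel value and the unnormalized form are assumptions made for illustration.

```latex
X(k, l) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} x(m, n)\,
          e^{-j 2\pi \left( \frac{k m}{M} + \frac{l n}{N} \right)},
\qquad 0 \le k \le M-1,\quad 0 \le l \le N-1
```

Under this form, the intensity shown in the spectrum image at (k, l) would correspond to the magnitude |X(k, l)|.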

Referring again to FIGS. 1 and 3, the noise removal module 120 may use the first artificial intelligence model 151 stored in the memory 150 to extract noise from the adversarially modified image 400 received from the image reception module 110 and to remove that noise from the adversarially modified image 400.

The noise removal module (denoising network) 120 may extract and remove noise by using the first artificial intelligence model 151, which has been trained in advance on feature data extracted from adversarially modified images 400. To learn how to extract features from the adversarially modified image 400, a convolutional neural network (CNN) structure in which multiple convolution layers are stacked may be used, and in particular a U-net may be used; however, the manner of extracting features from the adversarially modified image 400 is not limited thereto. The features of a particular adversarially modified image 400 may be information representing various characteristics of that image. For example, the features may be information about color, brightness, edges, and the like at each pixel of the adversarially modified image 400, but are not limited thereto.

For the noise removal module 120 to remove noise using the first artificial intelligence model 151, the first artificial intelligence model 151 needs to be trained in advance on a plurality of adversarial training images. The machine learning unit 142 may be configured to train the first artificial intelligence model 151 by machine learning based on the first Fourier spectrum output value and the second Fourier spectrum output value.

Machine learning may mean using a model composed of many parameters and optimizing those parameters with given data. Depending on the form of the learning problem, machine learning may include supervised learning, unsupervised learning, and reinforcement learning. Supervised learning learns a mapping between inputs and outputs and can be applied when input-output pairs are given as data. Unsupervised learning is applied when there are only inputs and no outputs, and can discover regularities among the inputs. However, machine learning according to an embodiment is not necessarily limited to the learning methods described above.

Specifically, the machine learning unit 142 may compute a loss function (f_denoising) based on the first Fourier spectrum output value of the original image 300 and the second Fourier spectrum output value of the denoised image 500 corresponding to that original image 300. The machine learning unit 142 may train the first artificial intelligence model 151 through iterative machine learning so that the loss function (f_denoising) decreases as training is repeated. The first artificial intelligence model 151 may be a deep learning model. The trained first artificial intelligence model 151 may be stored in the memory 150.

The loss function (f_denoising) may be the mean squared error (MSE) loss, as used in regression, computed between the first Fourier spectrum output value and the second Fourier spectrum output value. The machine learning unit 142 may be configured to train the first artificial intelligence model 151 so that the loss function (f_denoising) decreases as training is repeated. That is, the more training is repeated, the closer the second Fourier spectrum output value output by the machine learning unit 142 becomes to the first Fourier spectrum output value, which may mean that the denoised image 500 generated by the first artificial intelligence model 151 becomes closer to the original image 300.
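As a minimal sketch of this loss, assuming that both Fourier spectrum output values are real-valued magnitude images of the same size stored as lists of rows, the mean squared error can be computed as below. The function name `mse_loss` and the toy values are hypothetical and serve only to illustrate the calculation.

```python
def mse_loss(first_spectrum, second_spectrum):
    """Mean squared error between two equally sized 2D spectra (lists of rows)."""
    squared_diffs = [
        (a - b) ** 2
        for row_a, row_b in zip(first_spectrum, second_spectrum)
        for a, b in zip(row_a, row_b)
    ]
    return sum(squared_diffs) / len(squared_diffs)

# Toy spectra (assumed values): one for the original image, one for the denoised image.
first_spectrum = [[16.0, 0.0], [0.0, 0.0]]
second_spectrum = [[14.0, 1.0], [1.0, 0.0]]
loss = mse_loss(first_spectrum, second_spectrum)

# The loss vanishes when the two spectra coincide, which is the state training moves toward.
zero_loss = mse_loss(first_spectrum, first_spectrum)
```

In training, the gradient of this value with respect to the parameters of the denoising model would be used to update the model; that step depends on the deep learning framework and is omitted here.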

In this way, the adversarial image restoration system 100 may train the first artificial intelligence model 151 in advance and use it to restore the adversarially modified image 400 to a normal denoised image 500. The restored denoised image 500 may then be passed to the image classifier 200 and classified in place of the adversarially modified image 400 from before restoration.

However, when it is not known whether an image under inspection is an adversarially modified image 400 or a normal original image 300, it may be undesirable to pass the denoised image 500, generated by inputting that image into the adversarial image restoration system 100, directly to the image classifier 200. If the image under inspection is a normal original image 300, its denoised image 500 may differ to some extent from the original image 300. In that case, it is preferable that the normal original image 300 itself, rather than its denoised image 500, be input to the image classifier 200. It is therefore necessary to determine whether the image under inspection is an adversarially modified image 400 or a normal image, and to decide which image to pass to the image classifier 200 according to the result.

The image reception module 110 may receive an image under inspection and pass it to the noise removal module 120. The noise removal module 120 may use the pre-trained first artificial intelligence model 151 to generate a denoised image under inspection from the image under inspection. The noise removal module 120 may pass both the image under inspection and the denoised image under inspection to the adversarial example classification module 130.

The adversarial example classification module 130 may combine the image under inspection and the denoised image under inspection to generate a composite image. The composite image may be an image that contains all three RGB channels of the image under inspection and all three RGB channels of the denoised image under inspection.
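One way to form such a composite image, sketched here under the assumption that the two images are combined by channel concatenation, is to stack the three RGB planes of each image into a single six-channel image. The function name `make_composite` and the toy pixel values are hypothetical; the specification states only that the composite contains all six channels.

```python
def make_composite(test_image, denoised_image):
    """Stack two 3-channel images (each a list of R, G, B planes) into one 6-channel image."""
    if len(test_image) != 3 or len(denoised_image) != 3:
        raise ValueError("both images must have exactly three channels (R, G, B)")
    return list(test_image) + list(denoised_image)

# Toy 2x2 images (assumed values), each given as three channel planes.
test_image = [[[0.1, 0.2], [0.3, 0.4]] for _ in range(3)]
denoised_image = [[[0.0, 0.2], [0.3, 0.5]] for _ in range(3)]
composite = make_composite(test_image, denoised_image)
```

Channels 0 to 2 of the result come from the image under inspection and channels 3 to 5 from its denoised counterpart, so a classifier operating on the composite can compare the two directly.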

The adversarial example classification module 130 may be configured to determine whether the image under inspection is an adversarially modified image 400. Specifically, the adversarial example classification module 130 may use the second artificial intelligence model 152 to make this determination based on the generated composite image.

The adversarial example classification module 130 may determine whether the image under inspection is an adversarially modified image by using the second artificial intelligence model 152, which has been trained in advance on feature data extracted from composite images. To learn how to extract features from the composite image, a CNN structure in which multiple convolutional layers are stacked may be used, and in particular logistic regression may be used; however, the manner of determining whether the image under inspection is an adversarially modified image is not limited thereto.

The machine learning unit 142 may compute a loss function (f_cls), used for determining whether an image is adversarial, based on feature data extracted from a plurality of images under inspection and on ground-truth information indicating whether each of those images is actually an adversarially modified image. The loss function (f_cls) is a loss function for binary classification of adversarial noise and may be a cross-entropy loss. The machine learning unit 142 may be configured to train the second artificial intelligence model 152 so that the loss function (f_cls) decreases as training is repeated.
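A binary cross-entropy loss of this kind can be sketched as follows, assuming the second model outputs, for each image under inspection, a probability that the image is adversarial, and that the ground-truth label is 1 for adversarial and 0 for normal. The function name `binary_cross_entropy`, the clipping constant `eps`, and the toy values are assumptions for illustration.

```python
import math

def binary_cross_entropy(probabilities, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probabilities)

labels = [1, 0]  # first image adversarial, second image normal (toy data)

# An undecided classifier (0.5 for every image) versus a confident, correct one.
undecided_loss = binary_cross_entropy([0.5, 0.5], labels)
confident_loss = binary_cross_entropy([0.9, 0.1], labels)
```

The confident, correct predictions give a lower loss than the undecided ones, which is the direction in which training is meant to push the second model.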

If the image under inspection is determined to be an adversarially modified image 400, the adversarial example classification module 130 may pass the denoised image under inspection to the image classifier 200, which may then classify the denoised image under inspection.

If the image under inspection is determined to be a normal image, the adversarial example classification module 130 may pass the image under inspection itself to the image classifier 200, which may then classify the image under inspection.
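The two forwarding cases can be summarized in a short sketch. It assumes the second model returns a probability that the image under inspection is adversarial and that a fixed decision threshold is applied; the function name `select_classifier_input` and the default threshold of 0.5 are hypothetical and not specified in the source.

```python
def select_classifier_input(test_image, denoised_image, adversarial_probability, threshold=0.5):
    """Choose which image is forwarded to the image classifier.

    The denoised image is forwarded when the input is judged adversarial;
    otherwise the unmodified input is forwarded.
    """
    is_adversarial = adversarial_probability >= threshold
    return denoised_image if is_adversarial else test_image

# Placeholder strings stand in for image data in this toy usage.
forwarded_when_adversarial = select_classifier_input("input", "denoised", 0.93)
forwarded_when_normal = select_classifier_input("input", "denoised", 0.08)
```

Forwarding the unmodified input in the normal case avoids the small distortion that denoising would otherwise introduce into a clean image.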

At least one component may be added or removed in accordance with the performance of the components described above. In addition, those of ordinary skill in the art will readily understand that the relative positions of the components may be changed in accordance with the performance or structure of the system.

FIG. 5 is a flowchart of an adversarial image restoration method according to an embodiment. It is merely a preferred embodiment for achieving the object of the present invention, and some steps may of course be added or removed as needed.

Referring to FIG. 5, the image reception module 110 may receive an adversarially modified image 400 (1001). The image reception module 110 may pass the adversarially modified image 400 to the noise removal module 120.

The noise removal module 120 may use the first artificial intelligence model 151 to remove noise from the adversarially modified image 400 and generate a denoised image 500 (1002). The noise removal module 120 may pass the generated denoised image 500 to the image classifier 200.

The image classifier 200 may generate a first intermediate layer output value while extracting features from the input original image 300 by machine learning, and a second intermediate layer output value while extracting features from the input denoised image 500 by machine learning (1003). The image classifier 200 may pass the first and second intermediate layer output values to the Fourier transform unit 141. That is, the Fourier transform unit 141 may receive the intermediate layer output values, which are the outputs of an intermediate layer generated during image classification.

The Fourier transform unit 141 may Fourier transform the first intermediate layer output value to generate a first Fourier spectrum output value representing the frequency domain, and Fourier transform the second intermediate layer output value to generate a second Fourier spectrum output value representing the frequency domain (1004). The Fourier transform unit 141 may pass the first and second Fourier spectrum output values to the machine learning unit 142.

The machine learning unit 142 may train the first artificial intelligence model 151 by machine learning based on the first Fourier spectrum output value and the second Fourier spectrum output value (1005).

The image reception module 110 may receive an image under inspection (1006). The image reception module 110 may pass the image under inspection to the noise removal module 120.

The noise removal module 120 may use the pre-trained first artificial intelligence model 151 to generate a denoised image under inspection from the image under inspection (1007). The noise removal module 120 may pass the denoised image under inspection to the adversarial example classification module 130.

The adversarial example classification module 130 may use the second artificial intelligence model 152 to determine whether the image under inspection is an adversarially modified image 400, based on a composite image generated by combining the image under inspection and the denoised image under inspection (1008).

The adversarial example classification module 130 may pass the image to be classified to the image classifier 200 (1009). If the image under inspection is determined to be an adversarially modified image 400, the adversarial example classification module 130 may pass the denoised image under inspection to the image classifier 200 for classification. If the image under inspection is determined to be a normal image, the adversarial example classification module 130 may pass the image under inspection itself to the image classifier 200 for classification.

The image reception module 110, the noise removal module 120, the adversarial example classification module 130, the Fourier-based machine learning module 140, the Fourier transform unit 141, and the machine learning unit 142 may each include any one of a plurality of processors included in the adversarial image restoration system 100. In addition, the adversarial image restoration method according to the embodiments of the present invention described so far may be implemented in the form of a program executable by a processor.

Here, the program may include program instructions, data files, data structures, and the like, alone or in combination. The program may be designed and written using machine code or high-level language code. The program may be specially designed to implement the method described above, or may be implemented using various functions or definitions that are known and available to those skilled in the field of computer software. A program for implementing the method described above may be recorded on a non-transitory recording medium readable by a processor. The recording medium may be the memory 150.

The memory 150 may store programs that perform the operations described above and the operations described below, and the stored programs may be executed. When there are a plurality of processors and memories 150, they may be integrated into a single chip or provided at physically separate locations. The memory 150 may include volatile memory for temporarily storing data, such as static random access memory (S-RAM) or dynamic random access memory (D-RAM). The memory 150 may also include non-volatile memory for long-term storage of control programs and control data, such as read-only memory (ROM), erasable programmable read-only memory (EPROM), or electrically erasable programmable read-only memory (EEPROM). The processor may include various logic and arithmetic circuits, may process data according to a program provided from the memory 150, and may generate control signals according to the processing results.

To verify the performance of the adversarial image restoration system 100 according to an embodiment of the present invention, experiments were conducted in which images from the CIFAR-10 and CIFAR-100 datasets were classified and restored using conventional methods of classifying adversarial images and the adversarial image restoration method of the present invention.

FIG. 6 is a table showing the classification performance of the image classifier on denoised images 500 according to an embodiment. FIG. 7 is a table showing the degree to which the adversarial image restoration method according to an embodiment improves on conventional restoration methods. FIG. 8 is a diagram showing classification results of the image classifier for denoised images 500 according to an embodiment. FIG. 9 is a graph showing the classification performance of the image classifier on denoised images 500 according to an embodiment.

Referring to FIG. 6, the classification results for images restored according to an embodiment can be seen. To verify the performance of the restored images, image restoration was performed, using both the restoration method according to an embodiment and other restoration methods, on adversarially modified images 400 generated by several attacks: the projected gradient descent (PGD) attack, which generates adversarial examples by iteratively computing the gradient of the loss with respect to the adversarially modified image 400; the DeepFool attack, which finds the boundaries between classes and generates noise in the direction of the boundary nearest to the adversarially modified image 400; and the Carlini & Wagner (C&W) attack, which provides three generation variants based on three ways of measuring the size of the adversarial noise and uses an effective objective function to generate high-confidence adversarial examples.

Referring to the table, the method according to an embodiment (FFT(Layer(x_ori & x_pred))) achieves better classification performance than the other methods (x_ori & x_pred, FFT(x_ori & x_pred), Layer(x_ori & x_pred)). Specifically, images restored by the adversarial image restoration method according to an embodiment, which restores an image based on Fourier-transformed intermediate layer output values (FFT(Layer(x_ori & x_pred))), were confirmed to be classified more accurately than images restored based on plain RGB images (x_ori & x_pred), on the Fourier transform of the images (FFT(x_ori & x_pred)), or on intermediate layer output values (Layer(x_ori & x_pred)).

Referring to FIG. 7, the performance obtained when classifying adversarial and non-adversarial images based on features extracted from the restored image and the input image according to an embodiment can be seen. Four methods (LID, Mahalanobis, LayerMFS, LayerPFS) were used to extract feature values from the input image. The variants labeled "w/ ours" (LID w/ ours, Mahalanobis w/ ours, LayerMFS w/ ours, LayerPFS w/ ours) extract feature values from a composite image formed by combining the input image with the restored denoised image 500. When adversarial/non-adversarial binary classification was performed on the extracted feature values, the variants using the adversarial image restoration method according to an embodiment showed better performance.

Referring to FIG. 8, the results of the attack target model (target classifier) of the adversarial image generation method classifying the images it receives can be seen. "Adversarial" denotes the result of classifying the adversarial modified images (400) with the deep learning model, and "Denoised" denotes the result of classifying only the denoised images (500) restored by the adversarial denoising network. The classification result for the "Denoised" images is the same as the classification result for the "Original" images, to which no adversarial perturbation has been applied. That is, the attack target model classifies images restored by the adversarial modified image restoration method according to one embodiment normally and without problems.

Referring to FIG. 9, confusion matrices show the results of the attack target model (target classifier) classifying adversarial modified images (400) from the CIFAR-10 and CIFAR-100 datasets that were restored using the conventional adversarial image detection method and the adversarial image restoration method of the present invention. "Adversarial" denotes the result of classifying the adversarial modified images (400) with the deep learning model, and "Denoised" denotes the result of classifying only the images restored by the adversarial modified image restoration method according to one embodiment. Specifically, the graphs show the experimental results of the image classification deep learning model under attack classifying CIFAR-10 and CIFAR-100 images modified by the PGD (Projected Gradient Descent) attack and the DeepFool attack. The more clearly the diagonal components stand out, the more accurately the image classification deep learning model under attack has classified the images. For the adversarially modified images (Adversarial), the model under attack does not classify well, whereas for the denoised images (Denoised) (500) generated by the adversarial image restoration method according to one embodiment, a clear diagonal trend appears in the graphs. That is, even if the original image (300) has been adversarially modified by an adversarial attack, the denoised image (500) generated from that adversarial modified image (400) according to one embodiment can be used by the image classification deep learning model under attack to perform image classification without problems.
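How a confusion matrix and its diagonal relate to classification quality can be shown with a short sketch. The labels below are invented toy values, not results from the experiments, and the helper names `confusion_matrix` and `diagonal_fraction` are introduced only for this illustration.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    # Rows index the true class, columns index the predicted class.
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def diagonal_fraction(m):
    # Share of samples on the diagonal, i.e. the classification accuracy.
    total = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / total if total else 0.0

# Made-up toy labels for three classes (illustration only).
y_true = [0, 0, 1, 1, 2, 2]
pred_adversarial = [1, 2, 0, 1, 0, 0]   # predictions on attacked inputs
pred_denoised = [0, 0, 1, 1, 2, 1]      # predictions on restored inputs

cm_adv = confusion_matrix(y_true, pred_adversarial, 3)
cm_den = confusion_matrix(y_true, pred_denoised, 3)
acc_adv = diagonal_fraction(cm_adv)
acc_den = diagonal_fraction(cm_den)
```

A matrix whose mass is concentrated on the diagonal, as in `cm_den` here, corresponds to the "diagonal trend" described for the denoised images.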

The disclosed embodiments have been described above with reference to the accompanying drawings. A person of ordinary skill in the art to which the present invention pertains will understand that the present invention may be practiced in forms other than the disclosed embodiments without changing its technical idea or essential features. The disclosed embodiments are illustrative and should not be construed as limiting.

100: Adversarial image restoration system
110: Image receiving module
120: Noise removal module
130: Adversarial example classification module
140: Fourier-based machine learning module
141: Fourier transform unit
142: Machine learning unit
150: Memory
151: First artificial intelligence model
152: Second artificial intelligence model
200: Image classifier
300: Original image
400: Adversarial modified image
500: Denoised image

Claims

An adversarial image restoration system comprising:
an image receiving module configured to receive an adversarial modified image generated by adding noise to an original image;
a noise removal module configured to generate a denoised image by removing noise from the adversarial modified image using a first artificial intelligence model; and
a Fourier-based machine learning module configured to train the first artificial intelligence model through machine learning based on a Fourier spectrum output value corresponding to the original image and a Fourier spectrum output value corresponding to the denoised image,
wherein the Fourier-based machine learning module
includes a Fourier transform unit configured to receive an intermediate layer output value, which is an output value of an intermediate layer generated during image classification, from an image classifier configured to classify an input image using a machine learning method that extracts features from the input image through a multi-stage hierarchical structure,
wherein the image classifier is configured to generate a first intermediate layer output value in the process of extracting features from the input original image using the machine learning method, and to generate a second intermediate layer output value in the process of extracting features from the input denoised image using the machine learning method, and
wherein the Fourier transform unit is configured to:
Fourier transform the first intermediate layer output value to generate a first Fourier spectrum output value representing the frequency domain; and
Fourier transform the second intermediate layer output value to generate a second Fourier spectrum output value representing the frequency domain.

(Deleted)

The adversarial image restoration system of claim 1,
wherein the Fourier-based machine learning module
includes a machine learning unit configured to train the first artificial intelligence model through machine learning based on the first Fourier spectrum output value and the second Fourier spectrum output value.

The adversarial image restoration system of claim 4,
wherein the machine learning unit is configured to:
calculate a loss function based on the first Fourier spectrum output value of the original image and the second Fourier spectrum output value of the denoised image corresponding to the original image; and
train the first artificial intelligence model so that the loss function decreases as training is repeated.

The adversarial image restoration system of claim 5,
further comprising an adversarial example classification module configured to determine whether an image to be inspected is an adversarial modified image.

The adversarial image restoration system of claim 6,
wherein the image receiving module is
configured to receive the image to be inspected,
wherein the noise removal module is
configured to generate a denoised inspection image from the image to be inspected using the previously trained first artificial intelligence model, and
wherein the adversarial example classification module is
configured to determine, using a second artificial intelligence model, whether the image to be inspected is an adversarial modified image, based on a composite image generated by combining the image to be inspected and the denoised inspection image.

The adversarial image restoration system of claim 7,
wherein the adversarial example classification module is configured to:
when the image to be inspected is determined to be an adversarial modified image, transmit the denoised inspection image to the image classifier so that the denoised inspection image is classified; and
when the image to be inspected is determined to be a normal image, transmit the image to be inspected to the image classifier so that the image to be inspected is classified.

A method of operating an adversarial image restoration system, the method comprising:
receiving an adversarial modified image generated by adding noise to an original image;
generating a denoised image by removing noise from the adversarial modified image using a first artificial intelligence model; and
training the first artificial intelligence model through machine learning based on a Fourier spectrum output value corresponding to the original image and a Fourier spectrum output value corresponding to the denoised image,
wherein training the first artificial intelligence model comprises:
generating, by an image classifier configured to classify an input image using a machine learning method that extracts features from the input image through a multi-stage hierarchical structure, a first intermediate layer output value in the process of extracting features from the input original image using the machine learning method, and a second intermediate layer output value in the process of extracting features from the input denoised image using the machine learning method;
Fourier transforming the first intermediate layer output value to generate a first Fourier spectrum output value representing the frequency domain;
Fourier transforming the second intermediate layer output value to generate a second Fourier spectrum output value representing the frequency domain; and
training the first artificial intelligence model through machine learning based on the first Fourier spectrum output value and the second Fourier spectrum output value.

A computer program stored in a computer-readable non-transitory recording medium to execute the adversarial image restoration method of claim 9.