KR20220085279A - Method and apparatus for augmenting data

Info

Publication number: KR20220085279A
Application number: KR1020200175215A
Authority: KR
Inventors: 배성호; 김유민
Original assignee: 경희대학교 산학협력단
Priority date: 2020-12-15
Filing date: 2020-12-15
Publication date: 2022-06-22
Also published as: KR102562770B1

Abstract

The present disclosure relates to a method for augmenting image data and an electronic device for performing the same. A method of augmenting image data according to an embodiment may include: acquiring image data to be augmented; dividing an image region of an image corresponding to the image data into predetermined partial image regions; applying different data augmentation techniques to at least one of the divided partial image regions; and generating augmented image data from the partial image regions to which the different data augmentation techniques are applied.

Description

METHOD AND APPARATUS FOR AUGMENTING DATA

The present disclosure relates to a method and apparatus for augmenting data. More particularly, it relates to a method and apparatus for augmenting data used to train an artificial neural network.

An artificial neural network may refer to a computing device, or a method performed by a computing device, that implements interconnected sets of artificial neurons. As an example of an artificial neural network, a deep neural network, or deep learning, may have a multi-layer structure in which each layer is trained on a large amount of data.

With the recent surge in the development of artificial neural network technology, data augmentation techniques are being actively studied as a way to improve the accuracy of artificial neural networks even with a small amount of training data. Data augmentation is a technique for overcoming the limited availability of training data for artificial neural networks, but existing data augmentation techniques are limited in that they apply only a single augmentation effect to a single image.

In addition, in conventional data augmentation techniques that use two or more images, the labels of the images are combined into new label data in proportion to how much each image contributes to the result. However, depending on which part (object or background) of the other image is used, the new label data may or may not match the augmented image.

Accordingly, there is a need for a data augmentation technique that can solve the problems that may arise when other image data is used.

Korean Patent Publication No. 2020-0022739

According to an embodiment, a method and an electronic device for augmenting data may be provided.

Also, according to an embodiment, a data augmentation method that uses the local bias property of a convolutional neural network, and an electronic device performing the same, may be provided.

According to an embodiment of the present disclosure for achieving the above-described technical problem, a method for augmenting image data may include: acquiring image data to be augmented; dividing an image region of an image corresponding to the image data into predetermined partial image regions; applying different data augmentation techniques to at least one of the divided partial image regions; and generating augmented image data from the partial image regions to which the different data augmentation techniques are applied.

In addition, according to another embodiment of the present disclosure for solving the above-described technical problem, an electronic device for augmenting image data may be provided, the electronic device including: a network interface; a memory storing one or more instructions; and at least one processor executing the one or more instructions, wherein the at least one processor, by executing the one or more instructions, acquires image data to be augmented, divides an image region of an image corresponding to the image data into predetermined partial image regions, applies different data augmentation techniques to at least one of the divided partial image regions, and generates augmented image data from the partial image regions to which the different data augmentation techniques are applied.

In addition, according to another embodiment of the present disclosure for solving the above-described technical problem, a computer-readable recording medium may be provided on which is recorded a program for executing, on a computer, a method for augmenting image data, the method including: acquiring image data to be augmented; dividing an image region of an image corresponding to the image data into predetermined partial image regions; applying different data augmentation techniques to at least one of the divided partial image regions; and generating augmented image data from the partial image regions to which the different data augmentation techniques are applied.

FIG. 1 is a diagram schematically illustrating a process in which an electronic device augments data and trains an artificial neural network based on the augmented data, according to an embodiment.
FIG. 2 is a flowchart of a method by which an electronic device augments data, according to an embodiment.
FIG. 3 is a flowchart of a detailed method by which an electronic device augments data, according to an embodiment.
FIG. 4 is a diagram for describing a process in which an electronic device augments image data, according to an embodiment.
FIG. 5 is a flowchart of a detailed method by which an electronic device augments data, according to an embodiment.
FIG. 6 is a diagram for describing a process in which an electronic device extracts a local image patch during data augmentation, according to an embodiment.
FIG. 7 is a diagram for explaining the difference in performance of artificial neural networks trained on data augmented by different methods.
FIG. 8 is a diagram for explaining the performance of an artificial neural network trained on augmented data generated by an electronic device, according to an embodiment.
FIG. 9 is a diagram for explaining the performance of an artificial neural network trained on augmented data generated by an electronic device, according to an embodiment.
FIG. 10 is a block diagram of an electronic device according to an embodiment.
FIG. 11 is a block diagram of a server according to an embodiment.

The terms used in this specification will be briefly described, and the present disclosure will then be described in detail.

The terms used in the present disclosure have been selected, as far as possible, from general terms that are currently in wide use, taking into account their functions in the present disclosure; however, they may vary depending on the intention of those skilled in the art, legal precedent, the emergence of new technologies, and the like. In certain cases, a term has been arbitrarily selected by the applicant, in which case its meaning will be described in detail in the corresponding part of the description. Therefore, the terms used in the present disclosure should be defined based on their meaning and the overall content of the present disclosure, not simply on their names.

Throughout the specification, when a part is said to "include" a certain element, this means that it may further include other elements rather than excluding them, unless specifically stated otherwise. In addition, terms such as "...unit" and "module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present disclosure pertains can easily carry them out. However, the present disclosure may be implemented in many different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to describe the present disclosure clearly, and similar reference numerals denote similar parts throughout the specification.

FIG. 1 is a diagram schematically illustrating a process in which an electronic device augments data and trains an artificial neural network based on the augmented data, according to an embodiment.

According to an embodiment, the electronic device 1000 may generate training data for training the artificial neural network 120. According to an embodiment, the electronic device 1000 may train the artificial neural network by using the generated training data to modify and update the layers in the artificial neural network 120 and the weights related to the connection strength between the layers.

According to an embodiment, the electronic device 1000 may be, but is not limited to, a smartphone, tablet PC, PC, smart TV, mobile phone, media player, server, micro server, or other mobile or non-mobile computing device that is equipped with an AI program for processing the weights of the artificial neural network and includes a voice recognition function.

According to an embodiment, the artificial neural network used by the electronic device 1000 may refer to a computing system inspired by biological neural networks. Unlike classical algorithms, which perform tasks according to predefined conditions, an artificial neural network can learn to perform a task by considering a large number of samples. An artificial neural network may have a structure in which artificial neurons are connected, and a connection between neurons may be referred to as a synapse. A neuron may process a received signal and transmit the processed signal to another neuron through a synapse. The output of a neuron may be referred to as an activation. A neuron and/or a synapse may have a variable weight, and the influence of the signal processed by the neuron may increase or decrease according to the weight.

For example, the artificial neural network may be composed of a plurality of neural network layers. Each of the plurality of neural network layers has a plurality of weights (weight values) 122 and performs a neural network operation through an operation between the operation result of the previous layer and the plurality of weights. The plurality of weights of the plurality of neural network layers may be optimized by the training result of the artificial neural network.

For example, the plurality of weights may be modified and updated so that a loss value or a cost value obtained from the artificial intelligence model during training is reduced or minimized. The artificial neural network according to the present disclosure may include a deep neural network (DNN), for example, a Convolutional Neural Network (CNN), a Deep Neural Network (DNN), a Recurrent Neural Network (RNN), a Restricted Boltzmann Machine (RBM), a Deep Belief Network (DBN), a Bidirectional Recurrent Deep Neural Network (BRDNN), or a Deep Q-Network, but is not limited to these examples. Hereinafter, for convenience, the artificial neural network according to the present disclosure is described using the example of a convolutional neural network.

According to an embodiment, the electronic device 1000 may generate augmented training data in order to improve the accuracy of the artificial neural network 120. For example, the electronic device 1000 may acquire image data 102 and apply an augmentation technique to the acquired image data. More specifically, the electronic device 1000 may generate an image 104 corresponding to the image data 102 and generate predetermined partial image regions 106 by dividing the image region of the generated image.

Also, according to an embodiment, the electronic device 1000 may transform the predetermined partial image regions 106 by applying different data augmentation techniques 108 to them. The electronic device 1000 may generate augmented image data 110 from the partial image regions to which the data augmentation techniques have been applied. The electronic device 1000 may train the artificial neural network 120 using the augmented image data as training data.

According to an embodiment, the electronic device 1000 may generate augmented image data in cooperation with the server 2000 and train the artificial neural network 120 based on the generated augmented image data. According to an embodiment, the electronic device 1000 may acquire image data and transmit the acquired image data to the server 2000. The server 2000 may augment the image data by performing at least some of the operations performed by the electronic device 1000, and transmit the augmented image data to the electronic device 1000.

According to another embodiment, the server 2000 may itself train the artificial neural network based on the augmented image data. The server 2000 may transmit information about the trained artificial neural network (e.g., weights) to the electronic device 1000, and the electronic device 1000 may train the artificial neural network by applying the weights received from the server 2000 to the artificial neural network.

FIG. 2 is a flowchart of a method by which an electronic device augments data, according to an embodiment.

In S210, the electronic device 1000 may acquire image data to be augmented. For example, the electronic device 1000 may acquire an image corresponding to the image data. According to an embodiment, the electronic device 1000 may acquire a single image and identify the pixel data in the acquired image. The electronic device 1000 may also acquire data about the image based on the pixel data in the single image.

In S220, the electronic device 1000 may divide the image region of the image corresponding to the image data into predetermined partial image regions. According to an embodiment, the electronic device 1000 may generate an image corresponding to the image data and define four quadrants on the generated image with respect to two axes that intersect each other. The electronic device 1000 may generate a first, second, third, or fourth partial image region corresponding to each of the defined quadrants. According to another embodiment, the electronic device 1000 may divide the image region into partial image regions corresponding to each of the quadrants defined by two axes that are orthogonal to each other.

Also, according to an embodiment, the electronic device 1000 may divide the image region into four arbitrary partial image regions. However, the way in which the electronic device 1000 divides the image is not limited to the above, and the electronic device 1000 may divide the image region into at least one predetermined partial image region according to any division method.
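As an illustration of the quadrant division in S220, the following is a minimal sketch that assumes the image is held as a NumPy array of shape height x width (x channels) and that the two orthogonal axes pass through the center of the image. The function name `split_into_quadrants` is hypothetical and does not appear in the disclosure.

```python
import numpy as np

def split_into_quadrants(image):
    """Split an H x W (x C) array into four partial regions along the two central axes."""
    h, w = image.shape[:2]
    cy, cx = h // 2, w // 2  # assumed: the axes cross at the image center
    return [
        image[:cy, :cx],  # top-left
        image[:cy, cx:],  # top-right
        image[cy:, :cx],  # bottom-left
        image[cy:, cx:],  # bottom-right
    ]

# Example: a 4 x 6 RGB image yields four 2 x 3 regions.
image = np.arange(4 * 6 * 3).reshape(4, 6, 3)
quadrants = split_into_quadrants(image)
```

Because the slices are views, no pixel is duplicated or dropped; for odd dimensions the bottom and right regions are one pixel larger.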

In S230, the electronic device 1000 may apply different data augmentation techniques to at least one of the divided partial image regions. According to an embodiment, the electronic device 1000 may apply a different data augmentation technique to each of the divided partial image regions. According to another embodiment, the electronic device 1000 may of course apply the same data augmentation technique to all of the divided partial image regions. Also, according to an embodiment, the electronic device 1000 may apply RGB channel shuffling to all of the partial image regions after the data augmentation techniques have been applied.
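One way to give every region a different technique, as in the first embodiment of S230, is to permute a list of candidate augmentations and hand them out one per region. The sketch below assumes square NumPy regions; the candidate list `AUGMENTATIONS` and the function `apply_distinct_augmentations` are illustrative choices, not the set of techniques fixed by the disclosure.

```python
import numpy as np

# Hypothetical candidates; each maps a square region to a region of the same shape.
AUGMENTATIONS = [
    lambda r: np.rot90(r, k=1),    # rotate 90 degrees counterclockwise
    lambda r: np.flip(r, axis=1),  # flip left-right
    lambda r: np.flip(r, axis=0),  # flip up-down
    lambda r: r,                   # leave unchanged
]

def apply_distinct_augmentations(regions, rng):
    """Give every region a different augmentation by permuting the candidate list."""
    if len(regions) > len(AUGMENTATIONS):
        raise ValueError("not enough distinct augmentations for the regions")
    order = rng.permutation(len(AUGMENTATIONS))[: len(regions)]
    return [AUGMENTATIONS[i](region) for i, region in zip(order, regions)]

rng = np.random.default_rng(0)
regions = [np.arange(4).reshape(2, 2) + 10 * i for i in range(4)]
augmented = apply_distinct_augmentations(regions, rng)
```

Sampling without replacement guarantees that no two regions share a technique; sampling with replacement would instead cover the embodiment in which regions may receive the same technique.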

In S240, the electronic device 1000 may generate augmented image data from the partial image regions to which the different data augmentation techniques have been applied. According to an embodiment, the electronic device 1000 may generate a single piece of augmented image data by combining the partial image regions to which the different data augmentation techniques have been applied, and may generate an augmented image corresponding to the generated augmented image data. According to another embodiment, when RGB channel shuffling is applied to all of the partial image regions after the different data augmentation techniques have been applied to them, the electronic device 1000 may generate the augmented image data from the partial image regions to which RGB channel shuffling has been applied. The electronic device 1000 may also display the generated augmented image through an output unit of the electronic device.
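The combining step in S240 can be pictured as stitching the four regions back into their original positions. The following is a minimal sketch under the assumption that the regions are NumPy arrays from a quadrant split and that each augmentation preserved the region's shape; `merge_quadrants` is a hypothetical helper name.

```python
import numpy as np

def merge_quadrants(top_left, top_right, bottom_left, bottom_right):
    """Stitch four regions back into one image (regions in a row must share a height)."""
    top = np.concatenate([top_left, top_right], axis=1)
    bottom = np.concatenate([bottom_left, bottom_right], axis=1)
    return np.concatenate([top, bottom], axis=0)

# Example: splitting and merging without any augmentation restores the image.
image = np.arange(16).reshape(4, 4)
merged = merge_quadrants(image[:2, :2], image[:2, 2:], image[2:, :2], image[2:, 2:])
```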

The electronic device 1000 according to the present disclosure may exploit the local bias property of a convolutional neural network by augmenting image data as described above. According to an embodiment, the local bias property of a convolutional neural network may refer to the tendency of the network, when distinguishing objects in an image, to rely on local information such as the texture or partial shape of an object rather than on the overall structure of the object. By applying this property to data augmentation, the method performed by the electronic device according to the present disclosure may maximize the data augmentation effect by applying various data augmentation techniques to a single image, at the cost of destroying the overall structure of the object. In addition, by maximizing the data augmentation effect, the method according to the present disclosure may prevent overfitting of a neural network trained on the augmented data and improve its performance.

FIG. 3 is a flowchart of a detailed method by which an electronic device augments data, according to an embodiment.

In S310, the electronic device 1000 may rotate at least one of the predetermined partial image regions generated by dividing the image region by a predetermined angle. For example, the electronic device 1000 may rotate at least one of the partial image regions by at least one of 90, 180, or 270 degrees. According to another embodiment, the electronic device 1000 may rotate at least one of the partial image regions by a predetermined angle with a predetermined probability.
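The probabilistic rotation in S310 might look like the sketch below. It assumes a square NumPy region (so that a quarter turn keeps the shape), a probability `p` chosen by the caller, and an angle drawn uniformly from 90, 180, and 270 degrees; none of these choices is specified in the disclosure, and `random_rotate` is a hypothetical name.

```python
import numpy as np

def random_rotate(region, rng, p=0.5):
    """With probability p, rotate a square region by 90, 180, or 270 degrees."""
    if rng.random() >= p:
        return region
    k = int(rng.integers(1, 4))  # 1, 2, or 3 quarter turns (upper bound is exclusive)
    return np.rot90(region, k=k, axes=(0, 1))

rng = np.random.default_rng(0)
region = np.arange(9).reshape(3, 3)
rotated = random_rotate(region, rng, p=1.0)    # always rotates
untouched = random_rotate(region, rng, p=0.0)  # never rotates
```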

In S320, the electronic device 1000 may flip at least one of the partial image regions with a predetermined probability. According to an embodiment, the operation of flipping a partial image region may correspond to an operation of turning the partial image region over vertically or horizontally. For example, the electronic device 1000 may turn over at least one of the partial image regions in the horizontal or vertical direction with a predetermined probability. According to another embodiment, the electronic device 1000 may flip at least one of the partial image regions with a predetermined probability.
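A flip with a predetermined probability, as in S320, can be sketched as follows. The probability `p` and the equal chance of a vertical or horizontal flip are assumptions made for illustration, and `random_flip` is a hypothetical name.

```python
import numpy as np

def random_flip(region, rng, p=0.5):
    """With probability p, flip a region up-down (axis 0) or left-right (axis 1)."""
    if rng.random() >= p:
        return region
    axis = int(rng.integers(0, 2))  # 0: up-down, 1: left-right
    return np.flip(region, axis=axis)

rng = np.random.default_rng(0)
region = np.arange(6).reshape(2, 3)
flipped = random_flip(region, rng, p=1.0)    # always flips
untouched = random_flip(region, rng, p=0.0)  # never flips
```

Unlike a quarter-turn rotation, a flip keeps the shape of non-square regions as well.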

In S330, the electronic device 1000 may extract local image patches from the remaining partial image regions, that is, those that have been neither rotated nor flipped, and rearrange the extracted local image patches. According to an embodiment, the electronic device 1000 may randomly extract local image patches from a remaining partial image region that has been neither rotated nor flipped, and rearrange the extracted local image patches within that same partial image region. According to an embodiment, the operation of rearranging the extracted local image patches may correspond to an operation of shuffling the local image patches within the corresponding partial image region.

More specifically, the electronic device 1000 may determine the positions at which to extract local image patches from the remaining partial image region that has been neither rotated nor flipped. In addition, the electronic device 1000 may determine the size of the local image patches to be extracted from that region based on a hyperparameter (e.g., a beta parameter) of the artificial neural network trained using the augmented image data. Based on the determined position and size, the electronic device 1000 may extract the local image patches from the remaining partial image region that has been neither rotated nor flipped.
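The rearrangement in S330 can be illustrated with a simplified patch shuffle. The sketch below makes a strong simplifying assumption: instead of choosing patch positions and sizes from a hyperparameter, it cuts the region into a regular `grid` x `grid` layout of equal patches and permutes them. The function `shuffle_patches` and the `grid` parameter are hypothetical.

```python
import numpy as np

def shuffle_patches(region, grid, rng):
    """Cut a region into grid x grid equal patches and put them back in random order."""
    h, w = region.shape[:2]
    ph, pw = h // grid, w // grid  # patch height and width
    patches = [
        region[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw].copy()
        for r in range(grid) for c in range(grid)
    ]
    out = region.copy()  # pixels outside the grid (if any) stay where they were
    for slot, src in enumerate(rng.permutation(len(patches))):
        r, c = divmod(slot, grid)
        out[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw] = patches[src]
    return out

rng = np.random.default_rng(0)
region = np.arange(36).reshape(6, 6)
shuffled = shuffle_patches(region, 3, rng)
```

Each patch keeps its internal texture while its location changes, which is the kind of local information a network with local bias is said to rely on.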

FIG. 4 is a diagram for describing a process in which an electronic device augments image data, according to an embodiment.

In S402, the electronic device 1000 may acquire an image containing the shape of a dog. In S404, the electronic device 1000 may divide the image region of the image into predetermined partial image regions. For example, the electronic device 1000 may generate sub-regions by dividing the image region into predetermined partial image regions.

In S406, the electronic device 1000 may rotate one partial image region 422 among the predetermined partial image regions in a first direction. The electronic device 1000 may also flip another partial image region 424 in a predetermined second direction. In addition, the electronic device 1000 may randomly extract local image patches from partial image region 428 and partial image region 426, and shuffle the extracted local image patches.

According to an embodiment, when the above process is completed, the electronic device 1000 may apply RGB channel shuffling to all of the partial image regions 422, 424, 426, and 428.
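RGB channel shuffling can be sketched as a random permutation of the channel axis. The example assumes a NumPy array with the channels on the last axis; whether one permutation is shared by all regions or drawn separately for each is not stated in the text, so the hypothetical function `shuffle_rgb_channels` simply handles one array at a time.

```python
import numpy as np

def shuffle_rgb_channels(region, rng):
    """Randomly permute the last (channel) axis of an H x W x 3 array."""
    order = rng.permutation(region.shape[-1])
    return region[..., order]

rng = np.random.default_rng(0)
region = np.zeros((2, 2, 3), dtype=np.uint8)
region[..., 0], region[..., 1], region[..., 2] = 10, 20, 30  # R, G, B planes
shuffled = shuffle_rgb_channels(region, rng)
```

To share one permutation across all regions, the same function could be applied once to the reassembled image instead.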

S408에서, 전자 장치(1000)는 소정의 데이터 증강 기법이 적용된 부분 이미지 영역들로부터 증강 이미지 데이터를 생성하고, 생성된 증강 이미지 데이터에 대응되는 증강 이미지를 생성할 수 있다. 일 실시 예에 의하면, 전자 장치(1000)는 S406이후에, 모든 부분 이미지 영역들에 RGB Channel shuffling을 적용하고, RGB Channel shuffling이 적용된 모든 부분 이미지 영역들을 이용하여 증강 이미지 데이터를 생성할 수도 있다.In S408 , the electronic device 1000 may generate augmented image data from partial image regions to which a predetermined data augmentation technique is applied, and may generate an augmented image corresponding to the generated augmented image data. According to an embodiment, after S406 , the electronic device 1000 may apply RGB channel shuffling to all partial image regions and generate augmented image data using all partial image regions to which RGB channel shuffling is applied.

FIG. 5 is a flowchart of a detailed method by which an electronic device augments data according to an embodiment.

With reference to FIG. 5, the process in which the electronic device 1000 extracts local image patches from the partial image regions remaining after the rotation and flip data augmentation techniques have been applied is described in detail. In S510, the electronic device 1000 may determine the position at which a local image patch is to be extracted from a remaining partial image region that has been neither rotated nor flipped. According to an embodiment, the electronic device 1000 may determine this position at random. For example, the electronic device 1000 may draw the position from a uniform distribution over the range 0 to B (for example, a hyperparameter of the artificial neural network).

In S520, the electronic device 1000 may determine the size of the local image patch to be extracted from the remaining partial image region that has been neither rotated nor flipped, based on a hyperparameter of the artificial neural network that will be trained using the augmented image data. According to an embodiment, the width and height of the local image patch may be determined by the hyperparameter B. According to an embodiment, the width of the local image patch may be W/2 - B, where W is the width of the partial image region, and the height of the local image patch may be H/2 - B, where H is the height of the partial image region.

In S530, the electronic device 1000 may extract a local image patch from the remaining partial image region that has been neither rotated nor flipped, based on the position and size determined in S510 and S520. In S540, the electronic device 1000 may rearrange the extracted local image patches, thereby completing the application of the predetermined data augmentation techniques within the partial image regions.
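Steps S510 to S540 can be sketched as follows. The disclosure fixes the patch size (W/2 - B by H/2 - B) and the range of the random position (0 to B), but not the number of patches or their layout. The sketch therefore assumes a 2 x 2 grid of cells with one patch per cell, integer offsets, and a random permutation as the rearrangement; `shuffle_local_patches` is a hypothetical name.

```python
import numpy as np


def shuffle_local_patches(region, b, rng):
    """Extract one patch per cell of a 2 x 2 grid and swap the patches at random."""
    h, w = region.shape[:2]
    ph, pw = h // 2 - b, w // 2 - b  # S520: patch size H/2 - B by W/2 - B
    if ph <= 0 or pw <= 0:
        raise ValueError("B must be smaller than half the region size")
    # Assumption: the region is treated as a 2 x 2 grid of cells.
    cells = [(0, 0), (0, w // 2), (h // 2, 0), (h // 2, w // 2)]
    # S510: random offset per cell, uniform over the integers 0..B,
    # so each patch stays inside its own cell.
    corners = [
        (r + int(rng.integers(0, b + 1)), c + int(rng.integers(0, b + 1)))
        for r, c in cells
    ]
    # S530: extract the patches (copied so later writes do not alias them).
    patches = [region[r:r + ph, c:c + pw].copy() for r, c in corners]
    # S540: write the patches back in a random order.
    out = region.copy()
    for (r, c), k in zip(corners, rng.permutation(len(patches))):
        out[r:r + ph, c:c + pw] = patches[k]
    return out
```

Because every patch has the same size and is written back to one of the original patch positions, the result contains exactly the same pixels as the input, only relocated.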

FIG. 6 is a diagram illustrating a process in which an electronic device extracts a local image patch during data augmentation according to an embodiment.

Referring to FIG. 6, a process in which the electronic device 1000 extracts a local image patch 602 is illustrated. For example, if the width of the partial image region is W and its height is H, the width 614 and the height 612 of the local image patch that the electronic device 1000 extracts from the partial image region may be determined as W/2 - B and H/2 - B, respectively.
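As a worked instance of the patch-size rule, the expressions below substitute illustrative values that are not taken from the disclosure (a 16 x 16 partial image region and B = 2); the symbols w_patch and h_patch are introduced here only for the example.

```latex
\[
w_{\text{patch}} = \frac{W}{2} - B, \qquad h_{\text{patch}} = \frac{H}{2} - B
\]
% Illustrative values only: W = H = 16, B = 2
\[
w_{\text{patch}} = \frac{16}{2} - 2 = 6, \qquad h_{\text{patch}} = \frac{16}{2} - 2 = 6
\]
```

A larger B therefore yields smaller patches and, since the position is drawn from the range 0 to B, more freedom in where each patch is taken from.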

FIG. 7 is a diagram illustrating the difference in performance between artificial neural networks trained on data augmented by different methods.

According to an embodiment, the electronic device 1000 may acquire a single image 702 and generate augmented data by applying different data augmentation techniques to the acquired single image 702.

For example, the electronic device 1000 may generate a global rotation image 704 by rotating the entire image region of the single image 702. According to another embodiment, the electronic device 1000 may generate a local rotation image 708 by dividing the image region of the single image 702 into predetermined partial image regions 706 and rotating each of the divided partial image regions in a different manner. The global rotation image 704 and the local rotation image 708 generated by the electronic device 1000 may be augmented data generated in different ways.
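The difference between the two kinds of rotation can be sketched as follows. The sketch assumes a square image with an even side length, so that each quadrant keeps its shape under a 90-degree rotation, and expresses each rotation as a number of counterclockwise quarter turns; `global_rotation` and `local_rotation` are hypothetical names.

```python
import numpy as np


def global_rotation(image, k):
    """Rotate the whole image by k quarter turns (counterclockwise)."""
    return np.rot90(image, k)


def local_rotation(image, ks):
    """Rotate each quadrant in place by its own number of quarter turns.

    ks holds four values, for the top-left, top-right, bottom-left and
    bottom-right quadrants respectively.
    """
    h, w = image.shape[:2]
    if h != w or h % 2:
        raise ValueError("sketch assumes a square image with an even side length")
    m = h // 2
    quadrants = [
        (slice(0, m), slice(0, m)),
        (slice(0, m), slice(m, h)),
        (slice(m, h), slice(0, m)),
        (slice(m, h), slice(m, h)),
    ]
    out = image.copy()
    for (rows, cols), k in zip(quadrants, ks):
        out[rows, cols] = np.rot90(image[rows, cols], k)
    return out
```

With `ks = [1, 2, 3, 0]`, for example, every quadrant ends up in a different orientation while staying in its original position, whereas `global_rotation` moves all of the image content together.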

Referring to FIG. 7, the accuracy 712 of an artificial neural network trained on augmented data such as the local rotation image 708 and the accuracy 714 of an artificial neural network trained on augmented data such as the global rotation image 704 are shown. Referring to chart 712 of FIG. 7, the artificial neural network trained on the local rotation image 708, which is generated by dividing the region of a single image into predetermined partial image regions and rotating each of the partial image regions, achieves the higher accuracy.

FIG. 8 is a diagram illustrating the performance of an artificial neural network trained on augmented data generated by an electronic device according to an embodiment.

Referring to the chart of FIG. 8, the artificial neural network 802 trained on augmented data generated by the data augmentation method of the electronic device 1000 disclosed in FIGS. 1 to 6 has the highest Top-1 accuracy, 76.87. The Top-5 accuracy of the artificial neural network 802 trained on the augmented data generated by the electronic device 1000 is 93.42, the second highest among the compared networks.
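The disclosure reports Top-1 and Top-5 accuracy without defining them. Under the conventional definition, a prediction counts as correct when the true label is among the k classes with the highest scores. The sketch below follows that convention and assumes the scores are a NumPy array with one row per sample; `top_k_accuracy` is a hypothetical name.

```python
import numpy as np


def top_k_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest scores."""
    # Column indices of the k largest scores in each row.
    top_k = np.argsort(scores, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return 100.0 * hits.mean()
```

Top-1 accuracy is the special case k = 1, so Top-5 accuracy can never be lower than Top-1 accuracy on the same predictions.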

FIG. 9 is a diagram illustrating the performance of an artificial neural network trained on augmented data generated by an electronic device according to an embodiment.

Referring to the chart of FIG. 9, the artificial neural network 902 trained on augmented data generated by the electronic device 1000 has a Top-1 accuracy of 88.65, the highest among the first group of candidate networks (for example, WideResNet through WideResNet+CutMix).

Also referring to the chart of FIG. 9, the artificial neural network 904 trained on augmented data generated by the electronic device 1000 has a Top-1 accuracy of 86.33, the highest among the second group of candidate networks (for example, ResNetXt through ResNetXt+CutMix).

FIG. 10 is a block diagram of an electronic device according to an embodiment.

According to an embodiment, the electronic device 1000 may include a processor 1400, a network interface 1500, and a memory 1700. According to another embodiment, however, the electronic device 1000 for augmenting data may include more or fewer components than those illustrated.

The processor 1400 generally controls the overall operation of the electronic device 1000.

According to an embodiment, the processor 1400 according to the present disclosure may perform the functions of the electronic device 1000 described with reference to FIGS. 1 to 9 by executing programs stored in the memory 1700. The processor 1400 may consist of one or more processors, each of which may be a general-purpose processor such as a CPU, an AP, or a digital signal processor (DSP), or a graphics-dedicated processor such as a GPU. According to an embodiment, when the processor 1400 includes both a general-purpose processor and a graphics-dedicated processor, each may be implemented as a separate chip.

According to an embodiment, when the processor 1400 is implemented as a plurality of processors or as a graphics-dedicated processor, at least some of the plurality of processors or graphics-dedicated processors may be mounted on the electronic device 1000 and on another electronic device or a server connected to the electronic device 1000.

For example, by executing programs stored in the memory 1700, the processor 1400 may augment image data and train an artificial neural network using the augmented image data.

According to an embodiment, the processor 1400 may acquire image data to be augmented, divide the image region of an image corresponding to the image data into predetermined partial image regions, apply different data augmentation techniques to at least one of the divided partial image regions, and generate augmented image data from the partial image regions to which the different data augmentation techniques have been applied. According to an embodiment, the processor 1400 may also generate, from the generated augmented image data, an augmented image corresponding to the augmented image data.

According to an embodiment, the processor 1400 may acquire a single image, identify pixel data in the acquired single image, and acquire the identified pixel data as the image data. According to an embodiment, the processor 1400 may divide the image region into four partial image regions.

According to an embodiment, the processor 1400 may divide the image region into partial image regions corresponding to the four quadrants defined by two mutually orthogonal axes. According to an embodiment, the processor 1400 may rotate at least one of the partial image regions by a predetermined angle, flip at least one of the partial image regions with a predetermined probability, extract a local image patch from the remaining partial image regions that have been neither rotated nor flipped, and rearrange the extracted local image patch.

According to an embodiment, the processor 1400 may rotate one of the partial image regions by 90, 180, or 270 degrees, flip one of the partial image regions horizontally or vertically with a predetermined probability, randomly extract local image patches from a partial image region that has been neither rotated nor flipped, and rearrange the extracted local image patches within that partial image region.
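The combination of techniques described above can be sketched end to end. This is an illustration under stated assumptions, not the implementation of the disclosure: the image is a square NumPy array with an even side length, the quadrant that receives each technique is chosen at random, and the patch step uses an assumed 2 x 2 grid with one patch per cell. The names `augment`, `shuffle_patches`, `b`, and `flip_prob` are hypothetical.

```python
import numpy as np


def shuffle_patches(region, b, rng):
    """Swap four (m - b)-sized patches inside a square region of side 2 * m."""
    m = region.shape[0] // 2
    size = m - b
    if size <= 0:
        raise ValueError("b must be smaller than half the region side")
    # One patch per cell of an assumed 2 x 2 grid, offset by 0..b pixels.
    corners = [
        (r + int(rng.integers(0, b + 1)), c + int(rng.integers(0, b + 1)))
        for r in (0, m) for c in (0, m)
    ]
    patches = [region[r:r + size, c:c + size].copy() for r, c in corners]
    out = region.copy()
    for (r, c), k in zip(corners, rng.permutation(4)):
        out[r:r + size, c:c + size] = patches[k]
    return out


def augment(image, b, flip_prob, rng):
    """Rotate one quadrant, maybe flip another, patch-shuffle the other two."""
    h, w = image.shape[:2]
    if h != w or h % 2:
        raise ValueError("sketch assumes a square image with an even side length")
    m = h // 2
    corners = [(0, 0), (0, m), (m, 0), (m, m)]
    order = rng.permutation(4)  # assumption: techniques assigned at random
    out = image.copy()

    # Rotation by 90, 180 or 270 degrees (1 to 3 quarter turns).
    r, c = corners[order[0]]
    out[r:r + m, c:c + m] = np.rot90(image[r:r + m, c:c + m], int(rng.integers(1, 4)))

    # Vertical (axis 0) or horizontal (axis 1) flip with probability flip_prob.
    r, c = corners[order[1]]
    if rng.random() < flip_prob:
        out[r:r + m, c:c + m] = np.flip(image[r:r + m, c:c + m], axis=int(rng.integers(0, 2)))

    # Patch extraction and rearrangement within each remaining quadrant.
    for q in order[2:]:
        r, c = corners[q]
        out[r:r + m, c:c + m] = shuffle_patches(image[r:r + m, c:c + m], b, rng)
    return out
```

A call such as `augment(image, b=1, flip_prob=0.5, rng=np.random.default_rng())` returns an array of the same shape whose pixels are those of the input, relocated within each quadrant.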

According to an embodiment, the processor 1400 may determine the position at which the local image patch is to be extracted from the remaining partial image region that has been neither rotated nor flipped; determine the size of the local image patch to be extracted from that region based on a hyperparameter of the artificial neural network that will be trained using the augmented image data; extract the local image patch from that region based on the determined position and size; and rearrange the extracted local image patch.

The network interface 1500 may include one or more components that allow the electronic device 1000 to communicate with another device (not shown) and with the server 2000. The other device (not shown) may be a computing device such as the electronic device 1000, or a sensing device, but is not limited thereto. For example, the network interface 1500 may include a short-range communication unit and a mobile communication unit.

The short-range wireless communication unit may include, but is not limited to, a Bluetooth communication unit, a Bluetooth Low Energy (BLE) communication unit, a near field communication (NFC) unit, a WLAN (Wi-Fi) communication unit, a Zigbee communication unit, an Infrared Data Association (IrDA) communication unit, a Wi-Fi Direct (WFD) communication unit, and an ultra-wideband (UWB) communication unit. The mobile communication unit transmits and receives radio signals to and from at least one of a base station, an external terminal, and a server over a mobile communication network. According to an embodiment, the network interface (not shown), under the control of the processor, may transmit to the server the weight values of the artificial neural network, the value of the loss function of the artificial neural network including the weights, loss gradient values, augmented data, and the like, and may receive from the server modified weight values, loss function values, gradient values, augmented data, and the like.

The memory 1700 may store programs for the processing and control performed by the processor 1400, and may also store data input to or output from the electronic device 1000. The memory 1700 may store information about the layers constituting the artificial neural network, the nodes included in the layers, and the weights representing the connection strengths between the layers. The memory 1700 may further store augmented data. In addition, when the weights of the artificial neural network are modified and updated, the memory 1700 may further store information about the modified and updated weights.

The memory 1700 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, and an optical disk.

FIG. 11 is a block diagram of a server according to an embodiment.

According to an embodiment, the server 2000 may include a processor 2400, a network interface 2500, and a database 2700. According to an embodiment, the processor 2400 may control the overall operation of the components in the server 2000. For example, by executing one or more instructions stored in the database 2700, the processor 2400 may obtain augmented data from the electronic device 1000 and train the artificial neural network itself based on the obtained augmented data. The processor 2400 may also acquire image data directly and augment the acquired image data. By controlling the network interface 2500, the processor 2400 may transmit augmented data or information about the trained artificial neural network to the electronic device 1000, or receive such data or information from it.

The database 2700 in the server 2000 may correspond to the memory in the electronic device of FIG. 10. According to an embodiment, the database 2700 may store image data, augmented image data generated by augmenting the image data, and information about the artificial neural network.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present disclosure, or may be known and available to those skilled in the art of computer software.

In addition, a computer program device including a recording medium that stores a program for performing the method according to the embodiment may be provided. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

Although embodiments of the present disclosure have been described in detail above, the scope of the present disclosure is not limited thereto. Various modifications and improvements made by those skilled in the art using the basic concept of the present disclosure, as defined in the following claims, also fall within the scope of the present disclosure.

Claims

A method for augmenting image data, comprising:
acquiring image data to be augmented;
dividing an image region of an image corresponding to the image data into predetermined partial image regions;
applying different data augmentation techniques to at least one of the divided partial image regions; and
generating augmented image data from the partial image regions to which the different data augmentation techniques are applied.

The method of claim 1, further comprising:
generating, from the generated augmented image data, an augmented image corresponding to the augmented image data.

The method of claim 1, wherein the acquiring of the image data comprises:
acquiring a single image;
identifying pixel data in the acquired single image; and
acquiring the identified pixel data in the single image as the image data.

The method of claim 1, wherein the dividing into the predetermined partial image regions comprises:
dividing the image region into four partial image regions.

The method of claim 1, wherein the dividing into the predetermined partial image regions comprises:
dividing the image region into partial image regions corresponding to the respective quadrants defined by two mutually orthogonal axes.

The method of claim 1, wherein the applying of the different data augmentation techniques comprises:
rotating at least one partial image region among the partial image regions by a predetermined angle;
flipping at least one partial image region among the partial image regions with a predetermined probability; and
extracting a local image patch from the remaining partial image regions that are neither rotated nor flipped among the partial image regions, and rearranging the extracted local image patch.

The method of claim 4, wherein the applying of the different data augmentation techniques comprises:
rotating one of the partial image regions by one of 90 degrees, 180 degrees, and 270 degrees;
flipping one of the partial image regions horizontally or vertically with a predetermined probability; and
randomly extracting local image patches from a partial image region that is neither rotated nor flipped among the partial image regions, and rearranging the extracted local image patches within that partial image region.

The method of claim 6, wherein the rearranging of the local image patch comprises:
determining a position at which to extract the local image patch from the remaining partial image region that is neither rotated nor flipped;
determining a size of the local image patch to be extracted from the remaining partial image region that is neither rotated nor flipped, based on a hyperparameter of an artificial neural network to be trained using the augmented image data;
extracting the local image patch from the remaining partial image region that is neither rotated nor flipped among the partial image regions, based on the determined position and size; and
rearranging the extracted local image patch.

An electronic device for augmenting image data, comprising:
a network interface;
a memory storing one or more instructions; and
at least one processor configured to execute the one or more instructions,
wherein the at least one processor, by executing the one or more instructions:
acquires image data to be augmented;
divides an image region of an image corresponding to the image data into predetermined partial image regions;
applies different data augmentation techniques to at least one partial image region among the divided partial image regions; and
generates augmented image data from the partial image regions to which the different data augmentation techniques are applied.

The electronic device of claim 9, wherein the at least one processor generates, from the generated augmented image data, an augmented image corresponding to the augmented image data.

The electronic device of claim 9, wherein the at least one processor:
acquires a single image;
identifies pixel data in the acquired single image; and
acquires the identified pixel data in the single image as the image data.

The electronic device of claim 9, wherein the at least one processor divides the image region into four partial image regions.

The electronic device of claim 9, wherein the at least one processor divides the image region into partial image regions corresponding to the respective quadrants defined by two mutually orthogonal axes.

The electronic device of claim 9, wherein the at least one processor:
rotates at least one partial image region among the partial image regions by a predetermined angle;
flips at least one partial image region among the partial image regions with a predetermined probability; and
extracts a local image patch from the remaining partial image regions that are neither rotated nor flipped among the partial image regions, and rearranges the extracted local image patch.

The electronic device of claim 12, wherein the at least one processor:
rotates one of the partial image regions by one of 90 degrees, 180 degrees, and 270 degrees;
flips one of the partial image regions horizontally or vertically with a predetermined probability; and
randomly extracts local image patches from a partial image region that is neither rotated nor flipped among the partial image regions, and rearranges the extracted local image patches within that partial image region.

The electronic device of claim 14, wherein the at least one processor:
determines a position at which to extract the local image patch from the remaining partial image region that is neither rotated nor flipped;
determines a size of the local image patch to be extracted from the remaining partial image region that is neither rotated nor flipped, based on a hyperparameter of an artificial neural network to be trained using the augmented image data;
extracts the local image patch from the remaining partial image region that is neither rotated nor flipped among the partial image regions, based on the determined position and size; and
rearranges the extracted local image patch.

A computer-readable recording medium having recorded thereon a program for executing, on a computer, a method for augmenting image data, the method comprising:
acquiring image data to be augmented;
dividing an image region of an image corresponding to the image data into predetermined partial image regions;
applying different data augmentation techniques to at least one of the divided partial image regions; and
generating augmented image data from the partial image regions to which the different data augmentation techniques are applied.