KR102492121B1 - Image classification method using data augmentation technology and computing device for performing the method

Info

Publication number: KR102492121B1
Application number: KR1020220064091A
Authority: KR
Inventors: 김법렬; 이동은; 박희원; 왕치엔
Original assignee: 경북대학교 산학협력단
Priority date: 2022-05-25
Filing date: 2022-05-25
Publication date: 2023-01-26

Abstract

Disclosed are an image classification method using data augmentation technology and a computing device for performing the method, which can increase the number of training images required to train an artificial neural network model. The image classification method performed by a computing device comprises: a step of generating a gradient energy map for training images included in a training dataset; a step of increasing the number of training images included in the training dataset by adjusting the size of the training images based on the generated gradient energy map; and a step of using the training dataset including the increased training images to train a learning model for image classification.

Description

Image classification method using data augmentation technology and computing device performing the method

The present invention relates to an image classification method, and more particularly, to a method for classifying images using computer vision technology and image processing technology.

Various types of artificial neural network models are used to classify images using computer vision technology and image processing technology. The classification accuracy of such a model can improve as the number of training images available for learning increases, and efficient image resizing is required for this purpose.

However, conventional image resizing methods such as scaling and cropping may cause loss of important information contained in the training images, which may negatively affect the learning result of the artificial neural network model.

The present invention provides a method and apparatus capable of increasing the number of training images required for training an artificial neural network model by performing image resizing without loss of important information contained in the training images.

In addition, the present invention provides a method and apparatus capable of improving image classification accuracy through an artificial neural network model that uses an Integrated Max-Mean Pooling layer and an Attention-Based Network node.

An image classification method performed by a computing device according to an embodiment of the present invention may include: generating a gradient energy map for each training image included in a training data set; increasing the number of training images included in the training data set by adjusting the size of each training image based on the generated gradient energy map; and training a learning model for image classification using the training data set including the increased training images.

The gradient energy map may be determined, for each pixel constituting the training image, using the sum of the absolute gradient value in the x direction and the absolute gradient value in the y direction for each color channel.

In generating the gradient energy map, the sums of the absolute gradient value in the x direction and the absolute gradient value in the y direction, determined for each color channel of a pixel, may all be combined, and the combined result may be determined as the pixel value for that pixel.

When the training image is to be resized in the horizontal direction, increasing the number of training images may include: extracting, for every row of the gradient energy map corresponding to the training image, seam pixels having a minimum pixel value; and reducing or enlarging the training image in the horizontal direction by removing or copying the extracted seam pixels.

Extracting the seam pixels having the minimum pixel value may include: identifying, as a first seam pixel, the pixel having the minimum pixel value among the pixels located in the first row of the gradient energy map corresponding to the training image; selecting, as a second seam pixel, the neighboring pixel having the minimum pixel value among the neighboring pixels in the second row that are adjacent to the identified first seam pixel of the first row; and repeatedly selecting, in the same manner as the second seam pixel, a third seam pixel having the minimum pixel value until the last row of the gradient energy map is reached.

When the training image is to be resized in the vertical direction, increasing the number of training images may include: extracting, for every column of the gradient energy map corresponding to the training image, seam pixels having a minimum pixel value; and reducing or enlarging the training image in the vertical direction by removing or copying the extracted seam pixels.

Extracting the seam pixels having the minimum pixel value may include: identifying, as a first seam pixel, the pixel having the minimum pixel value among the pixels located in the first column of the gradient energy map corresponding to the training image; selecting, as a second seam pixel, the neighboring pixel having the minimum pixel value among the neighboring pixels in the second column that are adjacent to the identified first seam pixel of the first column; and repeatedly selecting, in the same manner as the second seam pixel, a third seam pixel having the minimum pixel value until the last column of the gradient energy map is reached.
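
The removal and copying of seam pixels described above could be sketched as follows in Python with NumPy. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `remove_seam` and `duplicate_seam` and the `seam_cols` representation (one column index per row) are hypothetical names introduced here.

```python
import numpy as np

def remove_seam(img, seam_cols):
    """Drop one pixel per row, shrinking the width by 1.

    seam_cols[r] is the column of the seam pixel in row r (assumed format).
    """
    rows, cols = img.shape[:2]
    keep = np.ones((rows, cols), dtype=bool)
    keep[np.arange(rows), seam_cols] = False          # mark seam pixels
    return img[keep].reshape(rows, cols - 1, *img.shape[2:])

def duplicate_seam(img, seam_cols):
    """Copy one pixel per row, growing the width by 1."""
    rows, cols = img.shape[:2]
    out = np.empty((rows, cols + 1, *img.shape[2:]), dtype=img.dtype)
    for r, c in enumerate(seam_cols):
        out[r, :c + 1] = img[r, :c + 1]               # up to and including the seam pixel
        out[r, c + 1] = img[r, c]                     # the copied seam pixel
        out[r, c + 2:] = img[r, c + 1:]               # remainder shifted right by one
    return out

img = np.arange(12).reshape(3, 4)
seam_cols = [0, 1, 3]
narrower = remove_seam(img, seam_cols)                # shape (3, 3)
wider = duplicate_seam(img, seam_cols)                # shape (3, 5)
```

For a seam that runs column by column (resizing in the vertical direction), the same functions can be applied to the transposed image and the result transposed back.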

An image classification method according to an embodiment of the present invention includes: identifying a test data set including a plurality of test images; and classifying the test images included in the identified test data set by defect type by applying the identified test data set to a learning model for image classification, wherein the learning model may be trained with a training data set in which the number of training images has been increased by adjusting the size of the training images based on a gradient energy map.

The gradient energy map may be generated by combining, for each pixel, all of the sums of the absolute gradient value in the x direction and the absolute gradient value in the y direction determined for each color channel, and determining the combined result as the pixel value for that pixel.

When the training image is to be resized in the horizontal direction, the number of training images may be increased by extracting, for every row of the gradient energy map corresponding to the training image, seam pixels having a minimum pixel value, and reducing or enlarging the training image in the horizontal direction by removing or copying the extracted seam pixels.

The seam pixels having the minimum pixel value may be extracted by identifying, as a first seam pixel, the pixel having the minimum pixel value among the pixels located in the first row of the gradient energy map corresponding to the training image; selecting, as a second seam pixel, the neighboring pixel having the minimum pixel value among the neighboring pixels in the second row that are adjacent to the identified first seam pixel of the first row; and repeatedly selecting, in the same manner as the second seam pixel, a third seam pixel having the minimum pixel value until the last row of the gradient energy map is reached.

When the training image is to be resized in the vertical direction, the number of training images may be increased by extracting, for every column of the gradient energy map corresponding to the training image, seam pixels having a minimum pixel value, and reducing or enlarging the training image in the vertical direction by removing or copying the extracted seam pixels.

The seam pixels having the minimum pixel value may be extracted by identifying, as a first seam pixel, the pixel having the minimum pixel value among the pixels located in the first column of the gradient energy map corresponding to the training image; selecting, as a second seam pixel, the neighboring pixel having the minimum pixel value among the neighboring pixels in the second column that are adjacent to the identified first seam pixel of the first column; and repeatedly selecting, in the same manner as the second seam pixel, a third seam pixel having the minimum pixel value until the last column of the gradient energy map is reached.

A computing device that performs an image classification method according to an embodiment of the present invention includes a processor, and the processor may generate a gradient energy map for each training image in a training data set including a plurality of training images, increase the number of training images included in the training data set by adjusting the size of each training image based on the generated gradient energy map, and train a learning model for image classification using the training data set including the increased training images.

The processor may generate the gradient energy map by combining, for each pixel, all of the sums of the absolute gradient value in the x direction and the absolute gradient value in the y direction determined for each color channel, and determining the combined result as the pixel value for that pixel.

When the training image is to be resized in the horizontal direction, the processor may increase the number of training images by extracting, for every row of the gradient energy map corresponding to the training image, seam pixels having a minimum pixel value, and reducing or enlarging the training image in the horizontal direction by removing or copying the extracted seam pixels.

The processor may extract the seam pixels having the minimum pixel value by identifying, as a first seam pixel, the pixel having the minimum pixel value among the pixels located in the first row of the gradient energy map corresponding to the training image; selecting, as a second seam pixel, the neighboring pixel having the minimum pixel value among the neighboring pixels in the second row that are adjacent to the identified first seam pixel of the first row; and repeatedly selecting, in the same manner as the second seam pixel, a third seam pixel having the minimum pixel value until the last row of the gradient energy map is reached.

When the training image is to be resized in the vertical direction, the processor may increase the number of training images by extracting, for every column of the gradient energy map corresponding to the training image, seam pixels having a minimum pixel value, and reducing or enlarging the training image in the vertical direction by removing or copying the extracted seam pixels.

The processor may extract the seam pixels having the minimum pixel value by identifying, as a first seam pixel, the pixel having the minimum pixel value among the pixels located in the first column of the gradient energy map corresponding to the training image; selecting, as a second seam pixel, the neighboring pixel having the minimum pixel value among the neighboring pixels in the second column that are adjacent to the identified first seam pixel of the first column; and repeatedly selecting, in the same manner as the second seam pixel, a third seam pixel having the minimum pixel value until the last column of the gradient energy map is reached.

The present invention can increase the number of training images required for training an artificial neural network model by performing image resizing without loss of important information contained in the training images.

In addition, the present invention can improve image classification accuracy through an artificial neural network model that uses an Integrated Max-Mean Pooling layer and an Attention-Based Network node.

FIG. 1 is a diagram illustrating a computing device that performs an image classification method according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a method of training a learning model for image classification according to an embodiment of the present invention.
FIG. 3 is a diagram showing samples of concrete images included in a learning data set according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating an example in which the number of training images is increased by applying image augmentation methods to an original training image to generate copy training images according to an embodiment of the present invention.
FIGS. 5 to 7 are diagrams illustrating a method of increasing the number of training images included in a training data set using seam carving according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating an image processing technique according to an embodiment of the present invention.
FIG. 9 is a diagram showing the structure of a learning model according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating a computing device that performs an image classification method according to an embodiment of the present invention.

As shown in FIG. 1, the computing device 100 may include one or more processors 110, a memory 130 into which a program 140 executed by the processor 110 is loaded, and a storage 120 that stores the program 140. The components of the computing device 100 shown in FIG. 1 are only an example, and a person of ordinary skill in the art to which the present invention pertains will understand that other general-purpose components may be included in addition to those shown in FIG. 1.

The processor 110 controls the overall operation of each component of the computing device 100. The processor 110 may include at least one of a CPU (Central Processing Unit), an MPU (Micro Processor Unit), an MCU (Micro Controller Unit), a GPU (Graphic Processing Unit), or any type of processor well known in the technical field of the present invention. The processor 110 may also perform operations for at least one application or program for executing the methods/operations according to various embodiments of the present invention. The computing device 100 may include one or more processors.

The memory 130 stores various data, commands, and/or information. The memory 130 may load the program 140 stored in the storage 120 in order to execute the methods/operations according to various embodiments of the present invention. An example of the memory 130 is RAM, but the memory is not limited thereto.

The storage 120 may store one or more programs 140 in a non-transitory manner. The storage 120 may include a nonvolatile memory such as a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), or a flash memory, an HDD (Hard Disk Drive), an SSD (Solid State Disk), a removable disk, or any type of computer-readable recording medium well known in the technical field to which the present invention pertains.

The program 140 may include one or more operations (actions) in which the methods/operations according to various embodiments of the present invention are implemented. Here, an operation corresponds to an instruction realized in the program 140. For example, the program 140 may include instructions for performing an operation of generating a gradient energy map for each training image included in a training data set, an operation of increasing the number of training images included in the training data set by adjusting the size of each training image based on the generated gradient energy map, and an operation of training a learning model for image classification using the training data set including the increased training images.

When the program 140 is loaded into the memory 130, the processor 110 may perform the methods/operations according to various embodiments of the present invention by executing the plurality of operations that implement the program.

An execution screen of the program 140 may be displayed through the display 150. In FIG. 1, the display 150 is depicted as a separate device connected to the computing device 100, but when the computing device 100 is a terminal that a user can carry, such as a smartphone or a tablet, the display 150 may be a component of the computing device 100. The screen shown on the display 150 may be the screen before information is input to the program or the screen showing the program's execution result.

FIG. 2 is a diagram illustrating a method of training a learning model for image classification according to an embodiment of the present invention.

In step 210, the processor 110 of the computing device 100 may identify the learning data set used to train the learning model for image classification. For example, concrete images may be used in the learning data set, and each concrete image may be one of a non-crack concrete image, a surface crack concrete image, a delamination concrete image, and a spalling concrete image. However, the types and categories of images used in the learning data set are only an example and are not limited to the above.

FIG. 3 is a diagram showing samples of concrete images included in a learning data set according to an embodiment of the present invention. In FIG. 3, (a) shows a non-crack concrete image, (b) shows a surface crack concrete image, (c) shows a delamination concrete image, and (d) shows a spalling concrete image.

The processor 110 may divide the concrete images included in the identified learning data set into training images and test images to generate a training image set and a test image set. For example, the processor 110 may divide the concrete images included in the learning data set into training images and test images at a ratio of 7:3. However, this ratio is only an example and may be changed by the user.
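
As an illustration only, a 7:3 split of this kind could be sketched in Python as below. The function name `split_dataset`, the `seed` parameter, and the shuffling before the split are assumptions made here; the source states only the ratio.

```python
import random

def split_dataset(items, train_ratio=0.7, seed=0):
    """Split items into (train, test) lists at the given ratio.

    Shuffling with a fixed seed is an assumption, not stated in the source.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical file names standing in for concrete images.
images = [f"concrete_{i:03d}.png" for i in range(10)]
train_images, test_images = split_dataset(images)
```

Changing `train_ratio` corresponds to the user-adjustable ratio mentioned above.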

In step 220, the processor 110 may increase the number of training images included in the training data set within the learning data set. More specifically, the processor 110 may increase the number of training images included in the training data set by applying at least one image augmentation method among scaling, cropping, and seam carving to each training image included in the training data set.

FIG. 4 is a diagram illustrating an example in which the number of training images is increased by applying image augmentation methods to an original training image to generate copy training images according to an embodiment of the present invention. First, the processor 110 may apply scale factors of 0.5 and 2 to the original training image 410 to generate a copy training image 420 whose size is reduced to 1/4 of the original training image 410 and a copy training image 430 whose size is increased to 4 times the original. These scale factor values are only an example, and the scale factor is not limited thereto.

In addition, the processor 110 may apply cropping to a specific region inside the original training image 410 to generate a copy training image 440 that contains only that region.
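
The scaling and cropping steps can be illustrated with a small NumPy sketch. The nearest-neighbor interpolation and the function names `scale_nearest` and `crop` are assumptions made here; the source does not state which interpolation is used. The example also shows why a scale factor of 0.5 or 2 on each side yields 1/4 or 4 times the pixel count.

```python
import numpy as np

def scale_nearest(img, factor):
    """Resize both sides by `factor` using nearest-neighbor sampling (assumed method)."""
    rows, cols = img.shape[:2]
    new_rows = max(1, int(round(rows * factor)))
    new_cols = max(1, int(round(cols * factor)))
    # Map each output index back to the nearest source index.
    row_idx = np.minimum((np.arange(new_rows) / factor).astype(int), rows - 1)
    col_idx = np.minimum((np.arange(new_cols) / factor).astype(int), cols - 1)
    return img[row_idx][:, col_idx]

def crop(img, top, left, height, width):
    """Return only the given rectangular region of the image."""
    return img[top:top + height, left:left + width]

original = np.zeros((8, 6, 3), dtype=np.uint8)   # placeholder image
half = scale_nearest(original, 0.5)              # 4 x 3 pixels: 1/4 of the area
double = scale_nearest(original, 2)              # 16 x 12 pixels: 4 times the area
region = crop(original, top=2, left=1, height=4, width=3)
```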

In addition, the processor 110 may apply seam carving to the original training image 410 to generate a copy training image 450 that has the same width but a different height, and a copy training image 460 that has the same height but a different width.

In step 230, the processor 110 may apply an image processing technique to the training data set in which the number of training images has been increased. More specifically, the processor 110 may apply image segmentation to a training image to extract the object of interest and convert the resulting training image to grayscale, so that it can later be used to improve the performance of the learning model.
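
The grayscale conversion step alone could look like the following sketch. The source does not specify the conversion formula or the segmentation method, so the ITU-R BT.601 luma weights used here are an assumption, and segmentation is not shown.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB image to H x W grayscale.

    Uses BT.601 luma weights (an assumption; the source gives no formula).
    """
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(np.float64) @ weights

rgb = np.full((2, 2, 3), 255, dtype=np.uint8)    # placeholder all-white image
gray = to_grayscale(rgb)
```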

In step 240, the processor 110 may train the learning model for image classification using the training images of the training data set to which the image processing technique has been applied. For example, the present invention may use a CNN (Convolutional Neural Network) architecture as the learning model for image classification.

A CNN is a deep neural network frequently used for image classification, and may consist of convolutional layers with activation functions, pooling layers for analyzing input features, and connected layers for classification.

For example, VGG16 is the most frequently used CNN variant. VGG16 consists of 16 layers in total, of which 13 are convolutional layers and the remaining 3 are fully connected layers. VGG16 uses ReLU as the activation function to improve the nonlinearity of the learning model, and uses the softmax function in the final layer to classify images.
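
The layer count and the two functions mentioned above can be checked with a short sketch. The channel widths below follow the commonly published VGG16 configuration; the final width of 1000 is the ImageNet default and is an assumption here, since a four-class concrete data set would use a different output size. The variable names are hypothetical.

```python
import numpy as np

# Commonly published VGG16 layout: numbers are conv output channels, "M" is max pooling.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
VGG16_FC = [4096, 4096, 1000]   # last width is the ImageNet default (assumption)

n_conv = sum(1 for layer in VGG16_CONV if layer != "M")
n_weight_layers = n_conv + len(VGG16_FC)

def relu(x):
    """ReLU activation: negative inputs become zero."""
    return np.maximum(x, 0.0)

def softmax(logits):
    """Turn a vector of scores into class probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

probs = softmax(np.array([2.0, 1.0, 0.1, -1.0]))   # four hypothetical class scores
predicted_class = int(np.argmax(probs))
```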

FIGS. 5 to 7 are diagrams illustrating a method of increasing the number of training images included in a training data set using seam carving according to an embodiment of the present invention.

The processor 110 may generate a gradient energy map for each training image included in the training data set. More specifically, for each pixel constituting the training image, the processor 110 may sum the absolute gradient value in the x direction and the absolute gradient value in the y direction for each color channel. The processor 110 may then combine these sums over all color channels identified for the pixel and generate a gradient energy map in which the combined result is the pixel value for that pixel. For example, when the MATLAB platform is used, the processor 110 may use the built-in function "gradients(img)" to calculate the x-direction absolute gradient value Fx and the y-direction absolute gradient value Fy of the training image, and may generate the gradient energy map from the sum of the calculated Fx and Fy.
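
A minimal Python sketch of this computation is given below, using NumPy's `np.gradient` in place of the MATLAB call. It assumes that "combining" the per-channel results means summing them, which the source does not state explicitly; the function name `gradient_energy_map` is hypothetical.

```python
import numpy as np

def gradient_energy_map(img):
    """Return an H x W energy map for an H x W (x C) image.

    Per channel: |dI/dx| + |dI/dy|; channels are then summed (assumed combination).
    """
    img = np.asarray(img, dtype=np.float64)
    if img.ndim == 2:                       # treat grayscale as a single channel
        img = img[:, :, np.newaxis]
    # Derivatives along axis 0 (rows, y direction) and axis 1 (columns, x direction).
    fy, fx = np.gradient(img, axis=(0, 1))
    per_channel = np.abs(fx) + np.abs(fy)
    return per_channel.sum(axis=2)

# Placeholder image with a vertical edge between columns 1 and 2.
img = np.zeros((4, 5, 3))
img[:, 2:, :] = 1.0
energy = gradient_energy_map(img)
```

In this placeholder image the energy is nonzero only in the columns next to the edge, so flat regions are the ones a low-energy seam would pass through.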

As an example, FIGS. 5 to 7 show a method of applying seam carving to an original training image 510 to generate copy training images 530 and 540 that have the same height as the original but a different width. First, in FIG. 5, the processor 110 may extract seam pixels having the minimum pixel value from every row of the gradient energy map 520 corresponding to the original training image 510. To this end, the processor 110 may identify, among the pixels in the first row of the gradient energy map 520, the pixel having the minimum pixel value as the first seam pixel 11.

Thereafter, the processor 110 may select, as the second seam pixel 12, the neighboring pixel having the minimum pixel value among the neighboring pixels 25, 17, and 12 of the second row that are adjacent to the seam pixel 11 of the first row.

In this way, the processor 110 may determine a seam piece composed of seam pixels by repeating, up to the last row of the gradient energy map 520, the step of selecting as the next seam pixel the neighboring pixel having the minimum pixel value among the next-row neighbors of the previously selected seam pixel. As an example, in FIG. 5 the seam piece may consist of (11, 12, 14, 16, 14, 10). In the above example the seam pixels are selected starting from the first row, but this is only an example, and the seam pixels may instead be selected starting from the last row.
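The row-by-row selection described above amounts to a greedy top-down search, which can be sketched as follows. The function name `find_vertical_seam` and the tie-breaking rule (leftmost minimum) are assumptions. Note that this greedy procedure follows the description here and differs from the dynamic-programming formulation commonly used for seam carving.

```python
def find_vertical_seam(energy):
    """Greedily pick one column index per row, from the first row down.

    energy: list of rows of numeric energy values.
    Returns a list `seam` where seam[y] is the column of the seam pixel
    in row y. Ties are resolved in favor of the leftmost candidate.
    """
    first_row = energy[0]
    # First seam pixel: minimum energy over the whole first row.
    col = min(range(len(first_row)), key=first_row.__getitem__)
    seam = [col]
    for row in energy[1:]:
        # Neighbors in the next row: directly below and diagonally adjacent.
        candidates = range(max(col - 1, 0), min(col + 2, len(row)))
        col = min(candidates, key=row.__getitem__)
        seam.append(col)
    return seam
```

For a small hypothetical map such as `[[30, 11, 40, 50], [25, 17, 12, 60], [70, 20, 14, 35]]`, the first seam pixel has value 11, its next-row neighbors are 25, 17, and 12, and the search continues from the pixel with value 12.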

Meanwhile, the processor 110 may use the seam piece determined in this way to generate, from the original training image 510, copy training images 530 and 540 that have the same height but a different width.

As an example, the processor 110 may remove the seam pixels corresponding to the seam piece determined in the gradient energy map 520, as shown in (a) of FIG. 6, thereby generating a copy training image 530 that has the same height but a shorter width, as shown in (b) of FIG. 6.

As another example, the processor 110 may copy the seam pixels corresponding to the seam piece determined in the gradient energy map 520 and add them in the horizontal direction, as shown in (a) of FIG. 7, thereby generating a copy training image 540 that has the same height but a longer width, as shown in (b) of FIG. 7.
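Removing or duplicating the seam pixels can be sketched with two small helpers. The names `remove_seam` and `duplicate_seam` are hypothetical, and the seam is assumed to be given as one column index per row; placing the copy immediately next to the original seam pixel is also an assumption.

```python
def remove_seam(image, seam):
    """Drop the seam pixel from each row, making the image one pixel narrower."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]


def duplicate_seam(image, seam):
    """Repeat the seam pixel in each row, making the image one pixel wider."""
    return [row[:c + 1] + row[c:] for row, c in zip(image, seam)]
```

Each call changes the width by exactly one pixel while leaving the height untouched, so applying either helper repeatedly (with a freshly computed seam each time) would yield copies of several different widths from one original.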

FIGS. 5 to 7 present a method of applying seam carving to the original training image 510 to generate copy training images 530 and 540 that have the same height but a different width. This is only an example, however, and seam carving may also be applied to the original training image 510 to generate copy training images that have the same width but a different height.

To this end, the processor 110 may determine a seam piece by extracting seam pixels having the minimum pixel value from every column of the gradient energy map 520 corresponding to the original training image 510, and may remove or copy the seam pixels corresponding to the determined seam piece, thereby generating copy training images that have the same width but a different height.
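One way to picture the column-wise case is to transpose the image so that columns become rows, carve as in the row-wise case, and transpose back. The sketch below assumes the seam is given as one row index per column; `transpose` and `remove_horizontal_seam` are hypothetical helper names.

```python
def transpose(grid):
    """Swap rows and columns of a rectangular list-of-lists grid."""
    return [list(column) for column in zip(*grid)]


def remove_horizontal_seam(image, seam_rows):
    """Drop one pixel per column, making the image one pixel shorter.

    seam_rows[x] is the row index of the seam pixel in column x.
    """
    columns = transpose(image)
    carved = [col[:r] + col[r + 1:] for col, r in zip(columns, seam_rows)]
    return transpose(carved)
```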

FIG. 8 is a diagram illustrating an image processing technique according to an embodiment of the present invention.

As mentioned above, the processor 110 may apply image processing techniques to the training data set in which the number of training images has been increased.

To this end, in step 810, the processor 110 may extract an object of interest by applying image segmentation to the training image. Image segmentation may be used to convert a training image into a representation that is more meaningful and easier to analyze. As an example, the processor 110 may extract the object of interest by segmenting the training image using a k-means clustering algorithm. This image segmentation algorithm is only an example, and the present invention is not limited thereto.
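As a rough illustration of k-means segmentation, the sketch below clusters pixel colors into k groups and returns a cluster label for each pixel. The function name `kmeans_labels`, the fixed iteration count, and the deterministic choice of initial centers are assumptions; the description does not state how the clusters are initialized or how the cluster containing the object of interest is chosen.

```python
def kmeans_labels(pixels, k=2, iterations=10):
    """Cluster color tuples with plain k-means.

    pixels: flat list of color tuples (for example, RGB).
    Returns (labels, centers). Initial centers are k pixels spread evenly
    over the input list, which is an assumption made for reproducibility.
    """
    step = max(k - 1, 1)
    centers = [pixels[i * (len(pixels) - 1) // step] for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(pixels):
            labels[i] = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centers
```

With k = 2, the two labels can be read as a foreground and background mask, although which label corresponds to the object of interest has to be decided by a separate rule.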

In step 820, the processor 110 may convert the training image from which the object of interest has been extracted to grayscale. Grayscale conversion may be used to remove hue and saturation information from the training image while preserving luminance. By converting the training image to grayscale in this way, the processor 110 may increase the depth of contrast variation in the pixel values and thereby improve the performance of the learning model trained later.
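A luminance-preserving grayscale conversion can be sketched as a weighted sum of the color channels. The weights below are the widely used ITU-R BT.601 luma coefficients, chosen here as an assumption because the description does not say which weighting is used; `to_grayscale` is a hypothetical name.

```python
def to_grayscale(image):
    """Convert an RGB image (rows of (r, g, b) tuples) to luminance values.

    Uses the BT.601 luma weights 0.299, 0.587, 0.114 (an assumption).
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in image
    ]
```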

In step 830, the processor 110 may generate a clean edge map by identifying the most important edge features of the training image that has been converted to grayscale.

Thereafter, in step 840, the processor 110 may supplement the color channels of the training image by performing image binarization on the generated edge map.

Finally, in step 850, the processor 110 may remove noise by applying a median filter to the binarized training image, thereby improving the resolution of the training image.
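Steps 840 and 850 can be illustrated with a fixed-threshold binarization followed by a 3x3 median filter. The threshold rule, the window size, and the border handling (windows are simply truncated at the image edge) are assumptions, as are the names `binarize` and `median_filter_3x3`. `statistics.median_low` is used so that the output of a binary image stays binary even when a truncated window holds an even number of pixels.

```python
from statistics import median_low


def binarize(edge_map, threshold):
    """Map each value to 1 if it reaches the threshold, otherwise 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in edge_map]


def median_filter_3x3(image):
    """Replace each pixel with the (low) median of its 3x3 neighborhood."""
    height, width = len(image), len(image[0])
    filtered = []
    for y in range(height):
        out_row = []
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(y - 1, 0), min(y + 2, height))
                for xx in range(max(x - 1, 0), min(x + 2, width))
            ]
            out_row.append(median_low(window))
        filtered.append(out_row)
    return filtered
```

An isolated bright pixel surrounded by dark pixels is removed by the filter, which is the kind of speckle noise a median filter is typically used against.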

FIG. 9 is a diagram showing the structure of a learning model according to an embodiment of the present invention.

The learning model for image classification provided by the present invention may be used to classify each test image included in the test data set, among the learning data sets, as one of a non-crack concrete image, a surface crack concrete image, a delamination concrete image, and a spalling concrete image.

FIG. 9 shows the architecture of a convolution-based defect classification neural network corresponding to this learning model. The defect classification neural network, which is the learning model provided by the present invention, is reconstructed on VGG16 using an integrated max-average pooling layer together with attention-based network nodes, and may thereby improve the classification accuracy for concrete images.

In general, using a max pooling layer or an average pooling layer alone has the drawback that information in the image may be lost. The integrated max-average pooling layer provided by the present invention, however, can avoid the loss of important information in the image. To this end, the integrated max-average pooling layer evaluates all elements of the pooling region so as to preserve background information while avoiding an increase in variance, and captures only the foreground texture information of the strongest activation as the representative feature of the pooling region.
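The description does not give a formula for the integrated max-average pooling layer. Purely as an illustration of the idea of combining both statistics, the sketch below blends the maximum and the mean of each 2x2 window with a weight `alpha`. The blending rule, the window size, the stride, and the name `max_avg_pool_2x2` are all assumptions and should not be read as the layer actually used in the disclosed network.

```python
def max_avg_pool_2x2(feature_map, alpha=0.5):
    """Pool non-overlapping 2x2 windows with a blend of max and mean.

    Each output value is alpha * max(window) + (1 - alpha) * mean(window).
    A trailing odd row or column is dropped.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for y in range(0, height - 1, 2):
        out_row = []
        for x in range(0, width - 1, 2):
            window = [
                feature_map[y][x], feature_map[y][x + 1],
                feature_map[y + 1][x], feature_map[y + 1][x + 1],
            ]
            mean = sum(window) / 4
            out_row.append(alpha * max(window) + (1 - alpha) * mean)
        pooled.append(out_row)
    return pooled
```

Setting `alpha` to 1 reduces this to ordinary max pooling and setting it to 0 reduces it to average pooling, which makes the trade-off between the strongest activation and the background context explicit.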

Meanwhile, the main purpose of an attention-based network is to recognize multiple objects in an image. To achieve the maximum possible performance, the attention-based network node provided by the present invention may use the max-average pooling technique to implement the network connection that determines the average along the channel axis.

Meanwhile, the method according to the present invention may be written as a program executable on a computer and may be implemented on various recording media such as magnetic storage media, optical reading media, and digital storage media.

Implementations of the various techniques described herein may be realized in digital electronic circuitry, or in computer hardware, firmware, software, or combinations thereof. Implementations may be realized as a computer program product, that is, a computer program tangibly embodied in an information carrier, for example in a machine-readable storage device (computer-readable medium) or in a propagated signal, for processing by, or for controlling the operation of, a data processing apparatus, for example a programmable processor, a computer, or multiple computers. A computer program, such as the computer program(s) described above, may be written in any form of programming language, including compiled or interpreted languages, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program may be deployed to be processed on one computer or on multiple computers at one site, or distributed across multiple sites and interconnected by a communication network.

Processors suitable for processing a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. Elements of a computer may include at least one processor that executes instructions and one or more memory devices that store instructions and data. Generally, a computer may also include, or be coupled to receive data from or transmit data to, or both, one or more mass storage devices for storing data, for example magnetic disks, magneto-optical disks, or optical disks. Information carriers suitable for embodying computer program instructions and data include, by way of example, semiconductor memory devices; magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM (Compact Disk Read Only Memory) and DVD (Digital Video Disk); magneto-optical media such as floptical disks; and ROM (Read Only Memory), RAM (Random Access Memory), flash memory, EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), and the like. The processor and the memory may be supplemented by, or incorporated in, special purpose logic circuitry.

In addition, a computer-readable medium may be any available medium that can be accessed by a computer, and may include both computer storage media and transmission media.

Although this specification contains many specific implementation details, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this specification in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment may also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination may in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.

Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various device components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and devices may generally be integrated together in a single software product or packaged into multiple software products.

Meanwhile, the embodiments of the present invention disclosed in this specification and the drawings merely present specific examples to aid understanding, and are not intended to limit the scope of the present invention. It will be apparent to those of ordinary skill in the art to which the present invention pertains that, in addition to the embodiments disclosed herein, other modifications based on the technical idea of the present invention can be implemented.

100: computing device
110: processor
120: storage
130: memory
140: program
150: display

Claims

An image classification method performed by a computing device, the method comprising:
generating a gradient energy map for each training image included in a training data set;
when resizing the training image in the horizontal direction, extracting seam pixels having a minimum pixel value from every row of the gradient energy map corresponding to the training image;
increasing the number of training images by removing or copying the extracted seam pixels to reduce or enlarge the size of the training image in the horizontal direction; and
training a learning model for image classification using the training data set including the increased training images.

The image classification method of claim 1,
wherein the gradient energy map is determined using a result of summing, for each color channel, the absolute value of the gradient in the x direction and the absolute value of the gradient in the y direction with respect to the pixels constituting the training image.

The image classification method of claim 2,
wherein generating the gradient energy map comprises determining the pixel value of a pixel by combining all of the results of summing the absolute value of the gradient in the x direction and the absolute value of the gradient in the y direction determined for each color channel of that pixel.

delete

The image classification method of claim 1,
wherein extracting the seam pixels having the minimum pixel value comprises:
identifying, as a first seam pixel, a pixel having a minimum pixel value among pixels located in a first row of the gradient energy map corresponding to the training image;
selecting, as a second seam pixel, a neighboring pixel having a minimum pixel value among neighboring pixels of a second row adjacent to the identified first seam pixel of the first row; and
repeatedly selecting a third seam pixel having a minimum pixel value, in the same manner as the second seam pixel is selected, up to the last row of the gradient energy map.

An image classification method performed by a computing device, the method comprising:
generating a gradient energy map for each training image included in a training data set;
when resizing the training image in the vertical direction, extracting seam pixels having a minimum pixel value from every column of the gradient energy map corresponding to the training image;
increasing the number of training images by removing or copying the extracted seam pixels to reduce or enlarge the size of the training image in the vertical direction; and
training a learning model for image classification using the training data set including the increased training images.

The image classification method of claim 6,
wherein extracting the seam pixels having the minimum pixel value comprises:
identifying, as a first seam pixel, a pixel having a minimum pixel value among pixels located in a first column of the gradient energy map corresponding to the training image;
selecting, as a second seam pixel, a neighboring pixel having a minimum pixel value among neighboring pixels of a second column adjacent to the identified first seam pixel of the first column; and
repeatedly selecting a third seam pixel having a minimum pixel value, in the same manner as the second seam pixel is selected, up to the last column of the gradient energy map.

An image classification method comprising:
identifying a test data set including a plurality of test images; and
classifying the test images included in the identified test data set by defect type by applying the identified test data set to a learning model for image classification,
wherein the training images used to train the learning model are increased in number by, when resizing a training image in the horizontal direction, extracting seam pixels having a minimum pixel value from every row of the gradient energy map corresponding to the training image and removing or copying the extracted seam pixels to reduce or enlarge the size of the training image, and
wherein the learning model is trained for image classification using a training data set including the increased training images.

The image classification method of claim 8,
wherein the gradient energy map is determined using a result of summing, for each color channel, the absolute value of the gradient in the x direction and the absolute value of the gradient in the y direction with respect to the pixels constituting the training image.

The image classification method of claim 9,
wherein the gradient energy map is generated by determining the pixel value of a pixel by combining all of the results of summing the absolute value of the gradient in the x direction and the absolute value of the gradient in the y direction determined for each color channel of that pixel.

delete

The image classification method of claim 8,
wherein the seam pixels having the minimum pixel value are extracted by identifying, as a first seam pixel, a pixel having a minimum pixel value among pixels located in a first row of the gradient energy map corresponding to the training image, selecting, as a second seam pixel, a neighboring pixel having a minimum pixel value among neighboring pixels of a second row adjacent to the identified first seam pixel of the first row, and repeatedly selecting a third seam pixel having a minimum pixel value, in the same manner as the second seam pixel is selected, up to the last row of the gradient energy map.

An image classification method comprising:
identifying a test data set including a plurality of test images; and
classifying the test images included in the identified test data set by defect type by applying the identified test data set to a learning model for image classification,
wherein the training images used to train the learning model are increased in number by, when resizing a training image in the vertical direction, extracting seam pixels having a minimum pixel value from every column of the gradient energy map corresponding to the training image and removing or copying the extracted seam pixels to reduce or enlarge the size of the training image, and
wherein the learning model is trained for image classification using a training data set including the increased training images.

The image classification method of claim 13,
wherein the seam pixels having the minimum pixel value are extracted by identifying, as a first seam pixel, a pixel having a minimum pixel value among pixels located in a first column of the gradient energy map corresponding to the training image, selecting, as a second seam pixel, a neighboring pixel having a minimum pixel value among neighboring pixels of a second column adjacent to the identified first seam pixel of the first column, and repeatedly selecting a third seam pixel having a minimum pixel value, in the same manner as the second seam pixel is selected, up to the last column of the gradient energy map.

A computing device for performing an image classification method, the computing device comprising a processor,
wherein the processor is configured to:
generate, for a training data set including a plurality of training images, a gradient energy map for each training image; when resizing the training image in the horizontal direction, extract seam pixels having a minimum pixel value from every row of the gradient energy map corresponding to the training image; increase the number of training images by removing or copying the extracted seam pixels to reduce or enlarge the size of the training image in the horizontal direction; and train a learning model for image classification using the training data set including the increased training images.

The computing device of claim 15,
wherein the gradient energy map is generated by combining, for the pixels constituting the training image, the results of summing the absolute value of the gradient in the x direction and the absolute value of the gradient in the y direction for each color channel, and determining the combined result as the pixel value of the corresponding pixel.

delete

The computing device of claim 15,
wherein the processor is configured to extract the seam pixels having the minimum pixel value by identifying, as a first seam pixel, a pixel having a minimum pixel value among pixels located in a first row of the gradient energy map corresponding to the training image, selecting, as a second seam pixel, a neighboring pixel having a minimum pixel value among neighboring pixels of a second row adjacent to the identified first seam pixel of the first row, and repeatedly selecting a third seam pixel having a minimum pixel value, in the same manner as the second seam pixel is selected, up to the last row of the gradient energy map.

A computing device for performing an image classification method, the computing device comprising a processor,
wherein the processor is configured to:
generate, for a training data set including a plurality of training images, a gradient energy map for each training image; when resizing the training image in the vertical direction, extract seam pixels having a minimum pixel value from every column of the gradient energy map corresponding to the training image; increase the number of training images by removing or copying the extracted seam pixels to reduce or enlarge the size of the training image in the vertical direction; and train a learning model for image classification using the training data set including the increased training images.

The computing device of claim 19,
wherein the processor is configured to extract the seam pixels having the minimum pixel value by identifying, as a first seam pixel, a pixel having a minimum pixel value among pixels located in a first column of the gradient energy map corresponding to the training image, selecting, as a second seam pixel, a neighboring pixel having a minimum pixel value among neighboring pixels of a second column adjacent to the identified first seam pixel of the first column, and repeatedly selecting a third seam pixel having a minimum pixel value, in the same manner as the second seam pixel is selected, up to the last column of the gradient energy map.