KR20200087297A - Defect inspection method and apparatus using image segmentation based on artificial neural network

Info

Publication number: KR20200087297A
Application number: KR1020180171438A
Authority: KR
Inventors: 김정태; 조희연; 김경실
Original assignee: 이화여자대학교 산학협력단
Priority date: 2018-12-28
Filing date: 2018-12-28
Publication date: 2020-07-21
Also published as: KR102166458B1

Abstract

According to the present invention, a defect detection method using an image segmentation based on an artificial neural network comprises the steps of: inputting, by a computer device, a source image including a product object to a learning network; and outputting, by the computer device, a segmented image generated by the learning network. The learning network is a convolutional encoder-decoder, and outputs the segmented image obtained by classifying a defective part in the product object.

Description

DEFECT INSPECTION METHOD AND APPARATUS USING IMAGE SEGMENTATION BASED ON ARTIFICIAL NEURAL NETWORK

The technique described below relates to defect detection based on image processing.

Machine vision has been widely studied because, in applications such as defect inspection, classification, and recognition, it can replace a manual inspector and repeatedly perform high-precision inspection.

Machine learning, which has recently attracted attention, can learn how a manual inspector makes judgments, and is therefore being actively studied in applications where conventional machine vision has been difficult to apply.

Korean Registered Patent No. 10-1688458

The technique described below is intended to provide a method of inspecting products for defects by image segmentation based on an artificial neural network.

A defect detection method using image segmentation based on an artificial neural network includes a step in which a computer device inputs a source image including a product object to a learning network, and a step in which the computer device outputs a segmented image generated by the learning network. The learning network is a convolutional encoder-decoder and outputs the segmented image, in which defective parts of the product object are distinguished.

A defect detection apparatus using image segmentation based on an artificial neural network includes: an input device that receives a source image including a product object; a storage device that stores a plurality of convolutional encoder-decoders, each of which generates output values that distinguish defective parts in an input image; and a computing device that inputs the source image to each of the plurality of convolutional encoder-decoders to generate an output value for each pixel of the source image, classifies a final class for each pixel based on the per-pixel output values of the plurality of convolutional encoder-decoders, and generates an output image based on the final class of each pixel.

The technique described below analyzes an image using a convolutional encoder-decoder and outputs an image in which defective parts are marked. Because the defective parts are displayed, an operator can more conveniently check whether a product is defective and where the defect is located.

FIG. 1 is an example of a defect detection system using image segmentation.
FIG. 2 is an example of a convolutional encoder-decoder.
FIG. 3 is an example of a neural network model for image segmentation.
FIG. 4 is another example of a neural network model for image segmentation.
FIG. 5 is an example of the configuration of a defect detection apparatus using image segmentation.

The technique described below may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail. This is not intended to limit the technique to any specific embodiment, and the technique should be understood to include all modifications, equivalents, and substitutes that fall within its spirit and technical scope.

Terms such as first, second, A, and B may be used to describe various components, but the components are not limited by these terms; the terms are used only to distinguish one component from another. For example, without departing from the scope of the technique described below, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of a plurality of related listed items.

As used in this specification, a singular expression should be understood to include the plural unless the context clearly indicates otherwise. Terms such as "comprises" mean that the stated features, numbers, steps, operations, components, parts, or combinations thereof are present, and should be understood not to exclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Before the drawings are described in detail, it should be made clear that the components in this specification are divided merely according to the main function each component performs. That is, two or more of the components described below may be combined into one component, or one component may be divided into two or more components by more finely subdivided function. In addition to its own main function, each component described below may additionally perform some or all of the functions of other components, and some of the main functions of each component may of course be carried out exclusively by another component.

In performing a method or an operating method, the steps that make up the method may occur in an order different from the stated order unless the context clearly specifies a particular order. That is, the steps may occur in the stated order, may be performed substantially simultaneously, or may be performed in the reverse order.

The technique described below detects product defects using a machine learning model. It is a machine vision technique that inspects product quality by analyzing an image of the product. As is well known, there are many kinds of machine learning models. The technique described below is assumed to analyze images using an artificial neural network.

Hereinafter, the entity that analyzes an image using the artificial neural network is called an image processing device. The image processing device is a device capable of data processing and computation. For example, it may be implemented as a PC, a smart device, a server, or the like. The image processing device processes an input image using an artificial neural network model trained in advance.

The image processing device analyzes an image that includes a product (object). The image to be analyzed is hereinafter called the source image. The image processing device inputs the source image to the artificial neural network to generate an analysis result (classification).

The output of a single artificial neural network may itself indicate whether a product is defective, but the output may also be processed in some way before a final defect decision is made. Accordingly, a device that inspects a product for defects using the result of analyzing the source image, or a value obtained by further processing that result, is called a computer device. The computer device is a concept that includes the image processing device described above.

The computer device may use the artificial neural network to output the analysis result of the source image as an image. For example, the computer device may segment the source image and generate the analysis result as an image. Hereinafter, an image that indicates whether a defect exists or where the defective part is located is called a segmented image. Various artificial neural network models for image segmentation have been studied, and the technique described below may be carried out using any one of them.

The user is the person who inspects products for defects.

FIG. 1 is an example of a defect detection system 100 using image segmentation. The defect detection system 100 of FIG. 1 may be a system installed on a product process line. The cameras 111, 112, and 113 photograph the exterior of the product to be inspected. FIG. 1 shows three cameras 111, 112, and 113 as an example. Separate cameras may be used on a plurality of process lines to perform defect inspection. Alternatively, a plurality of cameras that photograph one product from different viewpoints may be used. The cameras 111, 112, and 113 capture the source images.

FIG. 1 shows a detection server 150 and a detection PC 180 as examples of computer devices that detect defects. The defect detection system 100 may include either the detection server 150 or the detection PC 180, or it may include both.

The detection server 150 receives the source images from the cameras 111, 112, and 113. The detection server 150 may receive the source images over a wired or wireless network. The detection server 150 generates an analysis result indicating whether the product is defective, or a segmented image, and delivers it to the user 10. The user 10 may check the analysis result or the segmented image from the detection server 150 through a user terminal. The user terminal is a device such as a PC, a smart device, or a display device.

The detection PC 180 receives the source images from the cameras 111, 112, and 113. The detection PC 180 may receive the source images over a wired or wireless network. The detection PC 180 generates an analysis result indicating whether the product is defective, or a segmented image, and delivers it to the user 10. The user 20 may check whether the product is defective through the detection PC 180.

The computer device may segment the source image to detect whether a defect exists and where it is located. The computer device may use any of various image segmentation techniques. For convenience of description, the computer device is assumed to use a convolutional encoder-decoder.

FIG. 2 is an example of a convolutional encoder-decoder. The convolutional encoder-decoder consists of a convolutional encoder and a convolutional decoder, which mirror each other in structure. The convolutional encoder-decoder reduces the size of the feature map extracted from the source image and then enlarges it back to the size of the source image, assigning a class to each pixel of the source image.

The convolutional encoder has a plurality of layers. FIG. 2 shows a convolutional encoder with five layers as an example. One layer has a convolutional layer and a pooling layer. Although FIG. 2 labels it only as a convolutional layer, the layer may further include a batch normalization layer and a nonlinear activation layer in addition to the convolutional layer. In that case, one layer may be ordered as convolutional layer, batch normalization layer, nonlinear activation layer. This sequence may be repeated several times; FIG. 2 shows it repeated twice. That is, one layer has the structure (i) (convolutional layer, batch normalization layer, nonlinear activation layer) × n repetitions + (ii) pooling layer.

The convolutional layer outputs a feature map by applying a convolution operation to the input image. The filter that performs the convolution operation is also called a kernel. The size of the filter is called the filter size or kernel size. The parameters that make up the kernel are called kernel parameters, filter parameters, or weights. In the convolutional layer, several different filters may be applied to one input.

The convolutional layer performs the convolution operation on a specific region of the input image. This region is called a window. The window can move one cell at a time from the top left to the bottom right of the image, and the distance it moves at each step can be adjusted; this distance is called the stride. The convolutional layer moves the window across the input image and performs the convolution operation on every region of the input image. The convolutional layer may also pad the border of the image so that the dimensions of the input image are preserved after the convolution operation.
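The window, stride, and padding described above can be sketched in plain Python. This is only an illustration: the function name `conv2d`, the use of zero padding, and the 3×3 all-ones kernel are assumptions made here and are not taken from this document. As in most neural network libraries, the kernel is applied without flipping.

```python
def conv2d(image, kernel, stride=1, pad=0):
    """Slide `kernel` over a zero-padded 2D `image` and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    # Zero padding around the border (an assumed padding scheme).
    padded = [[0.0] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    for r in range(h):
        for c in range(w):
            padded[r + pad][c + pad] = image[r][c]
    out = []
    # The window's top-left corner advances by `stride` cells each step.
    for r in range(0, h + 2 * pad - kh + 1, stride):
        row = []
        for c in range(0, w + 2 * pad - kw + 1, stride):
            # Weighted sum of the values inside the current window.
            row.append(sum(kernel[i][j] * padded[r + i][c + j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
same = conv2d(image, box, stride=1, pad=1)     # 4x4: padding preserves the input size
strided = conv2d(image, box, stride=2, pad=1)  # 2x2: stride 2 halves each side
```

With a 3×3 kernel, a padding of 1 and a stride of 1 keep the 4×4 input size, while a stride of 2 halves each side.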

The pooling layer subsamples the feature map produced by the convolutional layer. Pooling operations include max pooling and average pooling. Max pooling selects the largest value within the window. Average pooling takes the mean of the values within the window. In pooling, the stride is usually set equal to the window size.
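A minimal sketch of the two pooling operations follows, using a stride equal to the window size by default. The function name `pool2d` and the sample feature map are illustrative assumptions.

```python
def pool2d(fmap, size=2, stride=None, mode="max"):
    """Subsample a 2D feature map with max or average pooling."""
    stride = size if stride is None else stride  # usual choice: stride == window size
    reduce_window = max if mode == "max" else (lambda vals: sum(vals) / len(vals))
    out = []
    for r in range(0, len(fmap) - size + 1, stride):
        row = []
        for c in range(0, len(fmap[0]) - size + 1, stride):
            window = [fmap[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(reduce_window(window))
        out.append(row)
    return out


fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 7, 2],
        [3, 2, 4, 6]]
max_pooled = pool2d(fmap, mode="max")  # [[4, 5], [3, 7]]
avg_pooled = pool2d(fmap, mode="avg")  # [[2.5, 2.0], [1.5, 4.75]]
```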

The nonlinear operation layer determines the output value of each neuron (node). The nonlinear operation layer uses a transfer function. Transfer functions include the ReLU and sigmoid functions.
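For reference, the two transfer functions named above can be written as follows, using their standard definitions. The function names are chosen here for illustration.

```python
import math


def relu(x):
    """ReLU: passes positive values through and clips negatives to zero."""
    return max(0.0, x)


def sigmoid(x):
    """Sigmoid: squashes any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


activated = [relu(v) for v in (-2.0, 0.5, 3.0)]  # [0.0, 0.5, 3.0]
```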

The convolutional encoder generates a feature map for the source image.

The convolutional decoder generates an image from the feature map produced by the convolutional encoder. The convolutional decoder consists of a plurality of layers. FIG. 2 shows a convolutional decoder with five layers as an example.

One layer has an upsampling layer and a deconvolutional layer. The deconvolutional layer performs the reverse operation of the convolutional layer of the convolutional encoder. In the deconvolutional layer, the sequence of convolutional layer, batch normalization layer, and nonlinear activation layer may be repeated. Because the convolutional decoder is symmetric with the convolutional encoder, it has the same number of convolutional layers and the same convolutional filter size and number of filters as the convolutional encoder.

The upsampling layer performs the reverse operation of the pooling layer. In contrast to the pooling layer, it enlarges the dimensions of its input.
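The text does not state how the upsampling is performed. The sketch below assumes nearest-neighbor repetition, which is one common choice; the function name `upsample_nearest` is hypothetical.

```python
def upsample_nearest(fmap, factor=2):
    """Enlarge a 2D feature map by repeating each value `factor` times per axis.

    Nearest-neighbor repetition is an assumption; the document only says that
    upsampling enlarges the dimensions, the reverse of pooling.
    """
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]   # repeat along columns
        out.extend(list(wide) for _ in range(factor))    # repeat along rows
    return out


upsampled = upsample_nearest([[1, 2], [3, 4]])
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```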

The deconvolutional layer performs the reverse operation of the convolutional layer. It performs the convolution operation in the direction opposite to that of the convolutional layer. The deconvolutional layer receives a feature map as input and generates an output image by a convolution operation using a kernel. With a stride of 1, the deconvolutional layer outputs an image whose width and height are the same as those of the feature map. With a stride of 2, the deconvolutional layer outputs an image half the width and height of the feature map.

The convolutional decoder may end with a softmax layer (softmax classifier).

FIG. 3 is an example of a neural network model for image segmentation. FIG. 3 is an example of a neural network model that includes one convolutional encoder-decoder. The learning network N of FIG. 3 may correspond to the convolutional encoder-decoder of FIG. 2.

The convolutional encoder extracts features from the source image and outputs a feature map. The convolutional decoder receives the feature map and outputs a segmented image. Image segmentation using the convolutional encoder-decoder is briefly described below.

The convolutional encoder-decoder can classify, pixel by pixel, which class each pixel belongs to. The last layer of the convolutional encoder-decoder has a size of (width of the input image) × (height of the input image) × (number of classes). Each element of the last layer holds the classification information of the (i,j)-th pixel for class k.

The softmax layer applies softmax to each pixel. An example of the softmax formula is given in Equation 1 below.

[Equation 1]
\[ \mathrm{output}(i,j,k) = \frac{\exp(\mathrm{input}(i,j,k))}{\sum_{k'} \exp(\mathrm{input}(i,j,k'))} \]

Here, input is the input to the softmax layer, and input(i,j,k) is the classification information of the (i,j)-th pixel for class k. output is the output of the softmax layer, and output(i,j,k) is the classification information of the (i,j)-th pixel for class k.

For each pixel, summing the output over the class dimension gives 1:

\[ \sum_{k} \mathrm{output}(i,j,k) = 1 \]

The output of the softmax layer is the classification probability of each class for each pixel. Each pixel may then be assigned to the class with the highest probability.
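The per-pixel softmax and the highest-probability classification can be sketched as follows. The nested-list layout `scores[i][j][k]`, the function names, and the sample scores are assumptions made for illustration; subtracting the maximum score before exponentiation is a standard numerical safeguard that does not change the result.

```python
import math


def pixel_softmax(scores):
    """Apply softmax over the class axis of scores[i][j][k] for every pixel."""
    out = []
    for row in scores:
        out_row = []
        for logits in row:
            peak = max(logits)  # subtracted for numerical stability only
            exps = [math.exp(v - peak) for v in logits]
            total = sum(exps)
            out_row.append([e / total for e in exps])
        out.append(out_row)
    return out


def classify(probs):
    """Assign each pixel the index of its most probable class."""
    return [[max(range(len(p)), key=p.__getitem__) for p in row] for row in probs]


scores = [[[2.0, 0.0], [0.0, 3.0]]]  # a 1x2 image with two classes
probs = pixel_softmax(scores)
labels = classify(probs)             # [[0, 1]]
```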

Like other neural network models, the convolutional encoder-decoder has a loss function. During training or updating, the weights of the neural network can be optimized with respect to the loss function. For example, the weights may be optimized by the gradient descent method. The computer device computes, for the defined loss function, the gradient d_parameter of each parameter to be learned. The computer device can optimize the weights by finding values that minimize the gradient. The computer device may update the value of each parameter using its gradient. The updated parameter parameter_new may be determined as parameter_new = parameter_prev - learnRate * d_parameter. Here, the learning rate (learnRate) is a value set manually when the model is trained.
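The update rule parameter_new = parameter_prev - learnRate * d_parameter can be illustrated on a toy problem. The quadratic loss (w - 3)^2 and the function name `sgd_step` are assumptions chosen only to make the example self-contained; they are not the loss used in this document.

```python
def sgd_step(params, grads, learn_rate):
    """Apply parameter_new = parameter_prev - learnRate * d_parameter elementwise."""
    return [p - learn_rate * g for p, g in zip(params, grads)]


# Toy loss: loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = [0.0]
for _ in range(100):
    grad = [2.0 * (w[0] - 3.0)]
    w = sgd_step(w, grad, learn_rate=0.1)
# w[0] approaches 3, where the gradient is zero and the loss is minimal.
```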

The loss function may consist of two parts: a cross-entropy term and a regularization term. The loss function may be the sum of the two. Specifically, the loss function may include the total cross-entropy loss, determined from the cross entropy of each pixel, and a regularization term that drives the softmax outputs of neighboring pixels for the same class to be similar within a reference range.

The cross-entropy part may be expressed as Equation 2 below.

In Equation 2, w_k is the weight for the k-th class, t(i,j,k) is the membership information of pixel (i,j) for the k-th class, and p(i,j,k) is the softmax output score of pixel (i,j) for the k-th class.

The value obtained by multiplying t(i,j,k)*log(p(i,j,k)) by the class weight w_k is the cross-entropy loss of the pixel at position (i,j) for class k. The weight w_k either assigns a higher weight to the class for which a missed detection is more costly, or compensates for the imbalance in the number of pixels belonging to each class in the training data set.

Equation 2 sums the loss values over the classes and averages them over the pixel positions to obtain the total cross-entropy loss.
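Equation 2 itself is not reproduced in this text, so the following is one reading of the description above rather than the exact formula: the per-class values w_k * t(i,j,k) * log(p(i,j,k)) are summed over classes and averaged over pixel positions. The leading minus sign (the usual cross-entropy convention), the small `eps` guard, and the function name are assumptions.

```python
import math


def weighted_cross_entropy(t, p, w, eps=1e-12):
    """Class-weighted cross entropy, averaged over pixels.

    t[i][j][k]: membership (1 if pixel (i,j) belongs to class k, else 0)
    p[i][j][k]: softmax score of pixel (i,j) for class k
    w[k]:       weight of class k
    The minus sign and `eps` are assumptions not stated in the text.
    """
    total, n_pixels = 0.0, 0
    for t_row, p_row in zip(t, p):
        for t_px, p_px in zip(t_row, p_row):
            total -= sum(wk * tk * math.log(pk + eps)
                         for wk, tk, pk in zip(w, t_px, p_px))
            n_pixels += 1
    return total / n_pixels


t = [[[1, 0], [0, 1]]]          # pixel 0 is class 0 (normal), pixel 1 is class 1 (defect)
p = [[[0.9, 0.1], [0.2, 0.8]]]
loss = weighted_cross_entropy(t, p, w=[1.0, 5.0])  # defect class weighted 5x
```

Raising the weight of the defect class makes a low score on a true defect pixel contribute more to the loss, which matches the stated purpose of w_k.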

The regularization term may be expressed as Equation 3 below.

In Equation 3, w_k is the weight for the k-th class, t(i,j,k) is the membership information of pixel (i,j) for the k-th class, p(i,j,k) is the softmax output score of pixel (i,j) for the k-th class, and regularizer is a hyperparameter that determines the contribution of loss_{regularizationTerm} to the loss.

The regularization term computes a loss by squaring the difference between the softmax outputs of neighboring pixels for the same class, so that those outputs become similar. That is, the regularization term encourages adjacent pixels to produce similar values for the same class. The range of pixels treated as adjacent can be adjusted. In Equation 3, pixel (i,j) is adjusted to produce values similar to those of its directly adjacent pixels (i-1, j) and (i, j-1). The degree of similarity may also be determined by a formula other than Equation 3. In the end, it is sufficient that the regularization term keeps the softmax output of a given pixel from differing greatly from that of its adjacent pixels for the same class.
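Equation 3 is likewise not reproduced here. The sketch below is a hedged guess at a term with the described behavior: squared differences between p(i,j,k) and its neighbors p(i-1,j,k) and p(i,j-1,k), scaled by `regularizer`. Multiplying by w_k and averaging over pixels are assumptions, and t(i,j,k) is omitted because its role in Equation 3 cannot be recovered from the text.

```python
def smoothness_term(p, w, regularizer):
    """Penalize differences between neighboring pixels' scores for the same class.

    p[i][j][k]: softmax score of pixel (i,j) for class k
    w[k]:       class weight (its use here is an assumption)
    """
    height, width = len(p), len(p[0])
    total = 0.0
    for i in range(height):
        for j in range(width):
            for k, wk in enumerate(w):
                if i > 0:  # neighbor above: pixel (i-1, j)
                    total += wk * (p[i][j][k] - p[i - 1][j][k]) ** 2
                if j > 0:  # neighbor to the left: pixel (i, j-1)
                    total += wk * (p[i][j][k] - p[i][j - 1][k]) ** 2
    return regularizer * total / (height * width)


flat = [[[0.5, 0.5], [0.5, 0.5]]]     # identical neighbors: no penalty
abrupt = [[[1.0, 0.0], [0.0, 1.0]]]   # neighbors disagree completely
penalty_flat = smoothness_term(flat, [1.0, 1.0], regularizer=0.5)      # 0.0
penalty_abrupt = smoothness_term(abrupt, [1.0, 1.0], regularizer=0.5)  # 0.5
```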

In FIG. 3, the learning network N may output a segmented image based on the classification information (output value) of each pixel. For example, the learning network N may assign the same color to pixels with the same output value. The segmented image is thus an image divided into regions according to the classification values.
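Assigning one color per class can be sketched as a simple lookup. The palette (black for normal, red for defect), the class indices, and the function name are hypothetical choices for illustration.

```python
# Hypothetical palette: class 0 = normal (black), class 1 = defect (red).
PALETTE = {0: (0, 0, 0), 1: (255, 0, 0)}


def to_segmented_image(class_map, palette=PALETTE):
    """Map each pixel's class index to an RGB color."""
    return [[palette[c] for c in row] for row in class_map]


class_map = [[0, 0, 1],
             [0, 1, 1]]
segmented = to_segmented_image(class_map)
```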

FIG. 4 is another example of a neural network model for image segmentation. FIG. 4 is an example of a neural network model that includes a plurality of learning networks (N1, N2, ..., Nn). In FIG. 4, each individual learning network (N1, N2, ..., or Nn) performs an operation identical or similar to that of the learning network of FIG. 3. However, each individual learning network outputs the classification information (output value) for each pixel of the source image, and the segmented image may be generated from the values output by the plurality of learning networks.

Each individual learning network (N1, N2, ..., or Nn) has the configuration described with reference to FIG. 3. Accordingly, each individual learning network generates, for the input source image, an output value representing the classification information of each pixel.

The plurality of learning networks (N1, N2, ..., Nn) may use different pooling sizes and stride sizes in their pooling layers, so that the feature maps output by their encoders differ in size. The purpose is to detect product defects of various sizes using the plurality of learning networks.

The final output value (score) of each pixel may be determined from the outputs of the plurality of learning networks (N1, N2, ..., Nn) using a kind of bagging technique. For example, the computer device treats the softmax output of each learning network as that network's score for each class at each pixel, and may compute the final score for each class using any one of criteria such as the maximum, median, mode, arithmetic mean, harmonic mean, geometric mean, or trimmed mean.
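The aggregation across networks can be sketched as follows, with the combining criterion passed in as a function. The layout `network_outputs[n][i][j][k]`, the function name, and the sample scores are assumptions; only the maximum, median, and arithmetic mean are shown.

```python
from statistics import mean, median


def combine_scores(network_outputs, reducer=mean):
    """Combine per-network softmax scores into one final score per pixel and class.

    network_outputs[n][i][j][k]: score from network n for pixel (i,j), class k.
    `reducer` is the combining criterion, e.g. max, median, or mean.
    """
    first = network_outputs[0]
    height, width, n_classes = len(first), len(first[0]), len(first[0][0])
    return [[[reducer([net[i][j][k] for net in network_outputs])
              for k in range(n_classes)]
             for j in range(width)]
            for i in range(height)]


# Three networks scoring a 1x1 image over two classes (normal, defect).
outputs = [[[[0.9, 0.1]]], [[[0.6, 0.4]]], [[[0.2, 0.8]]]]
by_max = combine_scores(outputs, max)        # [[[0.9, 0.8]]]
by_median = combine_scores(outputs, median)  # [[[0.6, 0.4]]]
by_mean = combine_scores(outputs, mean)
```

The final class of each pixel can then be taken as the class with the highest combined score.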

The computer device may generate the segmented image based on the final score for each pixel. It may output the segmented image by assigning each pixel a pixel value of a specific color according to that pixel's final score.
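One way to picture this step is a lookup from the highest-scoring class of each pixel to a color. The sketch below assumes two classes and a hypothetical palette (black for normal, red for defect); the specification does not prescribe particular colors or data layouts.

```python
# Hypothetical palette: class index -> RGB color.
PALETTE = {0: (0, 0, 0), 1: (255, 0, 0)}


def render_segmentation(score_map):
    """Build a segmented image from final per-class scores.

    score_map[i][j] is the list of final scores of pixel (i, j),
    one entry per class. Returns rows of RGB tuples.
    """
    image = []
    for row in score_map:
        out_row = []
        for scores in row:
            # Pick the class with the highest final score for this pixel.
            cls = max(range(len(scores)), key=scores.__getitem__)
            out_row.append(PALETTE[cls])
        image.append(out_row)
    return image


# A 1x2 example: the first pixel is normal, the second is a defect.
segmented = render_segmentation([[[0.8, 0.2], [0.1, 0.9]]])
print(segmented)
```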

FIG. 5 shows an example configuration of a defect detection device 200 using image segmentation. The defect detection device 200 corresponds to the computer device described above, and FIG. 5 illustrates the configuration of the detection server 150 or the detection PC 180 of FIG. 1.

The defect detection device 200 generates the source image analysis result or the segmented image described above. It may be physically implemented in various forms. For example, the defect detection device 200 may take the form of a computer device such as a PC, a network server, or a chipset dedicated to image processing. The computer device may include a mobile device such as a smart device.

The defect detection device 200 includes a storage device 210, a memory 220, a computing device 230, an interface device 240, and a communication device 250.

The storage device 210 stores a neural network model for image processing. For example, the storage device 210 may store a learning network as shown in FIG. 3 and/or learning networks as shown in FIG. 4. The storage device 210 may also store programs or source code required for image processing, as well as the input source image and the generated segmented image.

The memory 220 may store the source image received by the defect detection device 200, as well as data and information generated while the segmented image is produced.

The interface device 240 receives commands and data from the outside. It may receive a source image from a physically connected input device or an external storage device. The interface device 240 may receive various neural network models for image processing, and may also receive training data, information, and parameter values for generating a neural network model.

The communication device 250 is a component that receives and transmits information over a wired or wireless network. The communication device 250 may receive a source image from an external object. It may also receive various neural network models and data for model training. The communication device 250 may transmit the generated segmented image to an external object.

The communication device 250 and the interface device 240 each receive data or commands from the outside. Either the communication device 250 or the interface device 240 may be referred to as an input device.

The computing device 230 uses the neural network model or program stored in the storage device 210 to generate information for defect detection or a segmented image. The computing device 230 may train the neural network model used in image processing with given training data. The computing device 230 may generate a segmented image for the source image using the neural network constructed through the process described above. The computing device 230 may be a device that processes data and performs computation, such as a processor, an application processor (AP), or a chip with an embedded program.

In addition, the defect detection method or segmented image generation method described above may be implemented as a program (or application) containing an algorithm executable on a computer. The program may be provided stored on a non-transitory computer readable medium.

A non-transitory readable medium is a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data only briefly, such as a register, cache, or memory. Specifically, the various applications or programs described above may be provided stored on a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB device, memory card, or ROM.

The present embodiment and the drawings attached to this specification merely illustrate part of the technical idea included in the technology described above. Modifications and specific embodiments that a person skilled in the art could readily infer within the scope of the technical idea contained in the specification and drawings are evidently all included in the scope of rights of the technology described above.

100: defect detection system
111, 112, 113: camera
150: detection server
180: detection PC
10, 20: user
200: image processing device
210: storage device
220: memory
230: computing device
240: interface device
250: communication device

Claims

A defect detection method using image segmentation based on an artificial neural network, comprising:
inputting, by a computer device, a source image including a product object to a learning network; and
outputting, by the computer device, a segmented image generated by the learning network,
wherein the learning network is a convolutional encoder-decoder and outputs the segmented image in which a defective part of the product object is classified.

The method of claim 1,
wherein the encoder extracts features in units of pixels to generate a feature map, and the decoder receives the feature map as input and outputs the segmented image.

The method of claim 1,
wherein the loss function of the learning network includes a total cross-entropy loss value determined from the cross entropy of each pixel, and a regularization term that regularizes the results of the softmax function so that, for the same class, neighboring pixels take similar values within a reference value range.

The method of claim 1,
wherein the loss function of the learning network is the sum of a first loss function (loss_crossEntropy) and a second loss function (loss_regularizationTerm), and is determined by the following formula:

,

(where w_k is the weight for the k-th class, t(i,j,k) is the membership information of pixel (i,j) for the k-th class, p(i,j,k) is the softmax output score of pixel (i,j) for the k-th class, and regularizer is a hyperparameter that determines the weight of the loss value of loss_regularizationTerm)

A defect detection method using image segmentation based on an artificial neural network, comprising:
inputting, by a computer device, a source image including a product object to a plurality of learning networks;
outputting, by the plurality of learning networks, an output value for each pixel; and
classifying, by the computer device, a final class for each pixel based on the output values for each pixel output by the plurality of learning networks,
wherein each learning network is a convolutional encoder-decoder and outputs the output value for classifying a defective part of the product object.

The method of claim 5,
wherein the loss function of the learning network includes a total cross-entropy loss value determined from the cross entropy of each pixel, and a regularization term that regularizes the results of the softmax function so that, for the same class, neighboring pixels take similar values within a reference value range.

The loss function of the learning network is the sum of a first loss function (loss_crossEntropy) and a second loss function (loss_regularizationTerm), and is determined by the following formula:

,

The method of claim 5,
wherein the computer device determines the final score for each pixel as at least one of the maximum, median, mode, arithmetic mean, harmonic mean, geometric mean, and trimmed mean of the output values for that pixel output by the plurality of learning networks, and classifies the final class according to the final score.

The method of claim 5,
wherein the computer device generates a segmented image based on the final class for each pixel.

A defect detection apparatus using image segmentation based on an artificial neural network, comprising:
an input device that receives a source image including a product object;
a storage device that stores a plurality of convolutional encoder-decoders, each generating output values for distinguishing a defective part in an input image; and
a computing device that inputs the source image to each of the plurality of convolutional encoder-decoders to generate output values for each pixel of the source image, classifies a final class for each pixel based on the per-pixel output values output by the plurality of convolutional encoder-decoders, and generates an output image based on the final class for each pixel.

The apparatus of claim 10,
wherein the plurality of convolutional encoder-decoders differ from one another in the size of the feature map output by the encoder.

The apparatus of claim 10,
wherein the loss function of each convolutional encoder-decoder is the sum of a first loss function (loss_crossEntropy) and a second loss function (loss_regularizationTerm), and is determined by the following formula:

,

The apparatus of claim 10,
wherein the computing device determines the final score for each pixel as at least one of the maximum, median, mode, arithmetic mean, harmonic mean, geometric mean, and trimmed mean of the output values for that pixel output by the plurality of convolutional encoder-decoders, and classifies the final class according to the final score.