KR102342495B1

KR102342495B1 - Method and Apparatus for Creating Labeling Model with Data Programming

Info

Publication number: KR102342495B1
Application number: KR1020210032098A
Authority: KR
Inventors: 이태훈; 김도형; 노유지; 태기현; 허건; 황의종
Original assignee: 에스케이텔레콤 주식회사; 한국과학기술원
Priority date: 2019-04-25
Filing date: 2021-03-11
Publication date: 2021-12-22
Also published as: KR20210031444A

Abstract

Disclosed are a method and apparatus for generating a labeling model based on data programming.
According to this embodiment, there are provided a method and apparatus for generating a labeling model that preprocess image information of a part to produce processed images, and that prototype labeling functions on the basis of a general framework that appropriately combines various existing image processing techniques.

Description

Method and Apparatus for Creating Labeling Model with Data Programming

This embodiment relates to a method and apparatus for generating a labeling model based on data programming.

The content described below merely provides background information related to this embodiment and does not constitute prior art.

Supervised learning in machine learning generally requires labeled data. Moreover, as feature engineering has advanced on the basis of deep learning, securing large amounts of labeled data has become an even more important task.

Existing data labeling techniques include self-learning and active learning. Self-learning is a semi-supervised learning technique that assumes part of the data is already labeled and uses that labeled data as fully as possible to predict labels for the remaining unlabeled data. Because it assumes that enough labeled data already exists, it cannot be used when no labels are available. Active learning minimizes human labeling: rather than labeling all the data, it selectively labels only the data most likely to improve model accuracy. Because it assumes that enough domain experts who can label perfectly are available, it cannot be used in new applications where such personnel are hard to find.

Recently, data programming, which can generate labels in large quantities, has been proposed and is being used in a growing number of applications. Domain experts and non-experts alike develop multiple labeling functions (LFs) that generate labels automatically, and the opinions of those functions are aggregated into a single generative model. When the outputs of the labeling functions pass through the generative model, a weak label is determined for each unlabeled data item. Here, a weak label is a label that is not as accurate as one created manually by a person but is generated automatically and has an accuracy above a certain threshold. Learning through data programming concludes by training a discriminative model, such as a convolutional neural network (CNN), on the weak labels obtained in this way. Unlike self-learning, data programming requires no existing labels, and unlike active learning, it requires no manual labeling. Implementing labeling functions makes it possible to generate weak labels in bulk, but the current approach has a limitation: it presumes that personnel capable of implementing labeling functions are available, and it is used mainly on text data, for which labeling functions are relatively simple.

An object of this embodiment is to provide a method and apparatus for generating a labeling model that preprocess image information of a part to produce processed images, and that prototype labeling functions on the basis of a general framework that appropriately combines various existing image processing techniques.

Another object of this embodiment is to provide a method and apparatus for generating a labeling model that, when a labeling user interface (UI) is available, use the labeling UI under a crowdsourcing technique to easily mark defects visible in an image and automatically convert the marked defect patterns into respective labeling functions.

According to one aspect of this embodiment, there is provided an apparatus for generating a labeling model, comprising: an image collecting unit that collects part image information from an image capturing device; a data preprocessing unit that generates processed image information by applying a preprocessing technique to the part image information; a labeling function generating unit that determines whether a part is defective by applying an image processing technique or a crowdsourcing technique to the processed image information, and generates one or more labeling functions according to the determination result; and a model generating unit that generates a labeling model for obtaining weak labels by combining the labeling functions, wherein the model generating unit combines the labeling functions based on a development dataset and uses a different type of labeling model depending on the number of labeling functions.

According to another aspect of this embodiment, there is provided a method by which an apparatus for generating a labeling model generates a labeling model, comprising: an image collecting process of collecting part image information from an image capturing device; a data preprocessing process of generating processed image information by applying a preprocessing technique to the part image information; a labeling function generating process of determining whether a part is defective by applying an image processing technique or a crowdsourcing technique to the processed image information, and generating one or more labeling functions according to the determination result; and a model generating process of generating a labeling model for obtaining weak labels by combining the labeling functions, wherein the model generating process combines the labeling functions based on a development dataset and uses a different type of labeling model depending on the number of labeling functions.

As described above, this embodiment has the effect of providing a method and apparatus for generating a labeling model that preprocess image information of a part to produce processed images, and that prototype labeling functions on the basis of a general framework that appropriately combines various existing image processing techniques.

This embodiment also has the effect of providing a method and apparatus for generating a labeling model that, when a labeling user interface (UI) is available, use the labeling UI under a crowdsourcing technique to easily mark defects visible in an image and automatically convert the marked defect patterns into respective labeling functions.

In addition, this embodiment has the effect of reducing the cost or time required for labeling through data programming that includes an image processing technique or a crowdsourcing technique.

FIG. 1 is a conceptual diagram schematically illustrating a labeling model generating system according to this embodiment.
FIG. 2 is a block diagram schematically showing the structure of a labeling model generating apparatus according to this embodiment.
FIG. 3 is a block diagram illustrating the operating process of the labeling model generating apparatus according to this embodiment.
FIG. 4 is a conceptual diagram schematically illustrating a labeling UI according to this embodiment.
FIG. 5 is a flowchart illustrating a method for generating a labeling model according to this embodiment.
FIG. 6 is a flowchart illustrating a method of applying the image processing technique according to this embodiment.
FIG. 7 is a flowchart illustrating a method of applying the crowdsourcing technique according to this embodiment.

Hereinafter, this embodiment is described in detail with reference to the accompanying drawings. In describing the embodiments, detailed descriptions of related well-known configurations or functions are omitted where they could obscure the gist of the present invention. The accompanying drawings are provided only to make the embodiments disclosed in this specification easier to understand; they do not limit the technical idea disclosed herein, which should be understood to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

In this embodiment, the term "comprising" indicates that the components, features, steps, or combinations thereof described in the specification are present; it should be understood as not excluding in advance the possible presence of one or more other components, features, steps, or combinations thereof. In addition, when a component is described as being "connected," "coupled," or "joined" to another component, that component may be directly connected or joined to the other component, but it should be understood that yet another component may also be "connected," "coupled," or "joined" between them.

FIG. 1 is a conceptual diagram schematically illustrating a labeling model generating system according to this embodiment.

Referring to FIG. 1, the labeling model generating system according to this embodiment includes a target part 110, a labeling model generating apparatus 120, and a user terminal 130. The components of the labeling model generating system are not necessarily limited to these.

The target part 110 is photographed by an image capturing device. The target part 110 includes the various parts installed in machinery and equipment in the high-tech manufacturing industry. The image capturing device photographs the target part 110 under uniform illuminance, position, and shooting distance to generate part image information. The image capturing device refers to a device that includes optical modules, such as a camera and lighting, for acquiring part image information.

The labeling model generating apparatus 120 collects part image information from the image capturing device. Here, part image information means image information of the target part 110 photographed by the image capturing device. The labeling model generating apparatus 120 typically collects part image information from the image capturing device, but it may also receive part image information through a user's operation or command, or load part image information stored in a separate database.

The labeling model generating apparatus 120 processes the collected part image information to generate processed image information. Here, processed image information is an image obtained by applying various preprocessing techniques to the part image information; the preprocessing techniques are described later with reference to FIG. 2.

The labeling model generating apparatus 120 generates labeling functions based on the processed image information. More specifically, the labeling model generating apparatus 120 generates labeling functions by applying either an image processing technique or a crowdsourcing technique to the processed image information according to the user's selection information. Here, the user's selection information reflects the user's choice of whichever technique is more advantageous in the user's situation.

For example, in a situation where high-quality labels from domain experts are hard to obtain, it is advantageous to choose the image processing technique when an image processing expert is available and the crowdsourcing technique when no such expert is available, so the user selects one of the two techniques according to the user's own situation.

The labeling model generating apparatus 120 combines the generated labeling functions to generate a generative model or a voting model. The generative model and the voting model are described later with reference to FIG. 2.

The labeling model generating apparatus 120 generates weak labels using the generative model or the voting model. Here, a weak label is a label that is not as accurate as one created manually by a user but is generated automatically and has an accuracy above a certain threshold. Using weak labels therefore lets the user gain efficiency in both cost and time. The labeling model generating apparatus 120 transmits the generated weak labels to the user terminal 130.

The user terminal 130 may train a discriminative model using the weak labels. The discriminative model is preferably trained using a convolutional neural network (CNN), but is not necessarily limited thereto.

The user terminal 130 refers to an electronic device capable of receiving various web page data over a network in response to the user's key operations. The user terminal 130 may be any one of a tablet PC, a laptop, a personal computer (PC), a smartphone, a personal digital assistant (PDA), a mobile communication terminal, and the like. The user terminal 130 may include a memory for storing a web browser and programs for accessing the labeling model generating apparatus 120 over the network, a microprocessor for executing the programs to perform computation and control, and the like.

FIG. 2 is a block diagram showing the structure of the labeling model generating apparatus according to this embodiment.

Referring to FIG. 2, the labeling model generating apparatus 120 according to this embodiment includes an image collecting unit 210, a data preprocessing unit 220, a labeling function generating unit 230, and a model generating unit 240. These components reflect one embodiment; they are not all essential to reproducing this embodiment, and some components may be added, changed, or removed.

The image collecting unit 210 collects part image information from the image capturing device. Here, part image information is image information obtained by photographing various target parts 110; it may also be loaded from a separately provided database (not shown) or an external cloud server (not shown), or received as user input. The image collecting unit 210 may collect the part image information and store it as image data, and the labeling model generating apparatus 120 may use the image data to determine whether the target part 110 is defective.

The data preprocessing unit 220 generates processed image information by applying a preprocessing technique to the part image information. Here, processed image information means image information from which noise in the part image information has been removed so that it is easier to use for data programming. The data preprocessing unit 220 generates the processed image information using a light reflection correction method, a region-of-interest setting method, or a contrast enhancement method.

The light reflection correction method applies when, in the part image information, the pixel-value differences that light reflection causes across the entire image are larger than the pixel-value differences caused by a defect in the target part 110. In that case, the reflection is corrected by subtracting the average pixel value computed over all of the part image information.
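The correction described above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the source does not say whether the average is a single scalar or a per-pixel mean image, so the per-pixel reading is an assumption here, and the function name `correct_reflection` is hypothetical.

```python
import numpy as np


def correct_reflection(images: np.ndarray) -> np.ndarray:
    """Subtract the mean image from every image in the stack.

    images: grayscale stack of shape (num_images, height, width).
    Assumption: the "average pixel value" is taken per pixel position
    across all part images, so shared glare cancels out.
    """
    images = images.astype(np.float64)
    mean_image = images.mean(axis=0)  # shape (height, width)
    return images - mean_image        # broadcast over the stack


# Two 2x2 images that share a bright glare in the top-left pixel.
stack = np.array([
    [[200, 10], [10, 10]],
    [[220, 10], [10, 30]],
])
corrected = correct_reflection(stack)
```

After the subtraction, the shared glare no longer dominates the pixel range, and what remains are the image-to-image deviations in which a defect would show up.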

The region-of-interest setting method defines the area occupied by the target part 110 in the part image information as the region of interest (RoI) and fills the area outside the RoI with a constant value. Because the method assumes that the target part 110 appears at a consistent position in the part image information, it uses a hard-coded RoI rather than detecting the RoI with a dedicated algorithm.
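A minimal sketch of a hard-coded RoI follows. The RoI coordinates, the fill value, and the name `apply_hard_coded_roi` are hypothetical placeholders chosen for illustration; the source specifies none of them.

```python
import numpy as np

# Hypothetical fixed RoI: rows 1..2 and columns 1..2 (slice ends are exclusive).
ROI_ROWS = slice(1, 3)
ROI_COLS = slice(1, 3)
FILL_VALUE = 0  # constant written outside the RoI (assumed value)


def apply_hard_coded_roi(image, rows=ROI_ROWS, cols=ROI_COLS, fill=FILL_VALUE):
    """Keep pixels inside the fixed RoI and set everything else to `fill`."""
    out = np.full_like(image, fill)
    out[rows, cols] = image[rows, cols]
    return out


image = np.arange(16).reshape(4, 4)
masked = apply_hard_coded_roi(image)
```

Fixed slices only work because the part is assumed to sit at the same position in every image; if that assumption fails, the RoI would have to be detected instead.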

The contrast enhancement method uses min-max normalization to strengthen contrast when the contrast between defective and non-defective portions of the part image information is so low that small defect regions are hard to identify with the naked eye.
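Min-max normalization linearly rescales pixel values so that the darkest pixel maps to the bottom of the output range and the brightest to the top. The sketch below assumes an 8-bit output range of 0 to 255, which the source does not state; the function name is hypothetical.

```python
import numpy as np


def min_max_normalize(image, new_min=0.0, new_max=255.0):
    """Stretch pixel values linearly onto [new_min, new_max]."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # A perfectly flat image has no contrast to stretch.
        return np.full_like(image, new_min)
    return (image - lo) / (hi - lo) * (new_max - new_min) + new_min


# Pixel values span only 100..110 before normalization.
low_contrast = np.array([[100, 102], [104, 110]])
stretched = min_max_normalize(low_contrast)
```

In this example a 10-level spread is stretched over the full range, so a faint mark that differed from its surroundings by only a few gray levels becomes clearly visible.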

The labeling function generating unit 230 determines whether a part is defective by applying an image processing technique or a crowdsourcing technique to the processed image information, and generates one or more labeling functions according to the determination result. Here, a labeling function (LF) is a function that takes processed image information as input and outputs whether the target part 110 has a surface defect, thereby checking for defects automatically.

More specifically, it is advantageous for securing labels to choose the image processing technique when an image processing expert is available and the crowdsourcing technique when no such expert is available, so the labeling function generating unit 230 selects the image processing technique or the crowdsourcing technique based on the selection information the user provides according to the user's situation. The labeling function generating unit 230 includes an image processing unit 232, which applies the image processing technique to the processed image information, and a crowdsourcing unit 234, which applies the crowdsourcing technique.

The labeling function generating unit 230 according to this embodiment may also select the image processing technique or the crowdsourcing technique depending on whether a labeling user interface (UI) is available. Furthermore, the labeling function generating unit 230 is not required to select only one of the two techniques; it may select both to generate labeling functions.

The image processing unit 232 filters the processed image information using the image processing technique to classify a group of part-defect candidates, and generates labeling functions using only the final defect image information, that is, the candidates that satisfy a predetermined criterion. The image processing unit 232 is described later with reference to FIG. 3.

The crowdsourcing unit 234 marks suspected part-defect regions on the processed image information based on input the user provides through the labeling UI, generates defect pattern information from the suspected part-defect regions, and defines it as a labeling function. When the user identifies a region of the processed image information in which a defect of the target part 110 is suspected, the crowdsourcing unit 234 designates that region as a suspected part-defect region. The crowdsourcing unit 234 then generates defect pattern information based on the line drawn on the suspected part-defect region and defines it as a labeling function. The labeling UI is described later with reference to FIG. 4.

The crowdsourcing unit 234 may generate defect pattern information from a plurality of suspected part-defect regions and define the regions of the processed image information that correspond to the defect pattern information as labeling functions. For example, the crowdsourcing unit 234 may generate a plurality of pieces of defect pattern information from suspected part-defect regions marked by multiple users, and use the generated defect pattern information to produce suspected part-defect regions on the processed image information that is input next.

The crowdsourcing unit 234 may measure how similar the defect pattern information is to a suspected part-defect region on the next processed image information by using a correlation coefficient, squared difference, cosine similarity, or the like. When the measured similarity is at or above a specific threshold, the crowdsourcing unit 234 may determine that the corresponding portion is defective.
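The three measures named above can be sketched as follows for a defect pattern and an image patch of the same shape. The threshold of 0.9 and all function names are hypothetical. Note that squared difference is a distance, so a smaller value means a closer match and a threshold test on it would run in the opposite direction.

```python
import numpy as np


def _flat(a):
    return np.asarray(a, dtype=np.float64).ravel()


def correlation_coefficient(pattern, patch):
    """Pearson correlation between pattern and patch (1.0 = same shape of variation)."""
    p, q = _flat(pattern), _flat(patch)
    p, q = p - p.mean(), q - q.mean()
    denom = np.linalg.norm(p) * np.linalg.norm(q)
    return float(p @ q / denom) if denom > 0 else 0.0


def cosine_similarity(pattern, patch):
    """Cosine of the angle between the raw pixel vectors."""
    p, q = _flat(pattern), _flat(patch)
    denom = np.linalg.norm(p) * np.linalg.norm(q)
    return float(p @ q / denom) if denom > 0 else 0.0


def squared_difference(pattern, patch):
    """Sum of squared pixel differences (0.0 = identical; lower is more similar)."""
    d = _flat(pattern) - _flat(patch)
    return float(d @ d)


def is_defect(pattern, patch, threshold=0.9):
    """Flag the patch as defective when correlation reaches the (assumed) threshold."""
    return correlation_coefficient(pattern, patch) >= threshold


pattern = np.array([[0, 255], [0, 255]])    # stored defect pattern: a vertical line
similar = np.array([[10, 240], [5, 250]])   # noisy occurrence of the same line
flat = np.full((2, 2), 128)                 # defect-free, uniform patch
```

Here `is_defect(pattern, similar)` is true and `is_defect(pattern, flat)` is false. In practice the pattern would be slid across the whole processed image and the best-matching position compared against the threshold.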

The crowdsourcing unit 234 selects, from the processed image information on which suspected part-defect regions have been marked, representative defect image information that contains representative defect pattern information, and automatically generates labeling functions from it. For example, an expert selects, from the defect pattern information already generated, the representative defect pattern information judged to be representative of the defect in question. The crowdsourcing unit 234 identifies the processed image information that contains the representative defect pattern information as representative defect image information and generates labeling functions based on it. Because the crowdsourcing unit 234 generates labeling functions from representative defect patterns, it can improve the accuracy of the labeling model generating apparatus 120. For example, the crowdsourcing unit 234 may rank the labeling functions defined by the defect pattern information by a specific criterion (precision, recall, etc.) and select the top k, thereby obtaining highly accurate labels in a short time.
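The top-k selection mentioned at the end of the paragraph might look like the sketch below. It assumes a small set of samples with known labels is available for scoring. For brevity each sample is a single hypothetical similarity score rather than an image, and the labeling functions are simple thresholds on that score.

```python
from typing import Callable, List, Sequence, Tuple

LabelingFunction = Callable[[float], int]  # 1 = defective, 0 = not defective


def precision_recall(lf: LabelingFunction, samples: Sequence[float],
                     labels: Sequence[int]) -> Tuple[float, float]:
    """Score one labeling function against known labels."""
    preds = [lf(x) for x in samples]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def select_top_k(lfs: Sequence[LabelingFunction], samples: Sequence[float],
                 labels: Sequence[int], k: int,
                 criterion: str = "precision") -> List[LabelingFunction]:
    """Rank labeling functions by the chosen criterion and keep the best k."""
    index = 0 if criterion == "precision" else 1
    ranked = sorted(lfs,
                    key=lambda lf: precision_recall(lf, samples, labels)[index],
                    reverse=True)
    return ranked[:k]


# Hypothetical labeling functions: thresholds on a pattern-similarity score.
def lf_strict(score: float) -> int:
    return 1 if score > 0.8 else 0


def lf_loose(score: float) -> int:
    return 1 if score > 0.2 else 0


def lf_always(score: float) -> int:
    return 1


scores = [0.1, 0.3, 0.9, 0.95]
truth = [0, 0, 1, 1]
chosen = select_top_k([lf_always, lf_loose, lf_strict], scores, truth, k=2)
```

Ranked by precision, `lf_strict` (1.0) comes first, `lf_loose` (2/3) second, and `lf_always` (0.5) is dropped.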

The model generating unit 240 combines the labeling functions to generate a labeling model for obtaining weak labels. Here, the labeling model includes a generative model or a voting model.

Depending on the user's selection, the model generating unit 240 generates the generative model by combining the labeling functions based on a development dataset and training a probabilistic model. Because the performance of the generative model is determined by the combination of labeling functions rather than by each function individually, it is desirable to secure a large development dataset and find combinations of labeling functions that complement one another.
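The source does not specify the probabilistic model, so the sketch below swaps in a deliberately simple stand-in rather than the generative model itself: each labeling function's accuracy is estimated on the development dataset, and votes are combined with log-odds weights (a naive-Bayes-style weighted vote). All names are hypothetical.

```python
import math
from typing import List, Sequence


def fit_weights(lf_votes: Sequence[Sequence[int]], dev_labels: Sequence[int],
                eps: float = 1e-3) -> List[float]:
    """Estimate one log-odds weight per labeling function.

    lf_votes[i][j] is the vote (0 or 1) of labeling function j on dev sample i.
    """
    num_lfs = len(lf_votes[0])
    weights = []
    for j in range(num_lfs):
        correct = sum(1 for votes, y in zip(lf_votes, dev_labels) if votes[j] == y)
        acc = correct / len(dev_labels)
        acc = min(max(acc, eps), 1 - eps)  # clip so the log-odds stay finite
        weights.append(math.log(acc / (1 - acc)))
    return weights


def weak_label(votes: Sequence[int], weights: Sequence[float]) -> int:
    """Combine votes: a 'defect' vote adds the weight, a 'no defect' vote subtracts it."""
    score = sum(w if v == 1 else -w for v, w in zip(votes, weights))
    return 1 if score > 0 else 0


dev_labels = [1, 1, 0, 0]
dev_votes = [
    [1, 0, 1],  # LF0 is always right, LF1 is right 3 times out of 4,
    [1, 1, 0],  # LF2 is right only half the time
    [0, 0, 1],
    [0, 0, 0],
]
weights = fit_weights(dev_votes, dev_labels)
```

With these weights, the votes `[0, 1, 1]` yield the weak label 0 even though two of three functions say "defect", because the reliable function outweighs the other two. This illustrates why the combination, not each function alone, determines the quality of the labels.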

The model generation unit 240 generates a voting model by obtaining, from the result value of each labeling function, a vote on whether the part is defective. If even one of the plurality of labeling functions judges the processed image information to be defective, the model generation unit 240 obtains a vote of "defective".
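The voting rule described above, in which a single "defective" result is enough, can be sketched as below. The function name, the 1/0 label encoding, and the feature dictionary used in the usage lines are hypothetical.

```python
from typing import Callable, Sequence

DEFECT, OK = 1, 0  # assumed label encoding


def voting_label(labeling_functions: Sequence[Callable[[object], int]],
                 image: object) -> int:
    """Return DEFECT if at least one labeling function flags the image."""
    return DEFECT if any(lf(image) == DEFECT for lf in labeling_functions) else OK


# Hypothetical labeling functions operating on precomputed image features.
lfs = [
    lambda f: DEFECT if f["line_len"] >= 30 else OK,
    lambda f: DEFECT if f["contrast"] > 0.8 else OK,
]
weak_label = voting_label(lfs, {"line_len": 35, "contrast": 0.1})
```

An any-vote rule of this kind favors recall over precision, since one over-sensitive labeling function is enough to mark an image as defective.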

The model generation unit 240 may determine which type of model to use according to the number of labeling functions. For example, it is preferable to use the generative model when the number of labeling functions is at or above a specific threshold, and the voting model when it is at or below that threshold. With 7 labeling functions, the voting model achieves a recall of 93.54%, a precision of 83.03%, and an accuracy of 88.88%; with 30 labeling functions, the generative model achieves a recall of 91.75%, a precision of 81.84%, and an accuracy of 87.57%.
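A minimal sketch of this count-based choice is given below. The function name and the threshold value in the usage lines are hypothetical; the text does not state a threshold, and because it assigns the boundary case to both models, the sketch arbitrarily resolves a tie in favor of the generative model.

```python
def choose_model_type(num_labeling_functions: int, threshold: int) -> str:
    """Pick the labeling model type from the number of labeling functions."""
    if num_labeling_functions >= threshold:
        return "generative"  # assumption: ties go to the generative model
    return "voting"


# Hypothetical threshold of 10; the counts 7 and 30 are those reported above.
model_for_7 = choose_model_type(7, threshold=10)
model_for_30 = choose_model_type(30, threshold=10)
```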

Once a labeling function has been generated, the labeling model generating apparatus 120 may use it in determining other defects, and may additionally implement information about other defects in the labeling function. For example, when a labeling function represents a scratch defect pattern, the labeling model generating apparatus 120 may additionally implement the labeling function so that it does not detect defects of any type other than scratches.

The labeling model generating apparatus 120 may generate a labeling model by running a labeling model generating program installed on it. The labeling model generating apparatus 120 runs the labeling model generating program in response to a user's operation or command, and generates and provides the labeling model to the user.

More specifically, when the labeling model generating apparatus 120 is a computer, the labeling model generating program may be a program installed on that computer. The labeling model generating program may be stored on a computer-readable recording medium and used through the user terminal and the labeling model generating apparatus 120 to provide a labeling model generating service. Here, the computer-readable recording medium includes every type of recording device that stores data readable by a computer system. That is, the computer-readable recording medium includes storage media such as magnetic storage media (e.g., ROM, floppy disk, hard disk) and optically readable media (e.g., CD-ROM, DVD). In addition, the computer-readable recording medium may be distributed over network-connected computer servers and systems, so that the program is stored and executed as code readable by the labeling model generating apparatus 120.

As for how the labeling model generating program is installed on the labeling model generating apparatus 120 according to the present embodiment, the apparatus may carry the program in embedded form, may carry it embedded in an operating system (OS) installed on the apparatus, or may carry it in a form that is installed on the OS of the apparatus by a user's operation or command.

FIG. 3 is a block diagram illustrating the operation of the labeling model generating apparatus according to the present embodiment.

Referring to FIG. 3, the labeling model generating apparatus 120 according to the present embodiment acquires part image information from a part photographing device using the image collection unit 210. The labeling model generating apparatus 120 generates processed image information using the data preprocessing unit 220. The labeling model generating apparatus 120 then applies either the image processing technique or the crowd sourcing technique to the processed image information, according to the user's selection.

The image processing unit 232, which uses the image processing technique, includes a filtering unit 232_1, a candidate group classification unit 232_2, and a defect determination unit 232_3. The components of the image processing unit 232 are not limited to these; some components may be added, removed, or changed.

The filtering unit 232_1 uses an edge detector to extract, from the processed image information, feature information for checking whether the part is defective. Here, the feature information refers to pixel values that have changed in response to a defect, such as a scratch, on the surface of the target part 110 in the processed image information, together with line information detected in the processed image information on that basis. More specifically, the filtering unit 232_1 may use image processing algorithms such as the wavelet transform, Canny edge detection, adaptive thresholding, and morphological transformation as the edge detector.
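To illustrate the idea of extracting changed pixel values as feature information, the sketch below marks pixels whose local intensity gradient exceeds a threshold. It is a plain central-difference gradient test written in pure Python, standing in for the Canny, wavelet, and other detectors named above; it is not an implementation of any of them. The function name, the grayscale list-of-lists image format, and the threshold are assumptions.

```python
from typing import List


def edge_mask(image: List[List[int]], threshold: int) -> List[List[bool]]:
    """Mark interior pixels whose gradient magnitude reaches the threshold.

    image is a grayscale image as rows of integer intensities.
    Border pixels are left unmarked because they lack both neighbors.
    """
    height, width = len(image), len(image[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # horizontal change
            gy = image[y + 1][x] - image[y - 1][x]  # vertical change
            if abs(gx) + abs(gy) >= threshold:
                mask[y][x] = True
    return mask


# A 5x5 image with one bright vertical streak in column 2.
streak = [[200 if x == 2 else 10 for x in range(5)] for _ in range(5)]
mask = edge_mask(streak, threshold=100)
```

With a central difference, the streak's two flanks are marked rather than its center, which is the usual behavior of a gradient-based detector on a thin line.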

The candidate group classification unit 232_2 classifies a part defect candidate group based on the feature information. The candidate group classification unit 232_2 may classify the part defect candidate group by applying line segment detection, the Hough line transform, or the like to the extracted feature information, but is not limited to these.

The defect determination unit 232_3 makes the final determination of whether the target part 110 is defective by applying a threshold within each image of the part defect candidate group. More specifically, the defect determination unit 232_3 defines pixel information corresponding to broken regions of a line included in the feature information as a parameter, and determines whether the target part 110 is defective by comparing the length of the line with a threshold. Here, the parameter includes information on how many pixels of interruption are tolerated when defining the continuity of a line corresponding to a scratch defect. In other words, the defect determination unit 232_3 measures the length of the line based on the parameter, and compares that length with a specific threshold to make the final determination of whether the target part 110 is defective.
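The role of the gap parameter can be illustrated with the following sketch, which measures the longest line along a sequence of sampled pixels while tolerating interruptions of up to max_gap pixels, and then compares that length with a threshold. The function names, the boolean sampling of pixels along a candidate line, and the use of a greater-than-or-equal comparison are assumptions of the sketch.

```python
from typing import Optional, Sequence


def longest_line_length(on_line: Sequence[bool], max_gap: int) -> int:
    """Length of the longest line, bridging gaps of at most max_gap pixels.

    on_line[i] is True where pixel i along the candidate line is an edge pixel.
    The returned length counts the bridged gap pixels as part of the line.
    """
    best = 0
    start: Optional[int] = None
    last_on: Optional[int] = None
    for i, is_on in enumerate(on_line):
        if not is_on:
            continue
        # Start a new line if this is the first pixel or the gap is too wide.
        if start is None or i - last_on - 1 > max_gap:
            start = i
        last_on = i
        best = max(best, last_on - start + 1)
    return best


def is_defect(on_line: Sequence[bool], max_gap: int, length_threshold: int) -> bool:
    """Judge a scratch defect when the measured line is long enough."""
    return longest_line_length(on_line, max_gap) >= length_threshold


pixels = [bool(v) for v in [1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1]]
```

For the sample above, allowing no gaps gives a longest line of 3 pixels, while allowing gaps of up to 2 pixels joins the first two runs into a line of 7 pixels, so the same image can flip from non-defective to defective as the parameter is relaxed.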

The labeling model generating apparatus 120 generates at least one labeling function using the crowd sourcing technique. The labeling model generating apparatus 120 combines the generated labeling functions to generate a generative model or a voting model. The labeling model generating apparatus 120 obtains weak labels using the generative model or the voting model, and the user ultimately builds a classification model from the weak labels.

FIG. 4 is a conceptual diagram schematically illustrating the labeling UI of the labeling model generating apparatus according to the present embodiment.

The labeling model generating apparatus 120 allows general users who are not domain experts (hereinafter, non-experts) to mark suspected part defect regions in the processed image information as well. Because the labeling UI according to the present embodiment lets even non-experts easily mark suspected part defect regions, a sufficiently large number of users can take part, and a large number of labeling functions can be obtained.

Referring to FIG. 4, the labeling UI includes an image display area and buttons. The image display area is where suspected part defect regions are marked on the processed image information, and they can be marked in various ways, such as by touch input or coordinate input. Once a suspected part defect region is marked, defect pattern information is generated from it, and a labeling function corresponding to the defect pattern information is then generated.

FIG. 5 is a flowchart illustrating a method for generating a labeling model according to the present embodiment.

Referring to FIG. 5, the labeling model generating apparatus 120 according to the present embodiment collects part image information from an image capturing device (S602). In step S602, the labeling model generating apparatus 120 may instead receive the part image information directly from the user.

The labeling model generating apparatus 120 generates processed image information by performing a preprocessing technique on the part image information (S604). In step S604, the labeling model generating apparatus 120 uses light reflection correction, region-of-interest setting, or contrast enhancement as the preprocessing technique, but is not limited to these.
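Two of the preprocessing techniques named above, region-of-interest setting and contrast enhancement, could look like the following minimal sketch. The function names, the list-of-lists grayscale format, and the choice of linear min-max stretching as the contrast method are assumptions; the embodiment does not specify how contrast is enhanced, and light reflection correction is not shown.

```python
from typing import List

Image = List[List[int]]


def crop_roi(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Keep only the rectangular region of interest."""
    return [row[left:left + width] for row in image[top:top + height]]


def stretch_contrast(image: Image, out_max: int = 255) -> Image:
    """Linearly rescale intensities so they span 0..out_max (floored)."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A flat image has no contrast to stretch.
        return [[0 for _ in row] for row in image]
    return [[(p - lo) * out_max // (hi - lo) for p in row] for row in image]


part_image = [
    [0, 0, 0, 0],
    [0, 100, 120, 0],
    [0, 110, 150, 0],
    [0, 0, 0, 0],
]
processed = stretch_contrast(crop_roi(part_image, top=1, left=1, height=2, width=2))
```

Cropping before stretching matters here: the dark background outside the region of interest would otherwise set the minimum and leave the part's own intensities compressed.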

The labeling model generating apparatus 120 checks whether a labeling UI exists (S606). If there is no labeling UI, the labeling model generating apparatus 120 generates a labeling function by applying the image processing technique to the processed image information (S608). The method of applying the image processing technique to the processed image information in step S608 is described later with reference to FIG. 6.

If a labeling UI exists, the labeling model generating apparatus 120 generates a labeling function by applying the crowd sourcing technique to the processed image information (S610). In step S610, even when a labeling UI exists, the labeling model generating apparatus 120 may generate the labeling function using the image processing technique if the user selects it.
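The branch across steps S606 to S610 can be summarized in a short sketch. The function name, the string constants, and the representation of the user's choice as an optional argument are hypothetical.

```python
from typing import Optional

IMAGE_PROCESSING = "image_processing"
CROWD_SOURCING = "crowd_sourcing"


def choose_technique(labeling_ui_exists: bool,
                     user_choice: Optional[str] = None) -> str:
    """Select how labeling functions are generated (steps S606 to S610)."""
    if not labeling_ui_exists:
        return IMAGE_PROCESSING  # S608: no labeling UI is available
    if user_choice == IMAGE_PROCESSING:
        return IMAGE_PROCESSING  # the user overrides the default with the UI present
    return CROWD_SOURCING        # S610: default when a labeling UI exists
```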

The labeling model generating apparatus 120 generates a labeling model by combining the generated labeling functions (S612). In step S612, the labeling model includes a generative model or a voting model.

The labeling model generating apparatus 120 generates weak labels based on the labeling model (S614). The weak labels are ultimately used to build a classification model (S616).

FIG. 6 is a flowchart illustrating a method of applying the image processing technique according to the present embodiment.

Referring to FIG. 6, the labeling model generating apparatus 120 uses an edge detector to extract feature information from the processed image information (S702). In step S702, the feature information refers to information that includes pixel values changed in response to a defect, such as a scratch, in the processed image information, and lines detected in the processed image information on that basis.

The labeling model generating apparatus 120 classifies a part defect candidate group using the feature information (S704). The labeling model generating apparatus 120 checks whether a line included in the feature information contains a broken region (S706). The labeling model generating apparatus 120 defines pixel information corresponding to the broken region of the line as a parameter (S708). In step S708, the parameter is the information used to define the continuity of a line corresponding to a scratch defect.

The labeling model generating apparatus 120 measures the length of the line based on the parameter and compares the length of the line with a specific threshold (S710).

Based on the result of the comparison, the labeling model generating apparatus 120 determines whether a part defect is present in each image of the part defect candidate group (S712). In step S712, the labeling model generating apparatus 120 generates a labeling function based on the defect image information in which a defect was determined to be present.

FIG. 7 is a flowchart illustrating a method of applying the crowd sourcing technique according to the present embodiment.

Referring to FIG. 7, the labeling model generating apparatus 120 according to the present embodiment marks suspected part defect regions on the processed image information based on information the user enters through the labeling UI (S802). In step S802, the user's input includes drag input with a mouse, coordinate input with a keyboard, and the like.

The labeling model generating apparatus 120 generates defect pattern information based on a plurality of suspected part defect regions (S804). In step S804, a suspected part defect region is a specific region that a user has marked as suspected of containing a defect, and the defect pattern information is a defect pattern, such as a scratch, present within a suspected part defect region.

The labeling model generating apparatus 120 checks whether the user has selected representative defect pattern information (S806). If the user has selected representative defect pattern information, the labeling model generating apparatus 120 automatically generates a labeling function using the representative defect image information that contains the representative defect pattern information (S808).

If the user has not selected representative defect pattern information, the labeling model generating apparatus 120 defines the region corresponding to the defect pattern information as a labeling function (S810).
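The embodiment does not state how a marked region is turned into a labeling function. One plausible reading, sketched below purely as an assumption, is to crop the marked region as a pattern and define a labeling function that reports a defect when some window of a new image is close enough to that pattern. The function names, the (top, left, height, width) box format, the mean absolute difference measure, and the max_distance threshold are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, height, width), assumed format


def crop(image: Image, box: Box) -> Image:
    """Cut the user-marked suspected defect region out of the image."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def best_match_distance(image: Image, pattern: Image) -> Optional[float]:
    """Smallest mean absolute difference between the pattern and any window."""
    ph, pw = len(pattern), len(pattern[0])
    best: Optional[float] = None
    for y in range(len(image) - ph + 1):
        for x in range(len(image[0]) - pw + 1):
            total = sum(abs(image[y + dy][x + dx] - pattern[dy][dx])
                        for dy in range(ph) for dx in range(pw))
            distance = total / (ph * pw)
            if best is None or distance < best:
                best = distance
    return best  # None when the image is smaller than the pattern


def make_labeling_function(pattern: Image, max_distance: float) -> Callable[[Image], int]:
    """Build a labeling function that returns 1 (defect) on a close match."""
    def labeling_function(image: Image) -> int:
        distance = best_match_distance(image, pattern)
        return 1 if distance is not None and distance <= max_distance else 0
    return labeling_function
```

A usage example under the same assumptions: crop the box a user drew around a scratch, pass the result to make_labeling_function with a chosen max_distance, and apply the returned function to each new processed image.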

Although FIGS. 5 to 7 describe steps S602 to S616, S702 to S712, and S802 to S810 as being executed sequentially, the embodiment is not limited to this. The steps shown in FIGS. 5 to 7 may be reordered, or one or more steps may be executed in parallel, so FIGS. 5 to 7 are not limited to a chronological order.

The above description merely illustrates the technical idea of the present embodiment, and a person of ordinary skill in the art to which the embodiment belongs may make various modifications and variations without departing from its essential characteristics. Accordingly, the embodiments are intended to explain, not to limit, the technical idea of the present embodiment, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present embodiment should be construed according to the following claims, and all technical ideas within a scope equivalent to them should be construed as falling within the scope of rights of the present embodiment.

110: target part 120: labeling model generating apparatus
130: user terminal 210: image collection unit
220: data preprocessing unit 230: labeling function generation unit
232: image processing unit 232_1: filtering unit
232_2: candidate group classification unit 232_3: defect determination unit
234: crowd sourcing unit 240: model generation unit

Claims

An apparatus for generating a labeling model, comprising:
an image collection unit that collects part image information from an image capturing device;
a data preprocessing unit that generates processed image information by performing a preprocessing technique on the part image information;
a labeling function generation unit that determines whether a part is defective by applying an image processing technique or a crowd sourcing technique to the processed image information, and generates one or more labeling functions according to the determination result; and
a model generation unit that generates a labeling model for obtaining a weak label by combining the labeling functions,
wherein the labeling function generation unit comprises an image processing unit that extracts, from the processed image information, feature information for checking whether the part is defective, classifies a part defect candidate group from the processed image information using the feature information, defines pixel information corresponding to a broken region of a line included in the feature information as a parameter, determines whether the part is defective by comparing, for each image of the part defect candidate group, the length of the line measured based on the parameter with a preset threshold, and generates the labeling function based on the result of the determination, and
wherein the model generation unit combines the labeling functions based on a development dataset and varies the type of the labeling model according to the number of the labeling functions.

The apparatus of claim 1,
wherein the model generation unit detects, based on the development dataset, a combination of the labeling functions that complement one another.

The apparatus of claim 1,
wherein the model generation unit generates a generative model when the number of the labeling functions is greater than or equal to a preset threshold, and generates a voting model when the number is less than or equal to the preset threshold.

The apparatus of claim 1,
wherein the model generation unit generates a generative model by training a probabilistic model using the combined labeling functions, or generates a voting model based on vote values, corresponding to the respective result values of the combined labeling functions, on whether the part is defective.

The apparatus of claim 1,
wherein the data preprocessing unit uses a light reflection correction method, a region-of-interest setting method, or a contrast enhancement method as the preprocessing technique.

The apparatus of claim 1,
wherein the labeling function generation unit further comprises a crowd sourcing unit that marks a suspected part defect region on the processed image information based on user input information entered through a labeling user interface (UI), generates defect pattern information based on the suspected part defect region, and defines the defect pattern information as the labeling function.

The apparatus of claim 6,
wherein the image processing unit comprises:
a filtering unit that extracts the feature information from the processed image information based on an edge detector;
a candidate group classification unit that classifies the part defect candidate group using the feature information; and
a defect determination unit that generates defect image information using the result of the determination as to whether the part is defective, and generates the labeling function using the defect image information.

The apparatus of claim 6,
wherein the crowd sourcing unit generates at least one piece of the defect pattern information based on a plurality of the suspected part defect regions, and defines a region corresponding to the defect pattern information in the processed image information as the labeling function.

The apparatus of claim 6,
wherein the crowd sourcing unit selects, from among the processed image information on which the suspected part defect region is marked, representative defect image information that includes representative defect pattern information, and automatically generates the labeling function using the representative defect image information.

(Deleted)

A method for generating a labeling model, performed by a labeling model generating apparatus, the method comprising:
an image collection process of collecting part image information from an image capturing device;
a data preprocessing process of generating processed image information by performing a preprocessing technique on the part image information;
a labeling function generation process of determining whether a part is defective by applying an image processing technique or a crowd sourcing technique to the processed image information, and generating one or more labeling functions according to the determination result; and
a model generation process of generating a labeling model for obtaining a weak label by combining the labeling functions,
wherein the labeling function generation process comprises a process of extracting, from the processed image information, feature information for checking whether the part is defective, classifying a part defect candidate group from the processed image information using the feature information, defining pixel information corresponding to a broken region of a line included in the feature information as a parameter, determining whether the part is defective by comparing, for each image of the part defect candidate group, the length of the line measured based on the parameter with a preset threshold, and generating the labeling function based on the result of the determination, and
wherein the model generation process combines the labeling functions based on a development dataset and varies the type of the labeling model according to the number of the labeling functions.

The method of claim 11,
wherein the labeling function generation process further comprises a crowd sourcing process of marking a suspected part defect region on the processed image information based on user input information entered through a labeling user interface (UI), generating defect pattern information based on the suspected part defect region, and defining the defect pattern information as the labeling function.

A computer program stored in a computer-readable recording medium to execute each process included in the method for generating a labeling model according to any one of claims 11 to 12.