KR20190143527A

KR20190143527A - System and Method for Recognitioning Image Pattern using Machine Learning

Info

Publication number: KR20190143527A
Application number: KR1020180066194A
Authority: KR
Inventors: 강동중
Original assignee: 부산대학교 산학협력단
Priority date: 2018-06-08
Filing date: 2018-06-08
Publication date: 2019-12-31
Also published as: KR102102405B1

Abstract

The present invention relates to an apparatus for recognizing images through machine learning, and a method thereof, which apply a reinforcement learning-based search technique to option search in order to determine the options that yield the model parameters with the highest accuracy. The apparatus includes: a data preprocessing unit that analyzes the distribution of object positions in the image to be examined, lets a user select settings such as the net structure, and determines the net structure by analyzing the average size of the objects to be detected; a source data processing unit that processes training, validation, and test data; a binary file generation unit that outputs a file encoding the data set and its information; an option determination unit that applies a reinforcement learning-based search technique to option search to determine the options that yield the model parameters with the highest accuracy; and a training execution unit that trains using the options determined by the option determination unit, evaluates model performance, monitors the training and evaluation processes, and stores the fully trained model.

Description

System and Method for Recognitioning Image Pattern using Machine Learning

The present invention relates to image recognition, and more specifically to an apparatus and method for image recognition through machine learning that apply a reinforcement learning-based search technique to option search so as to determine the options that yield the model parameters with the highest accuracy.

Artificial intelligence (AI) imitates the human brain and its networks of neurons, with the aim that computers and robots will someday think and act like humans.

For example, a person can easily tell a dog from a cat in a photograph alone, but a computer cannot.

Machine learning (ML) techniques were devised for this purpose. A large amount of data is fed into a computer, which is made to group similar items together; when a picture similar to a stored picture of a dog is input, the computer classifies it as a picture of a dog.

Depending on how the data is to be classified, many machine learning algorithms have emerged, including decision trees, Bayesian networks, support vector machines (SVMs), and artificial neural networks.

Among these, deep learning (DL), which derives from artificial neural network algorithms, is a technique that uses artificial neural networks to cluster or classify data.

Artificial neural networks in machine learning and cognitive science are statistical learning algorithms inspired by biological neural networks (the central nervous system of animals).

An artificial neural network refers to any model in which artificial neurons (nodes), connected into a network through synapses, acquire problem-solving ability by changing the connection strength of those synapses through learning.

Machine learning (ML) techniques of this kind are widely used in fields such as image recognition, speech recognition, data mining, and intelligent robots.

In particular, in the field of image recognition, accuracy exceeding that of human recognition has been achieved.

Machine learning (ML) techniques thus deliver high-quality results for a task, but they also demand a correspondingly large amount of computation.

For example, given an object localization and classification problem, solving it through effective training requires deciding on the various options used in training.

The user must choose from a very large number of options (several dozen or more), and each individual choice affects training and performance.

These option choices arise from, and are influenced by, various factors such as the type of problem or application field, the data type, the type of deep net to be used, the learning rate, and whether data augmentation is applied.

The person conducting the training selects the optimal options by analyzing the given problem on the basis of long experience with deep learning, and also by studying the basic examples and cases provided by the framework developer.

This process requires the user to invest a long time and demands skilled data analysis ability.

However, even with this process, selecting the optimal options for effectively solving the problem at hand is practically impossible.

For example, suppose there are 10 options, each of which can only be applied or not applied (two states), and the effect of applying each option is to be tested. The number of cases is then 9! * 2¹⁰, the product of the number of cases for the priority order of the options and the number of applied/not-applied combinations, and it is impossible to train on, analyze, and test every one of these cases.
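The size of this search space can be checked with a short calculation. The sketch below simply evaluates the 9! * 2¹⁰ figure stated above; the one-hour cost per training run is a hypothetical value chosen only to illustrate the scale and does not come from the source.

```python
import math

# Figures taken from the text: 10 options, each either applied or not applied.
num_options = 10
on_off_combinations = 2 ** num_options       # 2^10 applied/not-applied patterns
priority_orderings = math.factorial(9)       # the 9! ordering factor stated in the text
total_cases = priority_orderings * on_off_combinations

# Hypothetical assumption for illustration only: one hour per training run.
hours_per_run = 1
years_needed = total_cases * hours_per_run / (24 * 365)

print(f"cases to test: {total_cases:,}")
print(f"years at {hours_per_run} h per run: {years_needed:,.0f}")
```

Even under this optimistic assumption, exhaustive testing is out of reach, which is the motivation for a guided search.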

Accordingly, there is a need for a new technique for determining the options that yield the model parameters with the highest accuracy.

Korean Patent Registration No. 10-1850286
Korean Patent Publication No. 10-2016-0122452
Korean Patent Publication No. 10-2010-0129783

The present invention is intended to solve the problems of conventional image recognition technology that uses machine learning. An object of the present invention is to provide an apparatus and method for image recognition through machine learning that apply a reinforcement learning-based search technique to option search so as to determine the options that yield the model parameters with the highest accuracy.

Another object of the present invention is to provide an apparatus and method for image recognition through machine learning that use Google's TensorFlow framework so that the location and class of the objects a user wants to detect can be determined with deep learning.

Another object of the present invention is to provide an apparatus and method for image recognition through machine learning that include a preprocessing configuration which analyzes the distribution of objects, extracts the portion to be actually processed using the ROI SEL module, and selects a suitable structure through analysis of the required real-time processing speed, user settings, analysis of the average object size, and analysis of the proportion of the image occupied by objects.

Another object of the present invention is to provide an apparatus and method for image recognition through machine learning that include an option determination configuration which obtains the model parameters yielding the highest accuracy by repeating a process of analyzing the current state and the reward value indicating whether training is appropriate, generating options and their order, training the model and verifying its accuracy, storing the resulting model options, and sending the reward value to the controller.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

To achieve these objects, an apparatus for image recognition through machine learning according to the present invention comprises: a data preprocessing unit that analyzes the distribution of object positions in the image to be examined, lets the user select settings such as the net structure, and determines the net structure by analyzing the average size of the objects to be detected; a source data processing unit that processes training, validation, and test data; a binary file generation unit that outputs a file encoding the data set and its information; an option determination unit that applies a reinforcement learning-based search technique to option search to determine the options that yield the model parameters with the highest accuracy; and a training execution unit that trains using the model options determined by the option determination unit, evaluates model performance, monitors the training and evaluation processes, and stores the trained model.

Here, the data preprocessing unit comprises: an object distribution analysis unit (OBJ LOC DISTRIB) that analyzes where objects are located in the image and how those locations are distributed, and a region-of-interest selection unit (ROI SEL) that extracts the portion to be actually processed according to the result of that analysis; a required processing speed analysis unit (TAC-TIME EVAL) that analyzes the real-time processing speed required by the inspection device, and a user setting unit (USER SETTING) that makes the user settings required to configure the deep net; a size analysis unit (OBJ SIZE EVAL) that analyzes the average object size, and a ratio analysis unit (OBJ-IMG RATIO EVAL) that analyzes the proportion of the image occupied by objects; and a net structure determination unit (DEEP-NET STRUC DECESION) that determines a suitable net structure based on the analyses of object distribution, required processing speed, average object size, and object ratio.

The source data processing unit comprises: an image data processing unit (Image data) that processes the training, validation, and test data sets; an XML file processing unit (XML Files) that processes the label information of the data set; a model management unit (Model Config) that manages the deep net information; a training parameter information processing unit (Train Config) that processes training parameter information; a validation parameter information processing unit (EvalConfig) that processes validation parameter information; and a data set path management unit (Input Config) that manages the data set paths.

The option determination unit comprises: a controller (Controller) that analyzes the current state and the reward value indicating whether training is appropriate; an option sampler (Data Aug Opt sampler) that generates data augmentation options and their order; an update unit (Train Config Update) that updates the training configuration (Train Config); a model construction unit (Model Construction) that loads the model required by the training execution unit (Tensorflow Object Detection API) from the source data and configures it; a verification unit (Train & Eval) that trains the model and verifies its accuracy; an option storage unit (Save Aug Opt) that stores the data augmentation options in a buffer after the accuracy has been analyzed; an optimal option storage unit (Best Option) that stores the model options that produced the highest accuracy; and a reward value providing unit (RewardCAL) that supplies the controller with the reward value derived from the accuracy.

The option determination unit is further characterized in that, when one sample is selected from the option space, it evaluates the performance of the learner under the combination of options that the sample represents, and uses the evaluated performance as the reward for reinforcement learning to search for the optimal distribution on a policy gradient basis.

The option determination unit is further characterized in that it uses an LSTM (Long Short-Term Memory) network as the search algorithm, and the output of the LSTM network depends on the time step: odd-numbered steps select an option, and even-numbered steps select whether that option is applied.
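The alternating output scheme can be illustrated by decoding a flat sequence of controller outputs into (option, applied) pairs. This is a minimal sketch under assumptions: the option names, the integer token encoding, and the function `decode_controller_output` are hypothetical and are not specified in the source.

```python
from typing import List, Tuple

# Hypothetical augmentation option names; the source does not enumerate them.
OPTIONS = ["horizontal_flip", "random_crop", "brightness", "contrast"]


def decode_controller_output(tokens: List[int]) -> List[Tuple[str, bool]]:
    """Pair up per-time-step outputs.

    Odd-numbered steps (1st, 3rd, ...) carry an option index;
    even-numbered steps (2nd, 4th, ...) carry 1 (apply) or 0 (do not apply).
    """
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of time steps")
    plan = []
    for i in range(0, len(tokens), 2):
        option_index, applied = tokens[i], tokens[i + 1]
        plan.append((OPTIONS[option_index], bool(applied)))
    return plan


# Six time steps describe three options in priority order.
sequence = [2, 1, 0, 0, 3, 1]
plan = decode_controller_output(sequence)
print(plan)
```

Because the order of the pairs is preserved, a single sequence encodes both the priority of the options and whether each one is applied.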

The training execution unit (Final Tensorflow Object Detection API) comprises: a training unit (Train Stage) that trains using the model options determined by the option determination unit; a model evaluation unit (Evaluation State) that evaluates model performance during training; a monitoring unit (Tensorboard) that monitors the training and evaluation processes; and a model storage unit (ModelExporter) that stores the trained model.

To achieve another object, a method for image recognition through machine learning according to the present invention comprises: a data preprocessing step of analyzing the distribution of object positions in the image to be examined, letting the user select settings such as the net structure, and determining the net structure by analyzing the average size of the objects to be detected; a source data processing step of processing training, validation, and test data; a binary file generation step of outputting a file encoding the data set and its information; an option determination step of applying a reinforcement learning-based search technique to option search to determine the options that yield the model parameters with the highest accuracy; and a training execution step of training using the model options determined in the option determination step, evaluating model performance, monitoring the training and evaluation processes, and storing the trained model.

Here, the data preprocessing step comprises: an object distribution analysis step of analyzing where objects are located in the image and how those locations are distributed; a region-of-interest selection step of extracting the portion to be actually processed according to the result of the object distribution analysis; a required processing speed analysis step of analyzing the real-time processing speed required by the inspection device; a user setting step of making the user settings required to configure the deep net; a size analysis step of analyzing the average object size; and a ratio analysis step of analyzing the proportion of the image occupied by objects.

The option determination step comprises: a step of analyzing the current state and the reward value indicating whether training is appropriate; a step of generating data augmentation options and their order; an update step of updating the training configuration (Train Config); a model construction step of loading the model required by the training execution unit (Tensorflow Object Detection API) from the source data and configuring it; a verification step of training the model and verifying its accuracy; an option storage step of storing the data augmentation options in a buffer after the accuracy has been analyzed; an optimal option storage step of storing the model options that produced the highest accuracy; and a reward value providing step of supplying the controller (Controller) with the reward value derived from the accuracy.

The apparatus and method for image recognition through machine learning according to the present invention have the following effects.

First, a reinforcement learning-based search technique is applied to option search, making it possible to determine the options that yield the model parameters with the highest accuracy.

Second, by using Google's TensorFlow framework, a user without specialist knowledge of deep learning can determine the model and options optimized for the location and class of the objects to be detected.

Third, the options that yield the model parameters with the highest accuracy can be determined by repeating a process of analyzing the current state and the reward value indicating whether training is appropriate, generating options and their order, training the model and verifying its accuracy, storing the resulting model options, and sending the reward value to the controller.

FIG. 1 is a block diagram of an apparatus for image recognition through machine learning according to the present invention.
FIGS. 2 to 5 are detailed block diagrams of the apparatus for image recognition through machine learning according to the present invention.
FIG. 6 is a flowchart illustrating a method for image recognition through machine learning according to the present invention.

Hereinafter, preferred embodiments of the apparatus and method for image recognition through machine learning according to the present invention are described in detail.

The features and advantages of the apparatus and method for image recognition through machine learning according to the present invention will become apparent from the detailed description of each embodiment below.

FIG. 1 is a block diagram of an apparatus for image recognition through machine learning according to the present invention, and FIGS. 2 to 5 are detailed block diagrams of that apparatus.

The apparatus and method for image recognition through machine learning according to the present invention apply a reinforcement learning-based search technique to option search so that the options yielding the model parameters with the highest accuracy can be determined.

To this end, the present invention may include a preprocessing configuration that analyzes the distribution of objects, extracts the portion to be actually processed using the ROI SEL module, and selects a suitable structure through analysis of the required real-time processing speed, user settings, analysis of the average object size, and analysis of the proportion of the image occupied by objects.

The present invention may also include an option determination configuration that obtains the model parameters yielding the highest accuracy by repeating a process of analyzing the current state and the reward value indicating whether training is appropriate, generating options and their order, training the model and verifying its accuracy, storing the resulting model options, and sending the reward value to the controller.

As shown in FIG. 1, the apparatus for image recognition through machine learning according to the present invention includes: a data preprocessing unit 100 that analyzes the distribution of object positions in the image to be examined, lets the user select settings such as the net structure, and determines the net structure by analyzing the average size of the objects to be detected; a source data processing unit 200 that processes training, validation, and test data; a binary file generation unit 300 that outputs a file encoding the data set and its information; an option determination unit 400 that applies a reinforcement learning-based search technique to option search to determine the options that yield the model parameters with the highest accuracy; and a training execution unit 500 that trains using the model options determined by the option determination unit 400, evaluates model performance, monitors the training and evaluation processes, and stores the trained model.

The configuration of the data preprocessing unit 100 is shown in FIG. 2.

The data preprocessing unit 100 includes: an object distribution analysis unit (OBJ LOC DISTRIB) 10 that analyzes where objects are located in the image and how those locations are distributed, by means of binary vision and a blob detector package detector; a region-of-interest selection unit (ROI SEL) 13 that extracts the portion to be actually processed according to the analysis result of the object distribution analysis unit (OBJ LOC DISTRIB) 10; a required processing speed analysis unit (TAC-TIME EVAL) 11 that analyzes the real-time processing speed required by the inspection device; a user setting unit (USER SETTING) 14 that makes the user settings required to configure the deep net; a size analysis unit (OBJ SIZE EVAL) 12 that analyzes the average object size; a ratio analysis unit (OBJ-IMG RATIO EVAL) 15 that analyzes the proportion of the image occupied by objects; and a net structure determination unit (DEEP-NET STRUC DECESION) 16 that determines a suitable net structure based on the analyses of object distribution, required processing speed, average object size, and object ratio.

The data preprocessing unit 100 assists the option determination performed by the option determination unit 400 by handling the pre-determined options, that is, options that the user can select in advance or that can be determined by analyzing the training data.

The object distribution analysis unit (OBJ LOC DISTRIB) 10 analyzes the distribution of the positions at which objects lie within the image to be examined.

If the objects to be detected are small relative to the full test image, searching the entire image for them degrades detection performance. In that case, it is necessary first to obtain the location of the bag, box, package, or similar container in which an object is likely to be present.

In addition, depending on the type of sensor, there may be regions where objects never appear, such as part of the top and bottom margins or parts at the left, right, or center of the image; examining all of these regions is inefficient in terms of both computation and detection performance.

To solve this problem, the object distribution analysis unit (OBJ LOC DISTRIB) 10 analyzes where objects are located in the image and how those locations are distributed, so that the region-of-interest selection unit (ROI SEL) 13 can select the region of interest (ROI) to be examined.
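One simple way to derive an ROI from known object locations is to take the bounding rectangle of all labelled boxes and pad it with a margin. The sketch below is an illustration under assumptions, not the method of the source: the box format `(xmin, ymin, xmax, ymax)`, the function name `select_roi`, and the margin value are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax) in pixels


def select_roi(boxes: List[Box], image_width: int, image_height: int,
               margin: int = 10) -> Box:
    """Return the padded bounding rectangle of all boxes, clamped to the image."""
    if not boxes:
        # Without any location information, fall back to the whole image.
        return (0, 0, image_width, image_height)
    xmin = min(b[0] for b in boxes) - margin
    ymin = min(b[1] for b in boxes) - margin
    xmax = max(b[2] for b in boxes) + margin
    ymax = max(b[3] for b in boxes) + margin
    return (max(0, xmin), max(0, ymin),
            min(image_width, xmax), min(image_height, ymax))


# Object boxes collected from training images (hypothetical values).
observed = [(100, 200, 150, 260), (300, 220, 380, 300)]
roi = select_roi(observed, image_width=640, image_height=480)
print(roi)
```

Cropping to such a region before detection avoids spending computation on margins where objects never appear.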

The required processing speed analysis unit (TAC-TIME EVAL) 11 analyzes the real-time processing speed required by the inspection device.

The real-time requirements of the inspection device vary with the application: some systems require fast detection, whereas in others accuracy matters more even if detection is slower.

Processing speed and performance are in a trade-off relationship, and the structure of the neural network used for deep learning training can be selected accordingly.

That is, the required processing speed analysis unit (TAC-TIME EVAL) 11 allows the user to select, through the user setting unit (USER SETTING) 14, settings such as a net structure that is slower but performs better, or one that is faster at the cost of some performance.

The size analysis unit (OBJ SIZE EVAL) 12 analyzes the average size of the objects to be detected, so that the average proportion of the image occupied by a target object can be determined by analyzing the aspect ratios of the objects and the distribution of their areas.
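The size statistics described above can be sketched as a small helper that averages the area ratio and aspect ratio over labelled boxes. The box format, the function name `object_size_stats`, and the returned keys are assumptions made for this illustration; the source does not define how the statistics are computed.

```python
from statistics import mean
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax) in pixels


def object_size_stats(boxes: List[Box], image_width: int,
                      image_height: int) -> Dict[str, float]:
    """Average share of the image covered by one object, and average width/height ratio."""
    image_area = image_width * image_height
    area_ratios = [(x2 - x1) * (y2 - y1) / image_area for x1, y1, x2, y2 in boxes]
    aspect_ratios = [(x2 - x1) / (y2 - y1) for x1, y1, x2, y2 in boxes]
    return {
        "mean_area_ratio": mean(area_ratios),
        "mean_aspect_ratio": mean(aspect_ratios),
    }


# Two hypothetical boxes in a 200 x 100 image.
stats = object_size_stats([(0, 0, 100, 50), (0, 0, 50, 100)], 200, 100)
print(stats)
```

A small mean area ratio would suggest favouring a net structure suited to small objects, which is the kind of decision the net structure determination unit is described as making.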

The net structure determination unit (DEEP-NET STRUC DECESION) 16 determines the optimal net structure to apply based on the analysis results of the object distribution analysis unit (OBJ LOC DISTRIB) 10, the required processing speed analysis unit (TAC-TIME EVAL) 11, and the size analysis unit (OBJ SIZE EVAL) 12.

The detailed configuration of the source data processing unit 200 is shown in FIG. 3.

As shown in FIG. 3, the source data processing unit 200 includes: an image data processing unit (Image data) that processes the training, validation, and test data sets; an XML file processing unit (XML Files) that processes the label information of the data set; a model management unit (Model Config) that manages the deep net information; a training parameter information processing unit (Train Config) that processes training parameter information; a validation parameter information processing unit (EvalConfig) that processes validation parameter information; and a data set path management unit (Input Config) that manages the data set paths.
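As an illustration of what processing XML label files might involve, the sketch below reads object names and bounding boxes from an annotation string. The source does not specify the XML schema; a Pascal VOC-style layout (`object`, `name`, `bndbox`) is assumed here, and the sample content and the function `parse_labels` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical annotation in an assumed Pascal VOC-style layout.
SAMPLE_XML = """<annotation>
  <filename>sample.jpg</filename>
  <size><width>640</width><height>480</height></size>
  <object>
    <name>scissors</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>150</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>"""


def parse_labels(xml_text: str) -> List[Tuple[str, Tuple[int, int, int, int]]]:
    """Return (class name, (xmin, ymin, xmax, ymax)) for every object element."""
    root = ET.fromstring(xml_text)
    labels = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        labels.append((obj.findtext("name"), coords))
    return labels


labels = parse_labels(SAMPLE_XML)
print(labels)
```

Labels in this form can feed both the preprocessing analyses and the encoding step that produces the binary file.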

The detailed configuration of the option determination unit (Optimal Option Selector) 400 is shown in FIG. 4.

The option determination unit 400 applies a reinforcement learning-based search technique to option search. It includes a controller (Controller) 41 that analyzes the current state and a reward value indicating whether training is appropriate; an option sampler (Data Aug Opt sampler) 42 that generates data augmentation options and their order; an update unit (Train Config Update) 43 that updates the Train Config; a model construction unit (Model Construction) 44 that loads from the Source Data and constructs the model required by the Tensorflow Object Detection API; a verification unit (Train & Eval) 45 that trains the model and verifies its accuracy; an option storage unit (Save Aug Opt) 46 that stores the data augmentation options in a buffer after the accuracy has been analyzed; an optimal option storage unit (Best Option) 47 that stores the model option that yielded the highest accuracy; and a reward value providing unit (RewardCAL) 48 that provides an appropriate reward value derived from the accuracy to the controller 41. These operations are repeated to obtain the model parameters that yield the highest accuracy.

Here, data augmentation is a method of changing the pixels of an image without changing its label, so that training can proceed using the transformed data.
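As a minimal sketch of this idea, the following example shifts the brightness of a small grayscale image while leaving its label untouched. The brightness shift, the nested-list image representation, and the label "bolt" are illustrative assumptions; the specification does not list the augmentation options it uses.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixel values in [0, 255]


def adjust_brightness(image: Image, delta: int) -> Image:
    """Return a copy of the image with each pixel shifted by delta, clamped to [0, 255]."""
    return [[max(0, min(255, pixel + delta)) for pixel in row] for row in image]


def augment(sample: Tuple[Image, str], delta: int) -> Tuple[Image, str]:
    """Change the pixels only; the label is passed through unchanged."""
    image, label = sample
    return adjust_brightness(image, delta), label


# Hypothetical labeled sample.
original = ([[10, 200], [250, 0]], "bolt")
augmented = augment(original, 20)
```

The augmented sample can be added to the training set alongside the original, since the object class it depicts has not changed.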

As described, the option determination unit 400 according to the present invention applies a reinforcement learning-based search technique to option search.

Among the main reinforcement learning algorithms, the policy gradient technique relies on the following assumption: when a combination of many factors in a complex multidimensional space is taken to represent some probability distribution, the optimal distribution can be obtained by repeatedly drawing a small number of samples from that distribution.

Each selectable option is regarded as forming one axis of the multidimensional space, and the combinations of options form a probability distribution over performance in that space.

When one sample is selected in the space, the performance of the learner can be evaluated using the combination of options that the sample represents, and the evaluated performance is used as the reinforcement learning reward to detect the optimal distribution on a policy gradient basis.
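The sampling-and-reward cycle described above can be sketched with a simple REINFORCE-style update, offered here only as an illustration. Several simplifications are assumptions of this sketch and not taken from the specification: each option is modeled as an independent on/off choice with its own logit, a moving-average baseline is subtracted from the reward, and the function `toy_accuracy` stands in for the real train-and-evaluate step.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def toy_accuracy(options, hidden_best):
    """Hypothetical stand-in for training and evaluation:
    the fraction of options that match a hidden best setting."""
    return sum(int(o == b) for o, b in zip(options, hidden_best)) / len(options)


def search_options(hidden_best, steps=3000, lr=0.5, seed=0):
    """Learn per-option selection probabilities from sampled rewards."""
    rng = random.Random(seed)
    logits = [0.0] * len(hidden_best)  # one axis per option
    baseline = 0.0
    for _ in range(steps):
        probs = [sigmoid(t) for t in logits]
        # Draw one sample (a combination of options) from the current distribution.
        sample = [1 if rng.random() < p else 0 for p in probs]
        reward = toy_accuracy(sample, hidden_best)
        advantage = reward - baseline
        # For a Bernoulli policy with a sigmoid, d log p(a) / d logit = a - p.
        logits = [t + lr * advantage * (a - p) for t, a, p in zip(logits, sample, probs)]
        baseline = 0.9 * baseline + 0.1 * reward
    return [sigmoid(t) for t in logits]


hidden_best = [1, 0, 1, 1, 0]
learned = search_options(hidden_best)
```

In this toy setting the learned probabilities drift toward the hidden best combination. In the apparatus, the reward would instead come from the accuracy measured by the verification unit 45.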

The detection algorithm (Controller) may use, for example, an LSTM network (Long Short Term Memory network), which is one of the deep learning techniques.

In the output of the LSTM network, odd time steps select an option, and even time steps select whether or not to apply that option.

The output of each time step is passed as the input token of the next time step.

For example, when there is a list of five options, the first output selects one of the five options, and the selected option is passed as the input of the second time step. The second output of the LSTM network indicates whether or not the selected option is applied, and that output is in turn passed as the input of the next time step. The process continues in this manner.
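A minimal sketch of how such an alternating output sequence might be decoded is given below. The option names, the integer token encoding, and the function `decode_controller_outputs` are hypothetical; the specification does not define the token format, and the LSTM itself is not modeled here.

```python
from typing import List, Tuple


def decode_controller_outputs(outputs: List[int], option_names: List[str]) -> List[Tuple[str, bool]]:
    """Pair each odd-step option index with the following even-step apply flag."""
    if len(outputs) % 2 != 0:
        raise ValueError("expected an even number of outputs (option, flag pairs)")
    decisions = []
    for i in range(0, len(outputs), 2):
        option_index = outputs[i]       # odd time step: which option
        apply_flag = outputs[i + 1]     # even time step: apply it or not (1 = apply)
        decisions.append((option_names[option_index], apply_flag == 1))
    return decisions


# Hypothetical list of five augmentation options.
option_names = ["flip", "crop", "brightness", "contrast", "rotate"]
# Assumed controller outputs for six time steps.
outputs = [2, 1, 0, 0, 4, 1]
plan = decode_controller_outputs(outputs, option_names)
```

The order of the decoded pairs also gives the order in which the selected augmentation options would be applied.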

A training model is constructed (Model Construction) from the combination of output options (Data Aug Opt Sampler), and the distribution that gives the best performance is output (Best Option) through the optimization process (Train & Eval).

The detailed configuration of the learning execution unit 500 is shown in FIG. 5.

The learning execution unit (Final Tensorflow Object Detection API) 500 includes a training unit (Train Stage) 51 that trains using the model option determined by the option determination unit 400; a model evaluation unit (Evaluation State) 52 that evaluates model performance during training; a monitoring unit (Tensorboard) 53 that monitors the training and evaluation processes; and a model storage unit (ModelExporter) 54 that stores the trained model.

The apparatus for image recognition through machine learning according to the present invention, configured as described above, operates as follows.

FIG. 6 is a flowchart illustrating a method for image recognition through machine learning according to the present invention.

First, the distribution of the positions at which objects lie in the image to be detected is analyzed (S601).

Next, the average proportion of the image occupied by a target object is determined by analyzing the aspect ratios of the objects and the distribution of their areas (S602), and the structure of the neural network to be applied to deep learning is selected (S603).

Next, the training, validation, and test data sets are processed, together with the label information of the data sets, the Deep-Net information, the training parameter information, the validation parameter information, and the data set paths, and a file encoding the data sets and this information is generated (S604).

Then, a reinforcement learning-based search technique is applied to option search, and the options that yield the model parameters with the highest accuracy are determined (S605).

Next, training is performed using the Model Option obtained in the reinforcement learning process, model performance is evaluated, the training and evaluation processes are monitored, and the trained model is stored (S606).

Here, the preprocessing (S601, S602) for selecting the structure of the neural network to be applied to deep learning may include: an object distribution analysis step of analyzing the positions, or the distribution of positions, of objects in the image by means of Binary vision and the Blob detector package detector; a region of interest selection step of extracting the portion to be actually processed according to the result of the object distribution analysis; a required processing speed analysis step of analyzing the real-time processing speed required of the inspection equipment; a user setting step of making the user settings required for the Deep-Net configuration; a size analysis step of analyzing the average size of the objects; and a ratio analysis step of analyzing the proportion of the image occupied by the objects.
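The object distribution analysis step might be sketched as follows, purely as an illustration. The specification does not disclose its blob detector, so this sketch substitutes a simple threshold followed by 4-connected flood fill, and reports the centroid of each connected region as the object position. The function name `blob_centroids` and the threshold value are assumptions.

```python
from collections import deque
from typing import List, Tuple


def blob_centroids(image: List[List[int]], threshold: int) -> List[Tuple[float, float]]:
    """Return (row, column) centroids of 4-connected regions brighter than threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or seen[r][c]:
                continue
            # Breadth-first flood fill collects all pixels of one blob.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            mean_row = sum(y for y, _ in pixels) / len(pixels)
            mean_col = sum(x for _, x in pixels) / len(pixels)
            centroids.append((mean_row, mean_col))
    return centroids


# Hypothetical 4 x 5 grayscale image containing two bright objects.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 0, 0, 0, 8],
]
centroids = blob_centroids(image, threshold=5)
```

Collecting such centroids over many images gives an empirical distribution of object positions, from which a region of interest could then be chosen.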

The step (S605) of determining the options that yield the model parameters with the highest accuracy may include: a step of analyzing the current state and a reward value indicating whether training is appropriate; a step of generating data augmentation options and their order; an update step of updating the Train Config; a model construction step of loading from the Source Data and constructing the model required by the Tensorflow Object Detection API; a verification step of training the model and verifying its accuracy; an option storage step of storing the data augmentation options in a buffer after the accuracy has been analyzed; an optimal option storage step of storing the model option that yielded the highest accuracy; and a reward value providing step of providing an appropriate reward value derived from the accuracy to the controller (Controller).
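The bookkeeping among these steps (sampling options, evaluating, buffering, keeping the best option, and returning a reward) can be sketched as the loop below. To keep the sketch short, uniform random sampling replaces the reinforcement learning controller, and the caller supplies a hypothetical `evaluate` function in place of actual training; neither choice is taken from the specification.

```python
import random
from typing import Callable, List, Tuple


def run_search(option_names: List[str],
               evaluate: Callable[[List[str]], float],
               rounds: int = 20,
               seed: int = 0):
    """Repeat sample -> evaluate -> store, and keep the best-scoring option set."""
    rng = random.Random(seed)
    buffer: List[Tuple[List[str], float]] = []  # all tried options with accuracy
    rewards: List[float] = []                   # values that would go back to the controller
    best = None
    for _ in range(rounds):
        # Stand-in for the option sampler: a random subset in a random order.
        count = rng.randint(1, len(option_names))
        sampled = rng.sample(option_names, count)
        accuracy = evaluate(sampled)            # stand-in for training and evaluation
        buffer.append((sampled, accuracy))
        if best is None or accuracy > best[1]:
            best = (sampled, accuracy)
        rewards.append(accuracy)                # here the reward is simply the accuracy
    return best, buffer, rewards


# Hypothetical options and a toy scoring function.
names = ["flip", "crop", "brightness", "contrast", "rotate"]
best, buffer, rewards = run_search(names, evaluate=lambda opts: len(opts) / len(names))
```

Replacing the random sampler with a learned controller turns this loop into the reinforcement learning-based search that the step S605 describes.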

The apparatus and method for image recognition through machine learning according to the present invention described above apply a reinforcement learning-based search technique to option search, making it possible to determine the options that yield the model parameters with the highest accuracy.

The present invention uses Google's TensorFlow framework so that the position and class of an object that the user wants to detect can be determined accurately using deep learning.

As described above, it will be understood that the present invention may be implemented in modified forms without departing from its essential characteristics.

Therefore, the disclosed embodiments should be considered in a descriptive sense rather than a limiting one. The scope of the present invention is set forth in the claims rather than in the foregoing description, and all differences within the equivalent scope should be construed as being included in the present invention.

100. Data preprocessing unit 200. Source data processing unit
300. Binary file generation unit 400. Option determination unit
500. Learning execution unit

Claims

An apparatus for image recognition through machine learning, comprising:
a data preprocessing unit that analyzes the distribution of object positions in the image to be detected, allows a user to select settings such as the net structure, and determines the net structure by analyzing the average size of the objects to be detected;
a source data processing unit that processes training, validation, and test data;
a binary file generation unit that outputs a file encoding the data set and the information;
an option determination unit that applies a reinforcement learning-based search technique to option search and determines the options that yield the model parameters with the highest accuracy; and
a learning execution unit that trains using the model option determined by the option determination unit, evaluates model performance, monitors the training and evaluation processes, and stores the trained model.

The apparatus of claim 1, wherein the data preprocessing unit comprises:
an object distribution analysis unit (OBJ LOC DISTRIB) that analyzes the distribution of object positions in the image, and a region of interest selection unit (ROI SEL) that extracts the portion to be actually processed according to the analysis result of the object distribution analysis unit (OBJ LOC DISTRIB);
a required processing speed analysis unit (TAC-TIME EVAL) that analyzes the real-time processing speed required of the inspection equipment, and a user setting unit (USER SETTING) that makes the user settings required for the Deep-Net configuration;
a size analysis unit (OBJ SIZE EVAL) that analyzes the average size of the objects, and a ratio analysis unit (OBJ-IMG RATIO EVAL) that analyzes the proportion of the image occupied by the objects; and
a net structure determination unit (DEEP-NET STRUC DECESION) that determines a suitable net structure based on the analyses of the object distribution, the processing speed, the average object size, and the object ratio.

The apparatus of claim 1, wherein the source data processing unit comprises:
an image data processing unit (Image data) that processes the training, validation, and test data sets;
an XML file processing unit (XML Files) that processes the label information of the data sets;
a model management unit (Model Config) that manages the Deep-Net information;
a training parameter information processing unit (Train Config) that processes the training parameter information;
a validation parameter information processing unit (EvalConfig) that processes the validation parameter information; and
a data set path management unit (Input Config) that manages the data set paths.

The apparatus of claim 1, wherein the option determination unit comprises:
a controller (Controller) that analyzes the current state and a reward value indicating whether training is appropriate;
an option sampler (Data Aug Opt sampler) that generates data augmentation options and their order;
an update unit (Train Config Update) that updates the Train Config;
a model construction unit (Model Construction) that loads from the source data and constructs the model required by the learning execution unit (Tensorflow Object Detection API);
a verification unit (Train & Eval) that trains the model and verifies its accuracy;
an option storage unit (Save Aug Opt) that stores the data augmentation options in a buffer after the accuracy has been analyzed;
an optimal option storage unit (Best Option) that stores the model option that yielded the highest accuracy; and
a reward value providing unit (RewardCAL) that provides a reward value derived from the accuracy to the controller.

The apparatus of claim 1 or 4, wherein the option determination unit,
when one sample is selected in the space, evaluates the performance of the learner using the combination of options that the sample represents, and detects the optimal distribution on a policy gradient basis using the evaluated performance as the reinforcement learning reward.

The apparatus of claim 1 or 4, wherein, in the option determination unit,
the detection algorithm uses an LSTM network (Long Short Term Memory network), and
in the output of the LSTM network, according to the time step, odd time steps select an option and even time steps select whether or not to apply that option.

The apparatus of claim 1, wherein the learning execution unit (Final Tensorflow Object Detection API) comprises:
a training unit (Train Stage) that trains using the model option determined by the option determination unit;
a model evaluation unit (Evaluation State) that evaluates model performance during training;
a monitoring unit (Tensorboard) that monitors the training and evaluation processes; and
a model storage unit (ModelExporter) that stores the trained model.

A method for image recognition through machine learning, comprising: a data preprocessing step of analyzing the distribution of object positions in the image to be detected, allowing a user to select settings such as the net structure, and determining the net structure by analyzing the average size of the objects to be detected;
a source data processing step of processing training, validation, and test data;
a binary file generation step of outputting a file encoding the data set and the information;
an option determination step of applying a reinforcement learning-based search technique to option search and determining the options that yield the model parameters with the highest accuracy; and
a learning execution step of training using the model option determined in the option determination step, evaluating model performance, monitoring the training and evaluation processes, and storing the trained model.

The method of claim 8, wherein the data preprocessing step comprises:
an object distribution analysis step of analyzing the positions, or the distribution of positions, of objects in the image;
a region of interest selection step of extracting the portion to be actually processed according to the result of the object distribution analysis;
a required processing speed analysis step of analyzing the real-time processing speed required of the inspection equipment;
a user setting step of making the user settings required for the Deep-Net configuration; and
a size analysis step of analyzing the average size of the objects, and a ratio analysis step of analyzing the proportion of the image occupied by the objects.

The method of claim 8, wherein the option determination step comprises:
a step of analyzing the current state and a reward value indicating whether training is appropriate;
a step of generating data augmentation options and their order;
an update step of updating the Train Config;
a model construction step of loading from the source data and constructing the model required by the learning execution unit (Tensorflow Object Detection API);
a verification step of training the model and verifying its accuracy;
an option storage step of storing the data augmentation options in a buffer after the accuracy has been analyzed;
an optimal option storage step of storing the model option that yielded the highest accuracy; and
a reward value providing step of providing a reward value derived from the accuracy to the controller.