KR20200005324A

KR20200005324A - Method and system for recognizing images based on machine learning

Info

Publication number: KR20200005324A
Application number: KR1020180078910A
Authority: KR
Inventors: 이훈희; 정다운; 최한림; 손승희; 류동영; 주광혁
Original assignee: 한국항공우주연구원
Priority date: 2018-07-06
Filing date: 2018-07-06
Publication date: 2020-01-15
Also published as: KR102112754B1

Abstract

Disclosed are a machine learning-based image recognition method and a machine learning-based image recognition system. The machine learning-based image recognition method according to an embodiment of the present invention comprises the steps of: recording, in a machine learning DB, data about a background fragment selected by processing a training image; determining whether the background fragment is included in an input actual image; and, when the background fragment is included in the actual image, retrieving the data from the machine learning DB and outputting information recognized in relation to the actual image on a screen by using the retrieved data.

Description

Machine learning based image recognition method and machine learning based image recognition system {METHOD AND SYSTEM FOR RECOGNIZING IMAGES BASED ON MACHINE LEARNING}

The present invention relates to a technology for recognizing, based on machine learning, the background together with objects in a camera image in an integrated manner. More specifically, it relates to a machine learning-based image recognition method and a machine learning-based image recognition system that select, from a training image, useful background fragments capable of serving as objects, learn them by machine learning, and combine them with objects to improve recognition performance on an input actual image.

FIG. 1 is a diagram illustrating an example of image recognition in a machine learning system according to a conventional embodiment.

Referring to FIG. 1, in the machine learning system 100 according to a conventional embodiment, the mechanism for processing the training images used for machine learning has been centered mainly on objects defined by humans.

For example, the machine learning system 100 has regarded everything in a training image other than specified objects, such as a bicycle, an airplane, or a person, as background, and research has treated that background as something to be segmented out or removed.

For example, given an image of an arbitrary area captured from a long distance, the machine learning system 100 can segment the image into 'house', 'road', 'forest', and so on, but it cannot single out and learn elements that could serve as a single object within background such as 'road' and 'forest', as opposed to 'house'.

In addition, the shadows of objects in the image were also regarded as background, and processing to separate or recognize the background instead became a cause of degraded performance.

The conventional machine learning system 100 is applied in various ways to classification, localization, multi-object detection, segmentation, registration, and the like. For example, classification refers to classifying an image that contains the object 'cat'; localization refers to specifying the position of the object 'cat' in the image; multi-object detection refers to simultaneously recognizing various objects in the image, such as 'cat', 'duck', and 'dog'; and segmentation refers to determining the shape of each object in the image and dividing and separating it out.

Therefore, various systems that use the machine learning system 100 described above may be left in a very unstable state when no object can be extracted from the input image.

Accordingly, a technology is required that, when a form that the existing object-centered machine learning system 100 cannot define as an object is detected in an image, uses that form as a background fragment capable of serving as an object, so that information can still be recognized from the image.

An object of an embodiment of the present invention is to extract a part of the background of a training image, excluding objects (hereinafter, a 'background fragment'), as a meaningful object, thereby solving the problem that existing object-centered machine learning cannot recognize the background, and to improve recognition performance by combining objects with background fragments.

Another object of an embodiment of the present invention is to calculate the position of each object more accurately by using relationship information between objects and background fragments, and to obtain camera motion information on that basis.

A machine learning-based image recognition method according to an embodiment of the present invention includes: recording, in a machine learning DB, data about a background fragment selected by processing a training image; determining whether the background fragment is included in an input actual image; and, when the determination shows that it is included, retrieving the data from the machine learning DB and outputting on a screen, by using the retrieved data, information recognized in relation to the actual image.

In addition, a machine learning-based image recognition system according to an embodiment of the present invention includes: a learning processing unit that records, in a machine learning DB, data about a background fragment selected by processing a training image; and a recognition processing unit that determines whether the background fragment is included in an input actual image and, when the determination shows that it is included, retrieves the data from the machine learning DB and outputs on a screen, by using the retrieved data, information recognized in relation to the actual image.

According to an embodiment of the present invention, a part of the background of a training image, excluding objects (hereinafter, a 'background fragment'), is extracted as a meaningful object. This solves the problem that existing object-centered machine learning cannot recognize the background, and the combination of objects and background fragments can raise recognition performance and accuracy.

According to an embodiment of the present invention, even when no object is extracted from the input actual image, an environment can be provided in which navigation information such as position and direction is recognized by using the learned background fragments.

According to an embodiment of the present invention, the position of each object can be calculated more accurately by using relationship information between objects and background fragments, and camera motion information can be obtained on that basis.

According to an embodiment of the present invention, an integrated recognition system can be provided that is applicable to systems that reference digital terrain information, systems that find registration reference points in images, systems that extract navigation information from images, machine learning-based application products, and the like.

FIG. 1 is a diagram illustrating an example of image recognition in a machine learning system according to a conventional embodiment.
FIG. 2 is a block diagram illustrating the configuration of a machine learning-based image recognition system according to an embodiment of the present invention.
FIG. 3A is a diagram illustrating the configuration of a window region to be extracted from a training image in an image recognition system according to an embodiment of the present invention.
FIG. 3B is a diagram illustrating an example of extracting a plurality of window regions from a training image in an image recognition system according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating a process of machine learning background fragments by using a training image in an image recognition system according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a process of recognizing a background fragment in an actual image and outputting information in an image recognition system according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating a process of sequentially extracting a plurality of window regions in an image recognition system according to an embodiment of the present invention.
FIGS. 7A and 7B are diagrams illustrating an example of extracting window regions by dividing a background area into a plurality of grids in an image recognition system according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating a process of building a machine learning DB by selecting background fragments from a training image in an image recognition system according to an embodiment of the present invention.
FIG. 9 is a flowchart illustrating the sequence of a machine learning-based image recognition method according to an embodiment of the present invention.

Hereinafter, a machine learning-based image recognition method and image recognition system according to an embodiment of the present invention will be described in detail with reference to the accompanying drawings. However, the present invention is not restricted or limited by the embodiments. Like reference numerals in the drawings denote like elements.

FIG. 2 is a block diagram illustrating the configuration of a machine learning-based image recognition system according to an embodiment of the present invention.

Referring to FIG. 2, the image recognition system 200 according to an embodiment of the present invention may include a learning processing unit 210, a machine learning DB 220, and a recognition processing unit 230.

The learning processing unit 210 records, in the machine learning DB 220, data about background fragments selected by processing a training image.

That is, the learning processing unit 210 may select a part of the background area in a training image input during the machine learning process as a 'background fragment' capable of serving as an object.

Here, the background area may refer to the entire area of the training image excluding specified objects such as a bicycle, a person, a dolphin, a car, an airplane, or a house. The learning processing unit 210 may select background fragments alongside the extraction of objects from the training image, or even when no object is extracted from the training image.

Specifically, the learning processing unit 210 may extract, from the background area of the training image, at least one window region that is a candidate background fragment, and may select, from among the extracted window regions, those for which high recognition performance is calculated as background fragments capable of serving as objects.

For example, referring to FIGS. 3A and 3B, the learning processing unit 210 may extract, from the training image 320 shown in FIG. 3B, rectangular window regions (Window 1, Window 2, Window 3, Window 4) 310 having the plurality of parameters shown in FIG. 3A.

In this case, the learning processing unit 210 may obtain a large number of training images 320 from real data, or may prepare virtually synthesized training images 320, according to at least one set imaging condition among shading, resolution, brightness, color, position, and direction.

The plurality of parameters may be at least one of the horizontal size, vertical size, horizontal movement width, vertical movement width, center point position, and rotation angle of the window region 310, and relationship information (for example, distance or spacing) with other window regions.

The value of each parameter may be set in advance, but may be changed in the course of extracting each window region 310.
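The parameter set above can be pictured as a small record type. The following is a minimal sketch, assuming pixel units and an axis-aligned rectangle; the class name `WindowRegion` and all field names are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WindowRegion:
    """Rectangular candidate region (hypothetical field names, pixel units assumed)."""
    cx: float                # center point x
    cy: float                # center point y
    width: float             # horizontal size
    height: float            # vertical size
    angle_deg: float = 0.0   # rotation angle about the center point
    step_x: float = 0.0      # horizontal movement width
    step_y: float = 0.0      # vertical movement width

    def bounds(self):
        """Axis-aligned bounds (left, top, right, bottom); rotation is ignored here."""
        half_w, half_h = self.width / 2, self.height / 2
        return (self.cx - half_w, self.cy - half_h, self.cx + half_w, self.cy + half_h)

    def shifted(self):
        """Return a copy moved by one horizontal and one vertical step."""
        return replace(self, cx=self.cx + self.step_x, cy=self.cy + self.step_y)


# Parameters are preset, then changed while the window is being extracted.
w1 = WindowRegion(cx=100, cy=80, width=40, height=20, step_x=10, step_y=5)
w1_resized = replace(w1, width=60, height=30)
```

Keeping the record immutable and deriving changed copies with `replace` mirrors the idea that preset values may be adjusted during extraction without losing the original setting.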

For example, referring to the training images 630 and 640 of FIG. 6, the learning processing unit 210 may set the distribution of the window regions by adjusting their center point positions according to relationship information (for example, distance, spacing, and direction) among the four window regions (Window 1, Window 2, Window 3, Window 4) set in the training image 630, and may adjust the parameters of each window region to change its size, as in the training image 640.

The shape of the window region is not limited to a rectangle and may be any shape, such as a triangle or a circle. This specification uses rectangular window regions as the example because they make it easy to control the window region by adjusting parameters.

The learning processing unit 210 may select background fragments from part or all of the background area of the training image. When it selects background fragments capable of serving as objects over the entire area, it can automatically generate an object map of the training image (for example, of the 'lunar surface').

The learning processing unit 210 may extract window regions from the background area of the training image in various ways.

As one example, the learning processing unit 210 may divide the background area into a grid structure and set the parameters for n window regions to be extracted from the divided background area.

For example, referring to FIG. 7A, the learning processing unit 210 may divide the background area into a grid of five areas: the entire background area (grid 1) and the four areas obtained by dividing the background area into quarters (grids 2 to 5).

The learning processing unit 210 may extract a predetermined number (n) of window regions from the background area.

Referring to FIG. 7B, the learning processing unit 210 may set the center point positions and the horizontal and vertical sizes of a total of four window regions to be extracted from the grid-structured background area and may, for example, extract window regions (Window 1, Window 2) from grid 1 and window regions (Window 3, Window 4) from grid 5.
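The five-grid layout and the placement of windows inside a chosen grid can be sketched as follows. This is an illustration only: the function names, the ordering of grids 2 to 5 (top-left, top-right, bottom-left, bottom-right), and the relative-coordinate placement are assumptions, since the specification defers those details to FIGS. 7A and 7B.

```python
def split_into_grids(width, height):
    """Return {grid_id: (left, top, right, bottom)}: the whole area plus its four quarters."""
    half_w, half_h = width / 2, height / 2
    return {
        1: (0, 0, width, height),             # grid 1: entire background area
        2: (0, 0, half_w, half_h),            # grids 2-5: quarters (order assumed)
        3: (half_w, 0, width, half_h),
        4: (0, half_h, half_w, height),
        5: (half_w, half_h, width, height),
    }


def window_in_grid(grid, rel_cx, rel_cy, win_w, win_h):
    """Place a window at a relative center (0..1) inside a grid, clamped to stay inside it."""
    left, top, right, bottom = grid
    cx = left + rel_cx * (right - left)
    cy = top + rel_cy * (bottom - top)
    cx = min(max(cx, left + win_w / 2), right - win_w / 2)
    cy = min(max(cy, top + win_h / 2), bottom - win_h / 2)
    return (cx, cy, win_w, win_h)


grids = split_into_grids(640, 480)
# Two windows from grid 1 and two from grid 5, as in the FIG. 7B example.
windows = [
    window_in_grid(grids[1], 0.25, 0.30, 40, 20),
    window_in_grid(grids[1], 0.60, 0.20, 40, 20),
    window_in_grid(grids[5], 0.50, 0.50, 40, 20),
    window_in_grid(grids[5], 0.00, 0.00, 40, 20),  # clamped to the grid's corner
]
```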

As another example, the learning processing unit 210 may divide the background area by direction into a plurality of regions, such as east, west, south, and north, and further subdivide each region into detailed zones by size for extraction.

In addition, the learning processing unit 210 may use a previously extracted window region to extract the remaining window regions according to their positional or distance relationship to other window regions.

Specifically, referring to FIG. 6, the learning processing unit 210 may extract an arbitrary area within the background area of the training images 610 to 640 as a first window region (Window 2), extract a second window region (Window 1) located on a circle (C2) within a radius m of the center point of the first window region, and extract third window regions (Window 4, Window 3) located within a certain distance of the line (L1) connecting the center points of the first and second window regions.
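The geometric constraints in this sequential extraction can be sketched with random sampling of center points. The sketch below is an assumption-laden illustration: the specification does not say how candidate positions are generated, so random sampling, the function names, and the parameters `radius` (the radius m above) and `max_dist` are all hypothetical.

```python
import math
import random


def sample_within_radius(center, radius, rng):
    """Sample a point inside the circle of the given radius around center."""
    # The lower bound keeps r strictly positive so the two centers never coincide.
    r = radius * math.sqrt(rng.uniform(0.01, 1.0))
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (center[0] + r * math.cos(theta), center[1] + r * math.sin(theta))


def distance_to_line(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (ay - py) - (ax - px) * (by - ay)
    return abs(cross) / math.hypot(bx - ax, by - ay)


def sample_near_line(a, b, max_dist, rng):
    """Sample a point between a and b, offset from the line by at most max_dist."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    t = rng.random()                        # position along the segment a-b
    off = rng.uniform(-max_dist, max_dist)  # signed perpendicular offset
    return (a[0] + t * dx - off * dy / length, a[1] + t * dy + off * dx / length)


def extract_centers(first, radius, max_dist, n_third=2, seed=0):
    """Return centers [first, second, third, ...] in the order described above."""
    rng = random.Random(seed)
    second = sample_within_radius(first, radius, rng)
    thirds = [sample_near_line(first, second, max_dist, rng) for _ in range(n_third)]
    return [first, second, *thirds]


centers = extract_centers(first=(100.0, 100.0), radius=50.0, max_dist=10.0)
```

In practice the sampled centers would also have to fall inside the background area; that check is omitted here to keep the geometry visible.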

The learning processing unit 210 may select the background fragments from among the n extracted window regions. As one example, the learning processing unit 210 may extract window regions that include at least the background area in the training image and, when the number of extracted window regions reaches a user-defined m (where m is a natural number of 1 or more), select each of the m extracted window regions as a background fragment.

That is, as shown in FIG. 4, once all of the predetermined m (m = 4) window regions have been selected from the background area of the training image, the learning processing unit 210 may select each window region as a background fragment.

Specifically, for each of the n extracted window regions, the learning processing unit 210 may calculate performance by combining at least one of the following, computed at an intermediate or final layer of a convolutional neural network: the center point position error, the rotation angle error of the window region about its center point, whether the actual imaged area is identified, the horizontal and vertical size errors, and the distance error between window regions. The performance is calculated using window regions other than those used for learning. The learning processing unit 210 may then repeat a process of deleting the window regions whose performance does not satisfy a minimum reference value, extracting as many additional window regions as were deleted, and calculating the performance of the additionally extracted window regions.

Here, the performance may include at least one of the detection rate for unspecified window regions and the accuracy of position identification using the extracted window regions.

In particular, the accuracy indicates whether a detected window region can play an identifying role as an object within the background, and the required accuracy may be set to a value equal to or greater than the minimum reference value applied when judging accuracy in conventional general object extraction. That is, the learning processing unit 210 may select, as background fragments, window regions whose accuracy is at the level of general objects.

The learning processing unit 210 may delete the window regions whose calculated performance does not satisfy the minimum reference value, thereby excluding them from the background fragment candidates. The learning processing unit 210 may then extract as many additional window regions as were deleted and calculate the performance of the additionally extracted window regions.
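The delete-and-re-extract loop can be sketched as below. This is a simplified illustration, not the specification's procedure in full: `sample_window` and `score` are hypothetical callables standing in for window extraction and for the neural-network-based performance calculation, and the `max_rounds` cap is an added safeguard that the source does not mention.

```python
import random


def select_fragments(sample_window, score, n, min_score, max_rounds=50):
    """Collect n windows whose score meets min_score, re-extracting rejected ones."""
    kept = []
    for _ in range(max_rounds):
        # Extract only as many new candidates as are still missing.
        candidates = [sample_window() for _ in range(n - len(kept))]
        # Score the newly extracted candidates and drop those below the threshold.
        kept += [w for w in candidates if score(w) >= min_score]
        if len(kept) == n:
            break
    return kept  # may hold fewer than n windows if max_rounds is exhausted


# Toy stand-ins: a "window" is just a number and its score is the number itself.
rng = random.Random(0)
fragments = select_fragments(
    sample_window=rng.random,
    score=lambda w: w,
    n=4,
    min_score=0.5,
)
```

Each window is scored once, when it is extracted, which matches the description that performance is calculated for the additionally extracted windows rather than recomputed for those already accepted.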

Depending on the embodiment, the learning processing unit 210 may use verification data prepared separately from the training image to sort the performance of the n window regions in ascending order, and then reselect the window regions with the highest performance as the background fragments.
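A minimal sketch of this reselection step follows, assuming the verification scores are already available as a mapping from window ID to a single performance value. The function name, the IDs, and the scores are placeholders for illustration only.

```python
def reselect_top(scores_by_window, k):
    """Sort windows by verification score (ascending) and keep the k best, best first."""
    ranked = sorted(scores_by_window, key=scores_by_window.get)  # ascending order
    return ranked[-k:][::-1] if k > 0 else []


# Placeholder verification scores, not measured results.
validation_scores = {"W1": 0.7, "W2": 0.9, "W3": 0.4, "W4": 0.8}
best_two = reselect_top(validation_scores, k=2)
```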

In this case, the learning processing unit 210 may adjust m in consideration of the performance for the training image.

For example, the learning processing unit 210 may regard the average of the performance values of the m window regions extracted from the training image as the average performance of that training image and, when this performance is higher than the minimum reference value, may increase m so that windows whose performance is somewhat below the minimum reference value are additionally adopted.

When m window regions have been selected as background fragments, the learning processing unit 210 may assign at least one of a unique ID and a name to each window region.

For example, by referring to the center point position, horizontal size, vertical size, and direction, the learning processing unit 210 may assign the name 'Northern Region, Zone B' to the window region (Window 1), the name 'Eastern Region, Zone D' to the window region (Window 2), the name 'Southern Region, Zone A' to the window region (Window 3), and the name 'Western Region, Zone C' to the window region (Window 4).

The learning processing unit 210 may record, in the machine learning DB 220, data including at least one of the unique ID, name, center point position, horizontal size, vertical size, rotation angle, and performance (detection rate, accuracy) of each window region selected as a background fragment.
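One way to picture such a record is a single table row per background fragment. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table name, column names, and every stored value are assumptions for illustration, and the specification does not state which storage technology the machine learning DB uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE background_fragment (
        fragment_id    TEXT PRIMARY KEY,  -- unique ID
        name           TEXT,              -- assigned name
        cx             REAL,              -- center point x
        cy             REAL,              -- center point y
        width          REAL,              -- horizontal size
        height         REAL,              -- vertical size
        angle_deg      REAL,              -- rotation angle
        detection_rate REAL,              -- performance: detection rate
        accuracy       REAL               -- performance: accuracy
    )
    """
)


def record_fragment(conn, row):
    """Insert one background fragment record (nine values, in column order)."""
    conn.execute("INSERT INTO background_fragment VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", row)
    conn.commit()


# Placeholder values for illustration only.
record_fragment(conn, ("W1", "Northern Region, Zone B", 120.0, 40.0, 60.0, 30.0, 0.0, 0.93, 0.88))

stored = conn.execute(
    "SELECT name, cx, cy FROM background_fragment WHERE fragment_id = ?", ("W1",)
).fetchone()
```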

The machine learning DB 220 records and maintains data on the objects detected and identified by processing training images, the selected background fragments, and the artificial neural network coefficients (weights, biases, and the like) and layer structure that result from learning these objects and background fragments.

As one example, the machine learning DB 220 may record at least one item of object data, among the size, shape, position in the image, direction, and relationship with other objects of an extracted object, in association with that object.

In addition, the machine learning DB 220 may record at least one item of data, among the parameters, unique ID, name, and relationship with objects or other background fragments of a selected background fragment, in association with that background fragment.

The recognition processing unit 230 determines whether the background fragment is included in an input actual image.

The recognition processing unit 230 calculates whether the background fragment is included by using the learned artificial neural network coefficients and layer structure obtained from the machine learning DB 220. When the determination shows that it is included, the recognition processing unit 230 retrieves the data corresponding to the background fragment and, using the retrieved data, outputs recognition information related to the actual image on a screen.

For example, referring to FIG. 5, the recognition processing unit 230 may determine whether a background fragment is present in the input actual image 510 and, if a background fragment exists in the actual image 510, retrieve the data 520 recorded for that background fragment from the machine learning DB 220 and output it on the screen.
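The detect-then-look-up flow can be sketched as follows. The detector here is a stub standing in for the trained neural network, and the dictionary stands in for the machine learning DB; the function name, record fields, and IDs are hypothetical.

```python
def recognize(image, detect_fragment_ids, fragment_db):
    """Return the recorded data for each learned background fragment found in the image."""
    found = detect_fragment_ids(image)  # stand-in for the trained network's output
    return [fragment_db[fid] for fid in found if fid in fragment_db]


# Hypothetical records and a stub detector, for illustration only.
fragment_db = {
    "W1": {"name": "Northern Region, Zone B", "center": (120.0, 40.0)},
    "W3": {"name": "Southern Region, Zone A", "center": (300.0, 410.0)},
}


def stub_detector(image):
    """Pretend the network found W3 plus an ID that was never recorded."""
    return ["W3", "W7"]


for record in recognize(None, stub_detector, fragment_db):
    print(record["name"], record["center"])
```

An image with no recognizable fragment simply yields an empty list, which a caller could treat as "no recognition information available".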

In this way, even when no object is identified in the input actual image, the recognition processing unit 230 can recognize navigation information such as position and direction by using the learned background fragments.

In addition, based on at least one item of data among the position, size, rotation angle, and center point position associated with the background fragment, the recognition processing unit 230 may generate and output recognition information about the movement of the imaging camera or of an object recognized from the actual image.

That is, the recognition processing unit 230 can calculate the position of each object more accurately by using relationship information between objects and background fragments, and can obtain camera motion information on that basis.
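As a rough illustration of how fragment data could yield motion information, the sketch below estimates an image-plane shift as the mean displacement between recorded and observed fragment centers. This is a deliberately simple stand-in chosen for this example, not the method of the specification, which does not describe the motion calculation; the function name and all coordinates are hypothetical.

```python
def estimate_shift(recorded_centers, observed_centers):
    """Mean (dx, dy) over fragments present in both mappings, or None if none match."""
    common = recorded_centers.keys() & observed_centers.keys()
    if not common:
        return None
    dx = sum(observed_centers[k][0] - recorded_centers[k][0] for k in common) / len(common)
    dy = sum(observed_centers[k][1] - recorded_centers[k][1] for k in common) / len(common)
    return (dx, dy)


# Placeholder coordinates: centers stored at learning time vs. centers seen now.
recorded = {"W1": (100, 50), "W2": (200, 80)}
observed = {"W1": (110, 45), "W2": (212, 77), "W9": (0, 0)}  # W9 has no record
shift = estimate_shift(recorded, observed)
```

A fuller treatment would also use the recorded size and rotation angle to recover scale and heading changes, which a mean displacement alone cannot capture.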

As another example, when an object is identified in the actual image, the recognition processing unit 230 may generate and output the recognition information by combining the object data recorded for that object in the machine learning DB 220 with the data about the background fragment.

For example, the recognition processing unit 230 may retrain on the background fragments (transfer learning) by using the coefficients and structure of the artificial neural network that learned the objects in the machine learning DB 220. As a result, it may combine the data about the background fragment with the object data about an object ('airplane') and output the result on the screen as the information 520 recognized in relation to the actual image.

In this way, the recognition processing unit 230 extracts a part of the background of the training image, excluding objects (hereinafter, a 'background fragment'), as a meaningful object. This solves the problem that existing object-centered machine learning cannot recognize the background, and the combination of objects and background fragments can raise recognition performance and accuracy.

FIG. 3A is a diagram illustrating the configuration of a window region to be extracted from a training image in an image recognition system according to an embodiment of the present invention, and FIG. 3B is a diagram illustrating an example of extracting a plurality of window regions from a training image.

Referring to FIGS. 3A and 3B, the image recognition system according to an embodiment of the present invention may extract, from the background region of the training image 320, rectangular window regions 310 (Window 1, Window 2, Window 3, Window 4) having the plurality of parameters shown in FIG. 3A.

Here, the window region 310 is not limited to a rectangle and may have any shape, such as a triangle or a circle. The plurality of parameters may be at least one of the center point position, the rotation angle of the window region about the center point, the horizontal and vertical sizes, and the distance between window regions.
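The window parameters listed above can be pictured as a small data structure. The following is a minimal sketch for the rectangular case only; the class name `Window`, its field names, and the helper `center_distance` are illustrative assumptions and do not appear in the description.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Rectangular candidate window region (hypothetical field names)."""
    cx: float          # center point x, in pixels
    cy: float          # center point y, in pixels
    angle_deg: float   # rotation of the window about its center point
    width: float       # horizontal size
    height: float      # vertical size

    def corners(self):
        """Return the four corners of the rotated rectangle."""
        a = math.radians(self.angle_deg)
        cos_a, sin_a = math.cos(a), math.sin(a)
        hw, hh = self.width / 2, self.height / 2
        return [
            (self.cx + dx * cos_a - dy * sin_a,
             self.cy + dx * sin_a + dy * cos_a)
            for dx, dy in ((-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh))
        ]


def center_distance(a: Window, b: Window) -> float:
    """Distance between the center points of two window regions."""
    return math.hypot(a.cx - b.cx, a.cy - b.cy)
```

Other shapes mentioned in the text (triangle, circle) would need their own parameterization; this sketch does not cover them.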

FIG. 4 is a diagram illustrating a process of machine learning a background fragment using a training image in an image recognition system according to an embodiment of the present invention.

FIG. 4 illustrates the specific process by which the learning processing unit 400 in the machine learning-based image recognition system according to an embodiment of the present invention records, in the machine learning DB, data on a background fragment selected by processing a training image.

For example, the learning processing unit 400 may select the position and size of window regions that are candidates for background fragments in the training image input for machine learning (see 320 in FIG. 3B) and extract a predetermined number of window regions.

Specifically, the learning processing unit 400 selects the size, position, and number of window regions in the training image for machine learning. The window regions are configured using the settable parameters described above; if the learning time or the computing performance of the learning system is limited, the parameter values may be determined at random.
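The random fallback described above can be sketched as follows. The function name `sample_windows`, the size bounds, and the dictionary keys are assumptions made for illustration; the description does not state how the random values are drawn.

```python
import random


def sample_windows(img_w, img_h, count, min_size=16, max_size=64, seed=None):
    """Draw `count` window regions with random parameter values.

    Centers are kept far enough from the border that the unrotated
    rectangle fits inside the image; a rotated window may still extend
    slightly past the border.
    """
    rng = random.Random(seed)
    windows = []
    for _ in range(count):
        w = rng.randint(min_size, max_size)
        h = rng.randint(min_size, max_size)
        windows.append({
            "cx": rng.uniform(w / 2, img_w - w / 2),
            "cy": rng.uniform(h / 2, img_h - h / 2),
            "w": w,
            "h": h,
            "angle_deg": rng.uniform(0.0, 360.0),
        })
    return windows
```

Passing a fixed `seed` makes the draw repeatable, which is convenient when comparing window sets across training runs.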

Ideally, data generated from as many different combinations as possible would be used for every area of the image. In particular, where there is no atmosphere, as on the lunar surface, the solar illumination condition is the only variable. In such a static environment, the learning processing unit 400 can find, in a single pass, as many background fragments that can serve as objects as possible and automatically generate an object map of the entire lunar surface.

In addition, the learning processing unit 400 may configure a discriminant network that calculates the recognition accuracy of each extracted window region based on a conventional machine learning method together with position, orientation, and size.

The learning processing unit 400 trains on the data generated according to the window region parameters using a conventional machine learning method (e.g., a CNN in the case of images). To determine whether a window region can serve as an object, a discriminant network (calculation formula) that calculates accuracy based on position, orientation, size, mAP, and the like may be attached at the end of the network.

Here, the calculation formula may be that of an existing detector such as YOLO, SSD, or Faster R-CNN, and the learning processing unit 400 may also calculate the accuracy from the number of detected (extracted) window regions and the position, orientation, and size of each window region.

In addition, the learning processing unit 400 may sort the window regions in ascending order of calculated accuracy and select, as background fragments, the top-ranked window regions whose accuracy is at or above the minimum reference value.

Here, the learning processing unit 400 may sort in ascending order the per-window accuracies calculated using validation data that is kept separate from the training data.

When the accuracy of a window region exceeds the minimum reference value, the learning processing unit 400 regards (selects) it as a background fragment that can serve as an object, and may assign an identifying name or unique ID to the window region selected as a background fragment.
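The sort-and-threshold step can be sketched in a few lines. This is an illustration under stated assumptions: the accuracies are taken as already computed on validation data, and the function name and the `BG-001` style of unique ID are hypothetical.

```python
def select_background_fragments(accuracies, min_reference):
    """Select window regions whose accuracy exceeds the minimum reference.

    accuracies: dict mapping a window label to its validation accuracy.
    Returns a dict mapping a generated unique ID to the window label.
    """
    # Ascending order, as in the description; the best windows end up last.
    ranked = sorted(accuracies.items(), key=lambda item: item[1])
    selected = [label for label, acc in ranked if acc > min_reference]
    # The ID format is an assumption made for this sketch.
    return {f"BG-{i:03d}": label for i, label in enumerate(selected, start=1)}
```

For example, with accuracies of 0.9, 0.4, and 0.7 for Window 1 to Window 3 and a minimum reference value of 0.5, Window 3 and Window 1 are selected and Window 2 is dropped.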

Here, the learning processing unit 400 uses as the minimum reference value the same reference accuracy that a conventional object-centered machine learning system applies when judging the accuracy of an object (for example, an 'airplane') extracted from a training image, so that objects and background fragments in the training image can be treated at the same level.

Thereafter, the learning processing unit 400 may assign a name or unique ID to each window region selected as a background fragment and record it in the machine learning DB.

For example, the learning processing unit 400 may assign the name 'Northern region, Zone B' to Window 1, 'Eastern region, Zone D' to Window 2, 'Southern region, Zone A' to Window 3, and 'Western region, Zone C' to Window 4.

FIG. 5 is a diagram illustrating a process of recognizing background fragments in an actual image and outputting information in an image recognition system according to an embodiment of the present invention.

FIG. 5 illustrates the specific process by which the recognition processing unit 500 in the machine learning-based image recognition system according to an embodiment of the present invention outputs on the screen the information 520 recognized in the input actual image 510.

Here, the recognition processing unit 500 may use a machine learning system trained by transfer learning to generate and output, based on the position, orientation, size, and recognition evaluation of the background fragments, various information 520 recognized from the input actual image 510, including the actual photographed area and the movement of the camera or of an object.

When the actual image 510 is input, the recognition processing unit 500 may classify general objects ('airplane') and background fragments, and recognize and output the information 520 in the actual image 510. The information 520 may be the objects and background fragments themselves, or a combination of the data (names, unique IDs, and the like) recorded for them during machine learning.

As another example, the recognition processing unit 500 may additionally train, by transfer learning, the four previously selected zones on a conventional system that has already learned airplanes, spacecraft, vehicles, and the like, or may train them together from scratch. After training, when an object ('airplane') appears in the image 510, the positions and names of the surrounding zones may be output on the screen as the recognition result 520.

FIG. 6 is a diagram illustrating a process of sequentially extracting a plurality of window regions in an image recognition system according to an embodiment of the present invention.

Referring to FIG. 6, for training images 610 to 640 in which no object is detected, the image recognition system may extract from the background region a plurality of window regions that can serve as objects.

Here, the image recognition system may use a previously extracted window region to extract the remaining window regions sequentially, according to their positional or distance relationships with the other window regions.

That is, because the image recognition system has already learned, during machine learning, the relationship between the center point of a crater on the 'lunar surface' background region and the window regions, it can detect the window regions in the actual image one after another by taking into account their sizes, orientations, positional relationships, and distances.

Specifically, to treat a crater on the 'lunar surface' background region as an object and select it as a background fragment, the image recognition system may detect a part of the background region whose shading, pattern, or color differs as a window region ('Window 1').

In addition, once one window region ('Window 2') is detected, the crater center lies around a circle ('C1') within a certain distance of the center point of the detected 'Window 2', and the image recognition system may detect a second window region ('Window 1') around a circle ('C2') that is again within a certain distance.

The image recognition system may also locate the crater center point and the other window regions ('Window 3', 'Window 4') from the line ('L1') connecting the center points of the two detected window regions ('Window 2', 'Window 1').

When all of the predetermined number ('four') of window regions have been detected, the image recognition system may obtain motion information of the camera capturing the actual image using triangulation and bundle adjustment.

The image recognition system may also change (correct) the size of each window region by adjusting the parameter values; when the sizes change, motion information of the camera capturing the image can also be obtained from the change in size of each window region.
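Why a size change carries motion information can be illustrated with a pinhole-camera approximation. This model is an assumption introduced here, not something the description specifies: for a surface patch facing the camera, apparent size is inversely proportional to distance, so the ratio of window sizes gives the ratio of camera distances. The function name is hypothetical.

```python
def relative_distance(size_before: float, size_after: float) -> float:
    """Return distance_after / distance_before under a pinhole model.

    A value below 1 means the camera moved closer to the background
    fragment; a value above 1 means it moved away.
    """
    if size_before <= 0 or size_after <= 0:
        raise ValueError("window sizes must be positive")
    return size_before / size_after
```

A window that grows from 40 to 80 pixels would, under this approximation, indicate that the camera has halved its distance to that part of the surface.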

The image recognition system may select each window region as a background fragment, assign it a name and unique ID, and record these in the machine learning DB together with the camera motion information.

In this way, the image recognition system can identify the position of an object more accurately than a conventional system, and can readily obtain navigation information and camera motion information even when no object is detected.

FIGS. 7A and 7B are diagrams illustrating an example of extracting window regions by dividing the background region into a plurality of grids in an image recognition system according to an embodiment of the present invention.

Referring to FIGS. 7A and 7B, the image recognition system according to an embodiment of the present invention may divide the background region of the training image into p grids (where p is a natural number of 1 or more, for example p = 5), and extract window regions by setting a center point position and horizontal and vertical sizes within the p grids.

For example, as shown in FIG. 7A, the image recognition system may divide the background region of the training image into five grids: the entire background region (grid 1) and the four quarters of the background region (grids 2 to 5).

Also, as shown in FIG. 7B, the image recognition system may set the center point position and the horizontal and vertical sizes of the window regions to be extracted in each grid, for example extracting Window 1 and Window 2 from grid 1 and Window 3 and Window 4 from grid 5.

As another example, when the background region is divided into three grids, and the center point positions of two window regions are set in grid 1, the center point position of one window region is set in grid 2, and no center point position is set in grid 3, the total number of extracted window regions, summed over the grids, is 3.
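The five-grid partition of FIG. 7A and the per-grid counting in the last example can be sketched as follows. The function names and the order in which the quarters are numbered are assumptions; the description does not say which quarter is grid 2.

```python
def split_background(width, height):
    """Return five grids as (x, y, w, h): the whole area plus its quarters."""
    half_w, half_h = width // 2, height // 2
    grids = [(0, 0, width, height)]  # grid 1: the entire background region
    # Grids 2 to 5: quarters, listed row by row (assumed ordering).
    for y0, h in ((0, half_h), (half_h, height - half_h)):
        for x0, w in ((0, half_w), (half_w, width - half_w)):
            grids.append((x0, y0, w, h))
    return grids


def total_windows(centers_per_grid):
    """Sum the number of window center points set in each grid."""
    return sum(len(centers) for centers in centers_per_grid.values())
```

With two center points in grid 1, one in grid 2, and none in grid 3, `total_windows` returns 3, matching the example in the text.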

FIG. 8 is a diagram illustrating the configuration of an image recognition system according to another embodiment of the present invention.

Referring to FIG. 8, the image recognition system 800 according to an embodiment of the present invention may include a learning processing unit 810, a recognition processing unit 820, and a machine learning DB 830.

The learning processing unit 810 extracts n window regions (where n is a natural number of 1 or more) that at least include the background region in the training image. When the number of candidate regions selected in consideration of the performance calculated for each of the n window regions reaches m (where m is a natural number of 1 or more) set by the user, each of the m candidate regions may be selected as a background fragment and kept in the machine learning DB 830.

For example, for each of the n window regions, the learning processing unit 810 may calculate the performance by combining at least one of the following, computed at an intermediate or final layer of a convolutional neural network ('artificial neural network A' in FIG. 8): the center point position error, the error in the rotation angle of the window region about the center point, whether the actual photographed area is identified, the horizontal and vertical size errors, and the error in the distance between window regions.

In addition, the learning processing unit 810 may select, as candidate regions, those of the n window regions whose performance satisfies the minimum reference value.

Alternatively, the learning processing unit 810 may sort the n window regions in ascending order of performance and select as candidate regions the top-ranked window regions that are at or above the minimum reference value.

For example, when n is set to 10 and m to 3, the learning processing unit 810 may extract 10 window regions sequentially from a background region divided into a grid structure, treating each grid as one window region, or may extract 10 window regions at random without specifying positions in the background region.

The learning processing unit 810 calculates the performance (including accuracy and detection rate) of the 10 extracted window regions. It may select as candidate regions all six window regions whose performance is at or above the minimum reference value ('c'), or it may sort the 10 window regions in ascending order of performance and select the top three as candidate regions. Once the number of selected candidate regions reaches the predetermined m (three), the learning processing unit 810 may select the m (three) best-performing candidate regions as background fragments.

If the number of candidate regions does not reach m, the learning processing unit 810 may delete the window regions whose performance does not satisfy the minimum reference value, extract as many additional window regions as were deleted, and calculate the performance of the newly extracted window regions.
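The extract, score, delete, and re-extract cycle can be sketched as a loop. This is a minimal illustration: `extract` and `score` are hypothetical callables standing in for window extraction and performance calculation, and the `max_rounds` cap is an assumption added so that the loop always terminates.

```python
def select_candidates(extract, score, n, m, min_reference, max_rounds=10):
    """Return up to m best window regions meeting the minimum reference.

    extract: callable returning one new window region (hypothetical).
    score:   callable returning the performance of a window region.
    """
    kept = []          # (performance, window) pairs that passed so far
    to_extract = n
    for _ in range(max_rounds):
        for _ in range(to_extract):
            window = extract()
            performance = score(window)
            if performance >= min_reference:
                kept.append((performance, window))
            # Windows below the reference are simply not kept (deleted).
        if len(kept) >= m:
            break
        to_extract = n - len(kept)  # re-extract as many as were deleted
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [window for _, window in kept[:m]]
```

The sketch returns fewer than m regions if the round limit is reached first; how the described system behaves in that case is not stated.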

If the average performance of the m candidate regions exceeds the minimum reference value by a threshold or more, the learning processing unit 810 may increase m.

In other words, when the average performance of the candidate regions selected as background fragments is far above the minimum reference value, the learning processing unit 810 may increase m so that further background fragments can be selected from the remaining candidate regions not yet selected.

Until the increased m is reached, the learning processing unit 810 may additionally select candidate regions, in order, from the top-ranked window regions when the n window regions are sorted in ascending order of performance, and select these additional candidate regions as further background fragments.
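The rule for increasing m can be sketched as below. The step size of one and the cap at n are assumptions; the description only says that m is increased when the average performance exceeds the minimum reference value by a threshold or more.

```python
def adjust_m(selected_scores, m, min_reference, margin, n):
    """Increase m when the selected fragments perform well above the reference.

    selected_scores: performance values of the m selected candidate regions.
    margin:          the threshold by which the mean must exceed min_reference.
    """
    mean_performance = sum(selected_scores) / len(selected_scores)
    if mean_performance - min_reference >= margin:
        return min(m + 1, n)  # assumed step of one, never more than n windows
    return m
```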

The learning processing unit 810 may determine the minimum reference value in consideration of the performance calculated for the objects in the machine learning DB 830 and, when recording a background fragment in the machine learning DB 830, adjust the minimum reference value by further considering the performance calculated for that background fragment.

In other words, the minimum reference value may be determined from the performance of the objects that were learned through conventional machine learning and are kept in the machine learning DB 830, so that candidate regions with performance (accuracy and detection rate) equivalent to that of the objects are selected as background fragments.

The background fragments recorded in the machine learning DB 830 may be learned by an artificial neural network ('artificial neural network B' in FIG. 8) through transfer learning. The learning processing unit 810 may calculate the performance of the objects already learned by that network through conventional machine learning and of the background fragments, and feed back the average or minimum of these performances to adjust the minimum reference value.
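The feedback adjustment can be sketched as a single function that recomputes the minimum reference value from the performance of the learned objects and background fragments. The function name and the `mode` switch are illustrative assumptions; the text only states that the average or the minimum may be used.

```python
from statistics import mean


def update_min_reference(object_scores, fragment_scores, mode="mean"):
    """Recompute the minimum reference value from learned performances.

    mode: "mean" uses the average of all performances, "min" the lowest one.
    """
    scores = list(object_scores) + list(fragment_scores)
    if not scores:
        raise ValueError("at least one performance value is required")
    if mode == "mean":
        return mean(scores)
    if mode == "min":
        return min(scores)
    raise ValueError("mode must be 'mean' or 'min'")
```

Using the minimum keeps every already selected object and fragment at or above the new reference, while the average is stricter and would favor higher-performing fragments in later rounds.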

In this way, the learning processing unit 810 can at least maintain the performance of the objects and background fragments already selected, and repeated machine learning can raise the average performance of the window regions selected as background fragments, improving image recognition and accuracy in the long term.

The learning processing unit 810 may evaluate the performance for a training image and adjust m and n in consideration of that performance.

This allows the learning processing unit 810 to determine optimized values of m and n for selecting background fragments for each training image, whose resolution and contrast vary with the photographed area, weather, time of day, and so on.

When an actual image is input, the recognition processing unit 820 may recognize information using the objects and background fragments recorded in the machine learning DB 830.

Hereinafter, the workflow of the machine learning-based image recognition system 200 according to embodiments of the present invention is described in detail with reference to FIG. 9.

FIG. 9 is a flowchart illustrating the sequence of a machine learning-based image recognition method according to an embodiment of the present invention.

The machine learning-based image recognition method according to this embodiment may be performed by the machine learning-based image recognition system 200 described above.

Referring to FIG. 9, in step 910 the image recognition system 200 records, in the machine learning DB, a background fragment selected by processing a training image.

That is, the image recognition system 200 may extract, from the background region of the training image, at least one window region that is a candidate for a background fragment, and select, from the extracted window regions, those with high calculated recognition accuracy as background fragments that can serve as objects.

For example, referring to FIGS. 3A and 3B, the image recognition system 200 may extract from the training image 320 shown in FIG. 3B four rectangular window regions (Window 1, Window 2, Window 3, Window 4) having the plurality of parameters shown in FIG. 3A.

In addition, the image recognition system 200 may extract a window region so that it contains a part of the background region ('lunar surface') whose brightness differs from the rest of the background because of sunlight or the like, or a part whose shading differs, such as a crater (pit). It may then use the previously extracted window region to extract the remaining window regions according to their positional or distance relationships with other window regions.

Also, once all of the predetermined number ('four') of window regions have been extracted from the background region of the training image, the image recognition system 200 may select each window region as a background fragment if the accuracy calculated for it exceeds the minimum reference value.

Here, the accuracy may be calculated according to a formula based on any one of YOLO, SSD, and Faster R-CNN, which are deep learning-based fast object detection techniques.

The image recognition system 200 may record in the machine learning DB 220, in association with each selected background fragment, at least one of its parameters, size, shape, position in the image, orientation, and relationship to objects or other background fragments.

Similarly, the image recognition system 200 may record in the machine learning DB 220, in association with each object extracted by processing the training image, object data including at least one of its size, shape, position in the image, orientation, and relationship to other objects.

In step 920, the image recognition system 200 checks whether an actual image is input. If no actual image is input, step 920 is repeated to wait for one.

When an actual image is input, in step 930 the image recognition system 200 determines whether the background fragment is included in the actual image.

When the background fragment is included in the actual image, in step 940 the image recognition system 200 uses the data on the background fragment to generate information recognized from the actual image and outputs it on the screen.

For example, referring to FIG. 5, the image recognition system 200 may determine whether a background fragment is present in the input actual image 510 and, if so, retrieve the data 520 recorded for that background fragment from the machine learning DB 220 and output it on the screen.

As a result, even when no object is extracted from the input actual image, the image recognition system 200 can recognize navigation information such as position and orientation using the learned background fragments.

In addition, the image recognition system 200 may use the relationship information between objects and background fragments to calculate the position of each object more accurately and, on that basis, obtain camera motion information.

The image recognition system 200 may also combine the data on a background fragment with the object data on an object ('airplane') and output the result on the screen as the information recognized in relation to the actual image.
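The lookup and combination performed in steps 930 and 940 can be sketched with an in-memory dictionary standing in for the machine learning DB. The record layout, the IDs, and the output wording are all assumptions made for illustration, and the detector that produces `detected_ids` from the actual image is outside the scope of this sketch.

```python
# Hypothetical stand-in for the machine learning DB.
ML_DB = {
    "BG-001": {"kind": "background", "name": "Northern region, Zone B"},
    "OBJ-001": {"kind": "object", "name": "airplane"},
}


def recognize(detected_ids, db):
    """Build recognition information from the IDs detected in an image.

    Returns None when no recorded background fragment is present
    (the check of step 930 fails); otherwise returns a text that
    combines object data with background-fragment data (step 940).
    """
    records = [db[i] for i in detected_ids if i in db]
    fragments = [r["name"] for r in records if r["kind"] == "background"]
    objects = [r["name"] for r in records if r["kind"] == "object"]
    if not fragments:
        return None
    if objects:
        return f"{', '.join(objects)} near {', '.join(fragments)}"
    return ", ".join(fragments)
```

When only a background fragment is detected, the function still returns its recorded name, which mirrors the case where no object is extracted but position information is available from the background.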

이와 같이, 본 발명에 따르면, 학습용 영상에서 객체의 역할을 할 수 있는 유익한 배경조각을 선정해 기계학습 하고 이를 객체와 조합하여, 입력된 실제 영상에 대한 인식 성능을 향상시킬 수 있다.As described above, according to the present invention, it is possible to improve the recognition performance of the input real image by selecting and machine learning beneficial background pieces that can act as objects in the learning image and combining them with the object.

본 발명의 실시예에 따른 방법은 다양한 컴퓨터 수단을 통하여 수행될 수 있는 프로그램 명령 형태로 구현되어 컴퓨터 판독 가능 매체에 기록될 수 있다. 상기 컴퓨터 판독 가능 매체는 프로그램 명령, 데이터 파일, 데이터 구조 등을 단독으로 또는 조합하여 포함할 수 있다. 상기 매체에 기록되는 프로그램 명령은 실시예를 위하여 특별히 설계되고 구성된 것들이거나 컴퓨터 소프트웨어 당업자에게 공지되어 사용 가능한 것일 수도 있다. 컴퓨터 판독 가능 기록 매체의 예에는 하드 디스크, 플로피 디스크 및 자기 테이프와 같은 자기 매체(magnetic media), CD-ROM, DVD와 같은 광기록 매체(optical media), 플롭티컬 디스크(floptical disk)와 같은 자기-광 매체(magneto-optical media), 및 롬(ROM), 램(RAM), 플래시 메모리 등과 같은 프로그램 명령을 저장하고 수행하도록 특별히 구성된 하드웨어 장치가 포함된다. 프로그램 명령의 예에는 컴파일러에 의해 만들어지는 것과 같은 기계어 코드뿐만 아니라 인터프리터 등을 사용해서 컴퓨터에 의해서 실행될 수 있는 고급 언어 코드를 포함한다. 상기된 하드웨어 장치는 실시예의 동작을 수행하기 위해 하나 이상의 소프트웨어 모듈로서 작동하도록 구성될 수 있으며, 그 역도 마찬가지이다.Method according to an embodiment of the present invention can be implemented in the form of program instructions that can be executed by various computer means may be recorded on a computer readable medium. The computer readable medium may include program instructions, data files, data structures, etc. alone or in combination. The program instructions recorded on the media may be those specially designed and constructed for the purposes of the embodiments, or they may be of the kind well-known and available to those having skill in the computer software arts. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape, optical media such as CD-ROMs, DVDs, and magnetic disks, such as floppy disks. Magneto-optical media, and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, flash memory, and the like. Examples of program instructions include not only machine code generated by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware device described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described above with reference to limited embodiments and drawings, those of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described systems, structures, devices, circuits, and the like are combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims also fall within the scope of the following claims.

200: image recognition system
210: learning processing unit
220: machine learning DB
230: recognition processing unit

Claims

A machine learning based image recognition method comprising:
recording, in a machine learning DB, data on a background fragment selected by processing a training image;
determining whether the background fragment is included in an input actual image; and
when the determination indicates that the background fragment is included, retrieving data corresponding to the background fragment from the machine learning DB and outputting, on a screen, recognition information related to the actual image by using the retrieved data.

The method of claim 1, further comprising:
extracting n window regions, each including at least a background region, from the training image, where n is a natural number of 1 or more; and
when the number of candidate regions, selected in consideration of the performance calculated for each of the n window regions, reaches m set by a user, where m is a natural number of 1 or more, selecting each of the m candidate regions as the background fragment.

The method of claim 2, further comprising:
increasing m when the average of the performance of the m candidate regions exceeds the minimum reference value by a threshold or more;
additionally selecting candidate regions, until the increased m is reached, from among the window regions ranked at the top when the n window regions are sorted in ascending order according to the performance; and
selecting the additionally selected candidate regions as the background fragment.

The method of claim 3, further comprising:
determining the minimum reference value in consideration of the performance calculated for an object in the machine learning DB; and
when the background fragment is recorded in the machine learning DB, adjusting the minimum reference value by further considering the performance calculated for the background fragment.

The method of claim 2, further comprising:
calculating the performance for each of the n window regions by computing at least one of a center point position error calculated at the middle or the end of a convolutional neural network layer based on a convolutional neural network, a rotation angle error of the window region about the center point, the presence or absence of an actual photographing area, a horizontal size error, a vertical size error, and a distance error between window regions;
selecting, as the candidate region, a window region whose performance satisfies a minimum reference value among the n window regions;
when the number of candidate regions does not reach m, deleting the window regions whose performance does not satisfy the minimum reference value and additionally extracting as many window regions as were deleted; and
repeating the calculating for the additionally extracted window regions.
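
The extract, score, delete, and re-extract loop recited above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the invention: the names `Window`, `extract_windows`, and `select_background_fragments`, the random window sampler, the caller-supplied `score` function, the `max_rounds` safety limit, and the choice to keep the m highest-scoring candidates are all hypothetical and do not appear in the source.

```python
import random
from dataclasses import dataclass


@dataclass
class Window:
    """One window region: center point, size, rotation angle, and its score."""
    cx: float
    cy: float
    width: float
    height: float
    angle: float
    performance: float = 0.0


def extract_windows(count, rng, image_size=(640, 480)):
    """Hypothetical extractor: samples `count` random windows over the image."""
    img_w, img_h = image_size
    return [
        Window(rng.uniform(0, img_w), rng.uniform(0, img_h),
               rng.uniform(16, 128), rng.uniform(16, 128), rng.uniform(0, 360))
        for _ in range(count)
    ]


def select_background_fragments(n, m, min_reference, score, rng, max_rounds=100):
    """Keep windows scoring at least `min_reference` until m candidates exist.

    `score` is a caller-supplied function (an assumption) that maps a Window
    to a performance value. If `max_rounds` is exhausted first, fewer than m
    candidates may be returned.
    """
    if not 1 <= m <= n:
        raise ValueError("expected 1 <= m <= n")
    windows = extract_windows(n, rng)
    for w in windows:
        w.performance = score(w)
    candidates = []
    for _ in range(max_rounds):
        candidates = [w for w in windows if w.performance >= min_reference]
        if len(candidates) >= m:
            break
        # Delete the failing windows and extract the same number of new ones.
        fresh = extract_windows(len(windows) - len(candidates), rng)
        for w in fresh:
            w.performance = score(w)
        windows = candidates + fresh
    # Assumption: when more than m windows pass, keep the m best-scoring ones.
    candidates.sort(key=lambda w: w.performance, reverse=True)
    return candidates[:m]
```

For example, `select_background_fragments(10, 5, 0.5, lambda w: w.width / 128, random.Random(0))` would return five windows whose toy score is at least 0.5; in practice the score would come from the error terms listed in the claim.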

The method of claim 5, wherein the calculating comprises:
calculating the performance so as to include at least one of a detection rate relating to the ratio of background fragments selected from the window regions extracted from the training image, and an accuracy of position identification using the window region selected as the background fragment.

The method of claim 5, further comprising:
sorting the performance of each of the n window regions in ascending order using verification data provided separately from the training image, and selecting the window regions having higher performance as the background fragment.

The method of claim 2, further comprising:
recognizing the performance for the training image; and
adjusting m and n in consideration of the recognized performance.

The method of claim 2, wherein the recording comprises:
recording, in the machine learning DB in correspondence with the background fragment, data on at least one of a unique ID, a name, a center point position, a horizontal size, a vertical size, a rotation angle of the window region, a performance, a distance to other background fragments, and an artificial neural network coefficient and structure after machine learning.
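
As an illustration of the kind of record listed above, the fields could be stored and retrieved as follows. This is a sketch under assumptions, not the machine learning DB of the invention: the use of an in-memory SQLite database, the table and column names, and the JSON encoding of the distances and of the network structure are all hypothetical.

```python
import json
import sqlite3

# Hypothetical schema covering the fields named in the claim.
SCHEMA = """
CREATE TABLE background_fragment (
    fragment_id   TEXT PRIMARY KEY,
    name          TEXT,
    center_x      REAL,
    center_y      REAL,
    width         REAL,
    height        REAL,
    rotation_deg  REAL,
    performance   REAL,
    distances     TEXT,  -- JSON: other fragment ID -> distance
    network       TEXT   -- JSON: network structure and coefficients
)
"""


def open_db():
    """Create an in-memory database holding the fragment table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def record_fragment(conn, frag):
    """Insert one fragment record given as a plain dict."""
    conn.execute(
        "INSERT INTO background_fragment VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (frag["fragment_id"], frag["name"], frag["center_x"], frag["center_y"],
         frag["width"], frag["height"], frag["rotation_deg"],
         frag["performance"], json.dumps(frag["distances"]),
         json.dumps(frag["network"])),
    )
    conn.commit()


def retrieve_fragment(conn, fragment_id):
    """Return the stored record as a dict, or None if the ID is unknown."""
    cur = conn.execute(
        "SELECT * FROM background_fragment WHERE fragment_id = ?",
        (fragment_id,),
    )
    row = cur.fetchone()
    if row is None:
        return None
    keys = [col[0] for col in cur.description]
    frag = dict(zip(keys, row))
    frag["distances"] = json.loads(frag["distances"])
    frag["network"] = json.loads(frag["network"])
    return frag
```

At recognition time, a lookup such as `retrieve_fragment(conn, "bg-001")` would play the role of retrieving the data that corresponds to a background fragment found in the actual image.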

The method of claim 2, wherein extracting the window regions comprises:
dividing the background region into a grid structure; and
setting at least one parameter among a center point position, a horizontal size, a vertical size, and a rotation angle of the n window regions to be extracted from the background region divided into the grid structure.
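
The grid division above can be illustrated with a short sketch. It assumes an axis-aligned rectangular background region and one window per grid cell per rotation angle; the function name `grid_window_params` and the dictionary layout are hypothetical and are not taken from the source.

```python
def grid_window_params(region, rows, cols, angles=(0.0,)):
    """Divide a background region into a rows x cols grid of window parameters.

    `region` is (x, y, width, height) in pixels. Each grid cell yields one
    window per rotation angle, centered on the cell and sized to fill it.
    """
    x0, y0, width, height = region
    cell_w = width / cols
    cell_h = height / rows
    params = []
    for r in range(rows):
        for c in range(cols):
            for angle in angles:
                params.append({
                    "center": (x0 + (c + 0.5) * cell_w,
                               y0 + (r + 0.5) * cell_h),
                    "width": cell_w,
                    "height": cell_h,
                    "angle": angle,
                })
    return params
```

With a 100 x 50 pixel region split into 2 rows and 4 columns, this yields n = 8 windows of 25 x 25 pixels; passing several angles multiplies n accordingly.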

The method of claim 1, wherein the outputting comprises:
generating and outputting, as the recognition information, a movement of the photographing camera or of an object, based on data on at least one of a position, a size, a direction, a center point, and a distance to other background fragments with respect to the background fragment.

The method of claim 1, wherein, when an object is extracted from the actual image, the outputting comprises:
generating and outputting the recognition information by combining object data recorded for the object with the data on the background fragment.

A machine learning based image recognition system comprising:
a learning processing unit that records, in a machine learning DB, data on a background fragment selected by processing a training image; and
a recognition processing unit that determines whether the background fragment is included in an input actual image and, when the determination indicates that the background fragment is included, retrieves data corresponding to the background fragment from the machine learning DB and outputs, on a screen, recognition information related to the actual image by using the retrieved data.

The system of claim 13, wherein the learning processing unit:
extracts n window regions, each including at least a background region, from the training image, where n is a natural number of 1 or more; and
when the number of candidate regions, selected in consideration of the performance calculated for each of the n window regions, reaches m set by a user, where m is a natural number of 1 or more, selects each of the m candidate regions as the background fragment.

The system of claim 14, wherein the learning processing unit:
increases m when the value obtained by averaging the performance of the m candidate regions is greater than or equal to the threshold value;
additionally selects candidate regions, until the increased m is reached, from among the window regions ranked at the top when the n window regions are sorted in ascending order according to the performance; and
selects the additionally selected candidate regions as the background fragment.

The system of claim 15, wherein the learning processing unit:
determines the minimum reference value in consideration of the performance calculated for an object in the machine learning DB and, when the background fragment is recorded in the machine learning DB, adjusts the minimum reference value by further considering the performance calculated for the background fragment.

The system of claim 14, wherein the learning processing unit:
calculates the performance for each of the n window regions by computing at least one of a center point position error calculated at the middle or the end of a convolutional neural network layer based on a convolutional neural network, a rotation angle error of the window region about the center point, the presence or absence of an actual photographing area, a horizontal size error, a vertical size error, and a distance error between window regions;
selects, as the candidate region, a window region whose performance satisfies a minimum reference value among the n window regions;
when the number of candidate regions does not reach m, deletes the window regions whose performance does not satisfy the minimum reference value and additionally extracts as many window regions as were deleted; and
calculates the performance for the additionally extracted window regions.

The system of claim 13, wherein the recognition processing unit:
generates and outputs, as the recognition information, a movement of the photographing camera or of an object, based on data on at least one of a position, a size, a direction, a center point, and a distance to other background fragments with respect to the background fragment.