KR101900185B1

KR101900185B1 - Method and Device for Learning Image for Object Recognition

Info

Publication number: KR101900185B1
Application number: KR1020170022489A
Authority: KR
Inventors: 이철희
Original assignee: 연세대학교 산학협력단
Priority date: 2017-02-20
Filing date: 2017-02-20
Publication date: 2018-09-18
Also published as: KR20180096164A

Abstract

An image learning method and apparatus for object recognition are disclosed. The disclosed apparatus includes an image input unit that receives an input image to be learned, and a learning unit that receives and learns the input image and includes a filter unit that filters specific regions of the input image. The filter unit includes a plurality of filters and a switch unit that, when a preset condition is met, sets the filtering value of a specific filter among the plurality of filters to a preset value. The disclosed method and apparatus improve the recognition rate in object recognition using deep learning and allow an object to be recognized at a high recognition rate even when the target object is placed against various backgrounds.

Description

Image learning method and apparatus for object recognition

Embodiments of the present invention relate to an image learning method and, more particularly, to an image learning method for effectively recognizing an object in an image.

In recent years, deep learning has been applied to a wide range of recognition tasks, including face recognition, full-body recognition, posture recognition, speech recognition, object recognition, and data mining. In particular, research that combines deep learning networks with object recognition, which identifies a specific object in an image, is being actively pursued in various ways.

The greatest advantage of object recognition by deep learning is that, whereas researchers previously had to invest considerable effort in designing optimal features for recognition (SIFT, LBP, HOG, and so on), deep learning can learn natural features directly from data, so it depends less on expert knowledge of the field and is less restricted by the application area. In addition, the feature space used to classify data can generally be viewed as a manifold space that is not easily separable. Deep learning unfolds the manifold through multiple nonlinear transformations in a multi-layer structure, which makes the data easier to classify and contributes greatly to improved recognition performance.

Despite these advantages, object recognition using deep learning is still at an early stage and its recognition rate is not high. In particular, when a training image contains both a background and an object, the background information is learned along with the object, so the recognition rate drops markedly when the same object is presented against a different background.

FIG. 10 shows images in which the doll has been removed or replaced after a CNN was trained on a doll image.

Referring to FIG. 10, suppose a CNN (Convolutional Neural Network), one of the representative deep learning structures, is trained on image (a), which shows a doll, and is then asked to recognize image (b), in which the doll has been removed and only the background remains, and image (c), in which the doll has been replaced with a cup. Because the background information is similar, the CNN may still recognize images (b) and (c) as the doll.

The present invention proposes an image learning apparatus and method for object recognition that can improve the recognition rate.

The present invention also proposes an image learning apparatus and method for effectively recognizing a target object against its background.

According to one aspect of the present invention, there is provided an image learning apparatus for object recognition, including: an image input unit that receives an input image to be learned; and a learning unit that receives and learns the input image and includes a filter unit that filters specific regions of the input image, wherein the filter unit includes a plurality of filters and a switch unit that, when a preset condition is met, sets the filtering value of a specific filter among the plurality of filters to a preset value.

The preset value is '0'.

The filter unit further includes a controller that determines whether the region filtered by each of the plurality of filters is an object region or a background region, and the switch unit sets the filtering value of a filter to 0 when the controller determines that the filter's filtering region is a background region.

The controller uses basic information about the characteristics of the object to determine whether a filtering region is an object region or a background region.

The learning unit repeatedly performs a process of generating feature maps through filtering by the filter unit and a subsampling process that spatially aggregates the feature maps to reduce their size, and the switch unit sets the filtering value for the background region to 0 in each repeated filtering pass.

According to another aspect of the present invention, there is provided an image learning apparatus for object recognition, including: an image input unit that receives an input image to be learned; and a learning unit that receives and learns the input image and includes a filter unit that filters specific regions of the input image, wherein the filter unit includes a plurality of filters and a region weight applying unit that determines and applies a region-specific weight to the filtering value of each of the plurality of filters.

The filter unit further includes a controller that determines whether the region filtered by each of the plurality of filters is an object region or a background region, and the region weight applying unit determines the region-specific weight based on the controller's determination.

The region weight applying unit applies a relatively high weight when the filtering region is an object region and a relatively low weight when the filtering region is a background region.

The learning unit repeatedly performs a process of generating feature maps through filtering by the filter unit and a subsampling process that spatially aggregates the feature maps to reduce their size, and the region weight applying unit applies the region-specific weights set for each repeated filtering pass.

According to still another aspect of the present invention, there is provided an image learning method for object recognition, including: (a) receiving an input image to be learned; and (b) a learning step of receiving and learning the input image and filtering specific regions of the input image, wherein step (b) sets the filtering value produced by the filtering to a preset value when a preset condition is met.

According to still another aspect of the present invention, there is provided an image learning method for object recognition, including: (a) receiving an input image to be learned; and (b) a learning step of receiving and learning the input image and filtering specific regions of the input image, wherein step (b) includes determining and applying a region-specific weight to the filtering value produced by the filtering.

According to still another aspect of the present invention, there is provided an image learning apparatus for object recognition, including: an object extraction unit that extracts an object from a reference image; a background generation unit that generates a plurality of background images to be applied to the extracted object; an input image set generation unit that combines each of the plurality of background images with the extracted object to generate an input image set consisting of a plurality of input images of the same object; and a learning unit that learns the input images included in the input image set.

According to still another aspect of the present invention, there is provided an image learning method for object recognition, including: (a) extracting an object from a reference image; (b) generating a plurality of background images to be applied to the extracted object; (c) combining each of the plurality of background images with the extracted object to generate an input image set consisting of a plurality of input images of the same object; and (d) learning the input images included in the input image set.

According to the present invention, the recognition rate in object recognition using deep learning is improved, and an object can be recognized at a high recognition rate even when the target object is placed against various backgrounds.

FIG. 1 is a block diagram illustrating the configuration of an image learning apparatus for object recognition according to a first embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of an input image set generated according to the first embodiment of the present invention.
FIG. 3 is a block diagram illustrating the configuration of an image learning apparatus for object recognition according to a second embodiment of the present invention.
FIG. 4 is a diagram illustrating the operating structure of a filter unit in the image learning apparatus according to the second embodiment of the present invention.
FIG. 5 is a diagram illustrating the learning operation structure of a learning unit in the image learning apparatus for object recognition according to the second embodiment of the present invention.
FIG. 6 is a block diagram illustrating the configuration of an image learning apparatus for object recognition according to a third embodiment of the present invention.
FIG. 7 is a diagram illustrating the operating structure of a filter unit in the image learning apparatus according to the third embodiment of the present invention.
FIG. 8 is a flowchart illustrating the overall flow of an image learning method that combines the first and second embodiments of the present invention.
FIG. 9 is a flowchart illustrating the overall flow of an image learning method that combines the first and third embodiments of the present invention.
FIG. 10 is a diagram showing images in which a doll has been removed or replaced with another object after a CNN was trained on a doll image.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to particular embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. Like reference numerals are used for like elements in describing each drawing.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram illustrating the configuration of an image learning apparatus for object recognition according to a first embodiment of the present invention.

Referring to FIG. 1, the image learning apparatus for object recognition according to the first embodiment of the present invention includes an object extraction unit (100), a background generation unit (102), an input image set generation unit (104), an image input unit (106), and a learning unit (108).

The object extraction unit (100) separately extracts an object from a reference image to be used for learning. The reference image includes a background and an object, and the object extraction unit (100) extracts only the object from the image.

The object may be extracted from the reference image in any of various known ways. For example, an operator may specify the object region directly, or any of various known object extraction software packages may be used.

The background generation unit (102) generates a plurality of background images to be applied to the object extracted by the object extraction unit (100). The background images may be prepared in advance or generated dynamically.

According to an embodiment of the present invention, the background generation unit (102) may use a random generator to produce a background in which a random number is assigned to each pixel.
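The random-background idea can be sketched as follows. This is only an illustration under assumptions the patent does not state: a single-channel 8-bit image stored as a list of rows, and a hypothetical helper name `make_random_background`.

```python
import random


def make_random_background(height, width, seed=None):
    """Return a height x width grayscale image with an independent
    random integer in [0, 255] assigned to every pixel.

    The 8-bit grayscale format is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return [[rng.randint(0, 255) for _ in range(width)] for _ in range(height)]


# A small 4 x 6 background; the seed only makes the example repeatable.
background = make_random_background(4, 6, seed=0)
```

Passing a different seed for each call yields a different background, which is what the background generation unit needs in order to produce many backgrounds for one object.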

The input image set generation unit (104) generates an input image by combining the object extracted by the object extraction unit (100) with one of the plurality of backgrounds generated by the background generation unit (102). Each input image thus consists of the object and one background, and the input image set generation unit (104) produces an input image set made up of a plurality of input images that show the same object against different backgrounds. The number of input images in the set can be changed according to the settings.

Because a single object is combined with multiple backgrounds, the input images in the input image set share the same object and differ only in their backgrounds. The input image set may, of course, also include the reference image from which the object was extracted.
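A minimal sketch of building such an input image set is shown below. The binary object mask, the grayscale list-of-rows image format, and the function names `composite` and `make_input_image_set` are assumptions introduced for illustration; the patent does not specify how the compositing is carried out.

```python
import random


def composite(obj, mask, background):
    """Keep the object pixel where mask is 1, otherwise use the background pixel."""
    return [
        [o if m else b for o, m, b in zip(obj_row, mask_row, bg_row)]
        for obj_row, mask_row, bg_row in zip(obj, mask, background)
    ]


def make_input_image_set(obj, mask, num_backgrounds, seed=0):
    """Combine one extracted object with several random backgrounds."""
    rng = random.Random(seed)
    height, width = len(obj), len(obj[0])
    image_set = []
    for _ in range(num_backgrounds):
        background = [[rng.randint(0, 255) for _ in range(width)]
                      for _ in range(height)]
        image_set.append(composite(obj, mask, background))
    return image_set


# Toy 3 x 3 example: the object is the single centre pixel (value 200).
obj = [[0, 0, 0], [0, 200, 0], [0, 0, 0]]
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
image_set = make_input_image_set(obj, mask, num_backgrounds=3)
```

Every image in `image_set` carries the same object pixels, while the surrounding pixels come from a different random background each time.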

FIG. 2 is a diagram illustrating an example of an input image set generated according to the first embodiment of the present invention.

For convenience of explanation, FIG. 2 shows only two input images. The image on the right is an input image in which a background generated randomly by a random generator has been combined with the object.

The image input unit (106) feeds the input images generated by the input image set generation unit (104) to the learning unit (108). Training on an input image set in which one object is combined with multiple backgrounds improves recognition performance even when the object appears against various backgrounds.

The learning unit (108) performs learning on the input images. The learning unit (108) may include various types of neural networks, such as a DNN (Deep Neural Network) or a CNN (Convolutional Neural Network). The learning unit (108) learns the feature information of the input image through processes such as feature extraction by filtering and subsampling.

Meanwhile, the learning apparatus of FIG. 1 may be applied to only a part of an object. Those skilled in the art will understand that, when only a part of an object is learned against various backgrounds in this way, recognition can be made robust to occlusion and similar conditions.

FIG. 3 is a block diagram illustrating the configuration of an image learning apparatus for object recognition according to a second embodiment of the present invention.

Referring to FIG. 3, the image learning apparatus according to the second embodiment of the present invention includes an image input unit (300) and a learning unit (310), and the learning unit (310) includes a filter unit (312), a switch unit (314), a controller (316), and a subsampling unit (318).

Although not shown in FIG. 3, an input image set generated according to the first embodiment of FIG. 1 may be fed to the image input unit (300) of FIG. 3.

The learning unit (310) is a neural network learning network, such as a DNN or a CNN, that performs learning on the input image. In the present invention, it has a structure in which the switch unit (314) and the controller (316) are coupled to the filter unit (312).

In a deep learning network such as a CNN or a DNN, the filter unit (312) applies a plurality of filters to each region of the input image to generate feature maps. In a CNN, for example, a plurality of convolution filters are applied region by region across the image to generate a plurality of feature maps.

The subsampling unit (318) spatially aggregates the feature maps generated by the filter unit (312) to reduce their size, which allows features that are robust to changes in position to be extracted. The filtering by the filter unit (312) and the subsampling by the subsampling unit (318) are performed repeatedly.
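As an illustration of subsampling, the sketch below uses non-overlapping 2 x 2 max pooling. Max pooling is one common choice and is an assumption here; the patent only states that feature maps are spatially aggregated to reduce their size. The function name `subsample` is likewise hypothetical.

```python
def subsample(feature_map, size=2):
    """Non-overlapping max pooling over size x size windows.

    Rows or columns that do not fill a complete window are dropped.
    """
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[y * size + i][x * size + j]
                for i in range(size) for j in range(size))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = subsample(feature_map)  # 4 x 4 map reduced to 2 x 2: [[4, 2], [2, 8]]
```

Because each output value summarizes a small neighbourhood, shifting a feature by one pixel inside a window leaves the pooled value unchanged, which is the position robustness the paragraph above refers to.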

Through this repetition, the learning unit (310) can extract features at various levels, from low-level features such as points, lines, and surfaces to complex and meaningful high-level features.

FIG. 4 is a diagram illustrating the operating structure of the filter unit in the image learning apparatus according to the second embodiment of the present invention.

As described above, the filter unit (312) includes a plurality of filters, and FIG. 4 shows the operating structure for one of the filters that make up the filter unit.

Referring to FIG. 4, the filtering region of the filter (400) is smaller than the input image (410), and filtering is performed while the filter (400) is moved across the input image region by region. For example, the filtering may include an inner product between the pixel values of a specific region of the input image and the filter.
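The sliding inner product described above can be sketched in a few lines. The stride of 1, the absence of padding, and the name `filter_image` are assumptions made for this example.

```python
def filter_image(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and take the
    inner product of the kernel with each covered region."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, 1],
]
feature_map = filter_image(image, kernel)  # [[6, 8], [12, 14]]
```

Each output value corresponds to one filtering region; for instance, the top-left value is 1*1 + 2*0 + 4*0 + 5*1 = 6.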

According to a preferred embodiment of the present invention, a switch unit (314) is coupled to the filter, and the operation of the switch unit (314) is controlled by the determination result of the controller (316). The switch unit (314) is turned on or off depending on the filter's current filtering region.

When the filtering region of the filter falls in the background region of the input image, the switch unit is turned off so that the filtering value is forced to 0. When the filtering region falls in the object region of the input image, the switch unit is turned on so that normal filtering is performed.
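The switch behaviour can be sketched as a filter whose output is zeroed for background windows. Two assumptions are made that the patent leaves open: the controller's decision is given as a per-pixel binary `object_mask`, and a window counts as an object region if it overlaps at least one object pixel. The name `switched_filter` is hypothetical.

```python
def switched_filter(image, kernel, object_mask):
    """Sliding inner product with a per-window on/off switch.

    Switch on  (window overlaps the object): normal filtering.
    Switch off (window is pure background):  output forced to 0.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            # Assumed rule: any object pixel inside the window turns the switch on.
            switch_on = any(object_mask[y + i][x + j]
                            for i in range(kh) for j in range(kw))
            if switch_on:
                value = sum(image[y + i][x + j] * kernel[i][j]
                            for i in range(kh) for j in range(kw))
            else:
                value = 0
            row.append(value)
        output.append(row)
    return output


image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
object_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
kernel = [[1, 1], [1, 1]]
feature_map = switched_filter(image, kernel, object_mask)
# [[36, 20, 0], [20, 12, 0], [0, 0, 0]]
```

Without the switch, the background windows would each output 4; with it they contribute nothing to the feature map.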

The controller (316) may be supplied with basic information for determining whether the current filtering region of the input image is a background region or an object region. The controller (316) uses this information to make the determination and provides the information needed for on/off control of the switch unit (314).

For example, the basic information supplied to the controller (316) may be feature information of the object; more specifically, the color information of the object may be supplied. If the object is a banana, for instance, yellow, the color of the banana, may be input to the controller (316) as the basic information. In this case, when the filtering region is unrelated to yellow, the controller (316) determines that the region is a background region and turns the switch unit off so that the filtering output is forced to 0. When the filtering region is yellow, the controller determines that the region is an object region and turns the switch on so that normal filtering is performed.
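The color-based decision in the banana example might look like the sketch below. The RGB representation, the per-channel tolerance of 60, and the names `is_object_color` and `build_object_mask` are all assumptions chosen for illustration; the patent does not define how closeness to the reference color is measured.

```python
def is_object_color(pixel, reference, tolerance):
    """True if every RGB channel lies within `tolerance` of the reference color."""
    return all(abs(p - r) <= tolerance for p, r in zip(pixel, reference))


def build_object_mask(rgb_image, reference, tolerance=60):
    """Label each pixel 1 (object) or 0 (background) from color alone."""
    return [
        [1 if is_object_color(pixel, reference, tolerance) else 0 for pixel in row]
        for row in rgb_image
    ]


YELLOW = (255, 255, 0)  # assumed reference color for a banana
rgb_image = [
    [(250, 240, 20), (30, 30, 200)],   # yellowish, blue
    [(10, 120, 10), (255, 230, 40)],   # green, yellowish
]
object_mask = build_object_mask(rgb_image, YELLOW)  # [[1, 0], [0, 1]]
```

A mask produced this way could serve as the controller's output that drives the switch for each filtering region.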

It will be obvious to those skilled in the art that various kinds of information other than the color of the object can be supplied as the basic information, and that changing the basic information does not affect the spirit or scope of the present invention. For example, a manually prepared object region map may of course be used.

Furthermore, the controller (316) may determine whether the current filtering region is an object region or a background region by a separate decision algorithm rather than from basic information.

By applying a switch unit to the filter in this way, features of the background are kept out of learning. The object can therefore be learned efficiently, and errors in which an input image is recognized as the object because of the background that accompanied the object can be reduced.

FIG. 5 is a diagram illustrating the learning operation structure of the learning unit in the image learning apparatus for object recognition according to the second embodiment of the present invention.

Referring to FIG. 5, the learning unit (310) repeatedly performs filtering (500, 504) and subsampling (502, 506).

As shown in FIG. 5, when subsampling is performed after a feature map has been generated by filtering, the size of the feature map is reduced. The reduced feature map is filtered again, and this filtering generates another feature map.

The filter structure of FIG. 4 can be applied to the filters of every stage. That is, the switch unit of FIG. 4 can be applied not only in the initial filtering (500) of the input image but also in the filtering (504) of the feature map reduced by subsampling, so that the filtering value of the background region is forced to 0 at each stage.
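A multi-stage version can be sketched as below. The patent does not say how the object/background decision is carried over to the smaller feature maps; this sketch assumes the object mask is reduced alongside the feature map (a window or pooling cell is "object" if it contains any object cell), and uses max pooling for subsampling. All function names are hypothetical.

```python
def window_any(mask, kh, kw):
    """Mark each kh x kw window 1 if it overlaps at least one object cell."""
    out_h, out_w = len(mask) - kh + 1, len(mask[0]) - kw + 1
    return [
        [int(any(mask[y + i][x + j] for i in range(kh) for j in range(kw)))
         for x in range(out_w)]
        for y in range(out_h)
    ]


def switched_filter(fmap, kernel, window_mask):
    """Sliding inner product, forced to 0 wherever window_mask is 0."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [sum(fmap[y + i][x + j] * kernel[i][j]
             for i in range(kh) for j in range(kw)) if window_mask[y][x] else 0
         for x in range(len(window_mask[0]))]
        for y in range(len(window_mask))
    ]


def max_pool(grid, size=2):
    """Non-overlapping max pooling; on a 0/1 mask this acts as a logical OR."""
    out_h, out_w = len(grid) // size, len(grid[0]) // size
    return [
        [max(grid[y * size + i][x * size + j]
             for i in range(size) for j in range(size))
         for x in range(out_w)]
        for y in range(out_h)
    ]


def run_stages(image, object_mask, kernels):
    """Repeat (switched filtering, subsampling) once per kernel."""
    fmap, mask = image, object_mask
    for kernel in kernels:
        window_mask = window_any(mask, len(kernel), len(kernel[0]))
        fmap = switched_filter(fmap, kernel, window_mask)
        fmap, mask = max_pool(fmap), max_pool(window_mask)
    return fmap, mask


ones3 = [[1] * 3 for _ in range(3)]
image = [[1] * 10 for _ in range(10)]
# The object occupies the top-left 3 x 3 corner of a 10 x 10 image.
object_mask = [[1 if y < 3 and x < 3 else 0 for x in range(10)] for y in range(10)]
features, final_mask = run_stages(image, object_mask, [ones3, ones3])
background_only, _ = run_stages(image, [[0] * 10 for _ in range(10)], [ones3, ones3])
```

With these inputs the two stages shrink the 10 x 10 image to a 1 x 1 feature map. An image with no object region at all produces an all-zero output, so the background leaves no trace in the learned features.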

FIG. 6 is a block diagram illustrating the configuration of an image learning apparatus for object recognition according to a third embodiment of the present invention.

Referring to FIG. 6, the image learning apparatus for object recognition according to the third embodiment of the present invention includes an image input unit (600) and a learning unit (610), and the learning unit (610) includes a filter unit (612), a region weight applying unit (614), a controller (616), and a subsampling unit (618).

Although not shown in FIG. 6, an input image set generated according to the first embodiment of FIG. 1 may be fed to the image input unit (600) of FIG. 6.

The learning unit (610) is a deep learning network, such as a DNN or a CNN, that performs learning on the input image. In the third embodiment, it has a structure in which the region weight applying unit (614) and the controller (616) are coupled to the filter unit (612).

The second embodiment of FIG. 3 couples the switch unit (314) to the filter unit (312), whereas the third embodiment differs in that the region weight applying unit (614) is coupled to the filter unit (612).

FIG. 7 is a diagram illustrating the operating structure of the filter unit in the image learning apparatus according to the third embodiment of the present invention.

The filter unit (612) includes a plurality of filters, and FIG. 7 shows the operating structure for one of the filters that make up the filter unit.

Referring to FIG. 7, the filtering region of the filter (700) is smaller than the input image (710), and filtering is performed while the filter (700) is moved across the input image region by region. For example, the filtering may include an inner product between the pixel values of a specific region of the input image and the filter.

According to a preferred embodiment of the present invention, a region weight applying unit (614) is coupled to the filter, and the region-specific weight it applies is based on the determination result of the controller (616). The region weight applying unit (614) applies the region-specific weight to the output value of the filter. It applies a weight for each filtering region, so that a high weight is applied in some regions and a low weight in others. The regions that receive a low weight correspond to the background of the input image, and the regions that receive a high weight correspond to the object in the input image.

When the filtering region of the filter corresponds to a background region of the input image, the region-specific weight applying unit 614 applies a relatively low weight to the filtering value based on the determination of the controller 616. When the filtering region corresponds to an object region of the input image, the region-specific weight applying unit 614 applies a relatively high weight to the filtering value based on the determination of the controller 616.
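A minimal sketch of the weight applying unit's behavior, assuming the controller's decision is already available as a boolean per filter position. The function name `apply_region_weight` and the weight values 1.0 and 0.25 are hypothetical; the text only requires that the object weight be relatively high and the background weight relatively low.

```python
def apply_region_weight(filter_output, is_object, object_weight=1.0, background_weight=0.25):
    """Scale one filter output by the weight chosen for its region."""
    # Assumed weights: any pair with object_weight > background_weight fits the description.
    weight = object_weight if is_object else background_weight
    return weight * filter_output


# Hypothetical filter outputs and controller decisions (True = object region).
filter_outputs = [[6.0, 8.0], [12.0, 14.0]]
object_decisions = [[False, True], [False, True]]

weighted = [
    [apply_region_weight(value, decision) for value, decision in zip(value_row, decision_row)]
    for value_row, decision_row in zip(filter_outputs, object_decisions)
]
print(weighted)  # [[1.5, 8.0], [3.0, 14.0]]
```

Responses from background positions are attenuated rather than discarded, so they still contribute to later stages, only less strongly than object responses.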

The controller 616 may be provided with basic information for determining whether the current filtering region of the input image is a background region or an object region, and the controller 616 makes the determination using the provided basic information.

The basic information provided to the controller 616 may include feature information of the object, as in the second embodiment, and more specifically may include color information of the object. Also, as described above, the controller 616 may instead determine whether a region is an object region or a background region by a separate determination algorithm.
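One way a controller could use color information is sketched below. The decision rule is an assumption made for illustration: the text states only that object color information may be provided, not how it is compared. The names `is_object_region`, `tolerance`, and `min_fraction`, and all pixel values, are hypothetical.

```python
def is_object_region(patch, object_color, tolerance=30, min_fraction=0.5):
    """Return True when enough pixels of the patch are close to the object color.

    patch is a 2-D list of (r, g, b) tuples. A pixel counts as a match when every
    channel is within `tolerance` of the object color (assumed rule).
    """
    pixels = [px for row in patch for px in row]
    matches = sum(
        1 for px in pixels
        if all(abs(c - o) <= tolerance for c, o in zip(px, object_color))
    )
    return matches / len(pixels) >= min_fraction


OBJECT_COLOR = (200, 30, 30)  # hypothetical basic information: a reddish object

object_patch = [
    [(210, 40, 25), (190, 20, 35)],
    [(205, 30, 30), (10, 10, 10)],
]
background_patch = [
    [(10, 10, 10), (20, 200, 20)],
    [(0, 0, 255), (200, 30, 30)],
]

print(is_object_region(object_patch, OBJECT_COLOR))      # True  (3 of 4 pixels match)
print(is_object_region(background_patch, OBJECT_COLOR))  # False (1 of 4 pixels match)
```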

By coupling the region-specific weight applying unit 614 to the filter in this way, learning can be performed so that features are not detected for the background, and the object can therefore be learned efficiently.

Meanwhile, the region-specific weight applying unit 614 and the controller 616 applied to the filter may be applied to the filters of the filter units of every stage in the neural network, as in the second embodiment shown in FIG. 5. In short, region-specific weight adjustment as in FIG. 6 may be performed not only in the initial filtering of the input image but also in the filtering of feature maps reduced by subsampling, so that the filtering values of the background region are adjusted to a lower level than those of the object region.

FIG. 8 is a flowchart showing the overall flow of an image learning method in which the first and second embodiments of the present invention are combined.

The first and second embodiments are separate methods but can also be carried out in combination, so FIG. 8 shows a flowchart of the combined method. However, the flowchart of FIG. 8 should not be interpreted to mean that the first and second embodiments must be combined.

Referring to FIG. 8, an object is extracted from a reference image (step 800).

Once the object is extracted from the reference image, a plurality of background images to be combined with the extracted object are generated (step 802). As described above, a given background image may be one generated by applying a random generator.

Each of the generated backgrounds is combined with the extracted object to generate an input image set composed of a plurality of input images (step 804). This step (step 304) may be omitted.
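Background generation and compositing can be sketched as follows, assuming grayscale images stored as 2-D lists and an object mask that marks which pixels belong to the extracted object. The mask representation, the function names, and the fixed seed are assumptions for illustration; the random background follows the idea of assigning a random value to each pixel.

```python
import random


def random_background(height, width, rng):
    """Grayscale background with an independent random value (0-255) per pixel."""
    return [[rng.randrange(256) for _ in range(width)] for _ in range(height)]


def composite(obj, mask, background):
    """Keep object pixels where the mask is 1 and background pixels elsewhere."""
    return [
        [o if m else b for o, m, b in zip(obj_row, mask_row, bg_row)]
        for obj_row, mask_row, bg_row in zip(obj, mask, background)
    ]


def make_input_set(obj, mask, count, seed=0):
    """Combine one extracted object with `count` random backgrounds."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    height, width = len(obj), len(obj[0])
    return [composite(obj, mask, random_background(height, width, rng)) for _ in range(count)]


# Hypothetical extracted object (a bright cross) and its mask.
obj = [
    [0, 255, 0],
    [255, 255, 255],
    [0, 255, 0],
]
mask = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
input_set = make_input_set(obj, mask, count=4)
print(len(input_set))  # 4 input images containing the same object
```

Every image in the set contains the same object pixels, so only the background varies from one training image to the next.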

When a specific input image from the input image set is input, the learning network determines whether the region to be filtered is a background region or an object region (step 806). As described above, this determination may be made using basic information provided in advance, or a separate determination algorithm may be used.

If the region currently being filtered corresponds to a background region, the value of the filtering result is forcibly set to 0 (step 808). That is, switching is applied to the filtering result so that its value is forced to 0.

If the region currently being filtered corresponds to an object region, normal filtering is performed (step 810).
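Steps 806 to 810 can be sketched for a single filter as follows. The region test used here (a window is background when it contains no pixel marked as object) is an assumption; the text leaves the determination to basic information or a separate algorithm. The name `filter_with_switch` and the sample data are hypothetical.

```python
def filter_with_switch(image, kernel, object_mask):
    """Filter an image, forcing the output to 0 wherever the window is background."""
    kh, kw = len(kernel), len(kernel[0])
    output = []
    for top in range(len(image) - kh + 1):
        row = []
        for left in range(len(image[0]) - kw + 1):
            window = [(top + i, left + j) for i in range(kh) for j in range(kw)]
            # Step 806 (assumed rule): background if no pixel in the window is object.
            is_background = not any(object_mask[r][c] for r, c in window)
            if is_background:
                row.append(0)  # step 808: switch forces the result to 0
            else:
                # Step 810: normal filtering (inner product of window and kernel).
                row.append(sum(image[r][c] * kernel[r - top][c - left] for r, c in window))
        output.append(row)
    return output


# Hypothetical 4x4 image whose object occupies the top-left 2x2 block.
image = [
    [5, 5, 9, 9],
    [5, 5, 9, 9],
    [9, 9, 9, 9],
    [9, 9, 9, 9],
]
object_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
kernel = [
    [1, 1],
    [1, 1],
]
result = filter_with_switch(image, kernel, object_mask)
print(result)  # [[20, 28, 0], [28, 32, 0], [0, 0, 0]]
```

Background pixels have the largest values in this example, yet every window lying entirely in the background yields 0.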

Although not shown in FIG. 8, the deep learning network repeats filtering and subsampling, so the switch-based filtering may be applied to the filters of every stage.

FIG. 9 is a flowchart showing the overall flow of an image learning method in which the first and third embodiments of the present invention are combined.

The first and third embodiments are likewise separate methods that can also be carried out in combination, so FIG. 9 shows a flowchart of the combined method. However, the flowchart of FIG. 9 likewise should not be interpreted to mean that the first embodiment and the second embodiment must be combined.

Referring to FIG. 9, an object is extracted from a reference image (step 900).

Once the object is extracted from the reference image, a plurality of background images to be combined with the extracted object are generated (step 902). As described above, a given background image may be one generated by applying a random generator.

Each of the generated backgrounds is combined with the extracted object to generate an input image set composed of a plurality of input images (step 904). Steps 902 and 904 may be omitted.

When a specific input image from the input image set is input, the learning network determines whether the region to be filtered is a background region or an object region (step 906). As described above, this determination may be made using basic information provided in advance, or a separate region identification algorithm may be used.

If the region currently being filtered corresponds to a background region, a relatively low weight is applied to the filtering result (step 908).

If the region currently being filtered corresponds to an object region, a relatively high weight is applied (step 910).
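Steps 906 to 910 can be summarized in one expression. The notation is introduced here for illustration and does not appear in the source: $x_r$ is the patch of the input at filtering region $r$, $f$ is the filter, $\langle\cdot,\cdot\rangle$ is the inner product, and $w_{\mathrm{obj}} > w_{\mathrm{bg}} \ge 0$ are assumed weight values.

```latex
y(r) = w(r)\,\langle x_r, f \rangle,
\qquad
w(r) =
\begin{cases}
  w_{\mathrm{obj}} & \text{if } r \text{ is an object region (step 910)},\\
  w_{\mathrm{bg}}  & \text{if } r \text{ is a background region (step 908)}.
\end{cases}
```

In this notation, choosing $w_{\mathrm{bg}} = 0$ and $w_{\mathrm{obj}} = 1$ gives the same outputs as the switching of steps 808 and 810 in FIG. 8.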

As described above, the present invention has been explained with reference to specific details such as particular components, and to limited embodiments and drawings, but these are provided only to aid overall understanding of the invention. The present invention is not limited to the above embodiments, and those of ordinary skill in the art to which the invention pertains can make various modifications and variations from this description. Accordingly, the spirit of the present invention should not be confined to the described embodiments, and not only the claims set out below but also everything equal or equivalent to those claims falls within the scope of the spirit of the invention.

Claims

An image learning apparatus for object recognition, comprising:
an image input unit that receives an input image to be learned; and
a learning unit that receives and learns the input image and includes a filter unit that performs filtering on specific regions of the input image,
wherein the filter unit comprises:
a plurality of filters;
a controller that determines whether the region to be filtered by each of the plurality of filters is an object region or a background region; and
a switch unit that sets the filtering value of the corresponding filter to 0 when the controller determines that the filtering region of a specific filter is a background region.

delete

The apparatus according to claim 1,
wherein the controller determines whether the filtering region is an object region or a background region using basic information on a feature of the object.

The apparatus according to claim 1,
wherein the learning unit repeatedly performs a process of generating feature maps through filtering by the filter unit and a subsampling process of reducing the size of the feature maps by spatially integrating them, and the switch unit sets the filtering value to 0 in the repeatedly performed filtering.

An image learning apparatus for object recognition, comprising:
an image input unit that receives an input image to be learned; and
a learning unit that receives and learns the input image and includes a filter unit that performs filtering on specific regions of the input image,
wherein the filter unit comprises:
a plurality of filters;
a controller that determines whether the region filtered by each of the plurality of filters is an object region or a background region; and
a region-specific weight applying unit that, based on the determination result of the controller, applies a relatively high weight when the filtered region is an object region and a relatively low weight when the filtered region is a background region.

delete

The apparatus according to claim 6,
wherein the controller determines whether the filtering region is an object region or a background region using basic information on a feature of the object.

The apparatus according to claim 6,
wherein the learning unit repeatedly performs a process of generating feature maps through filtering by the filter unit and a subsampling process of reducing the size of the feature maps by spatially integrating them, and the region-specific weight applying unit applies the set region-specific weight in the repeatedly performed filtering.

(a) receiving an input image to be learned; and
(b) a learning step of receiving and learning the input image and performing filtering on specific regions of the input image,
wherein step (b) comprises: determining whether the filtering region is an object region or a background region; and setting the filtering value to 0 if the filtering region is determined to be a background region.

delete

(a) receiving an input image to be learned; and
(b) a learning step of receiving and learning the input image and performing filtering on specific regions of the input image,
wherein step (b) comprises: determining whether the filtering region is an object region or a background region; and applying a relatively high weight when the filtered region is an object region and a relatively low weight when the filtered region is a background region.

delete

An object extraction unit that extracts an object from a reference image;
a background generation unit that generates a plurality of background images to be applied to the extracted object;
an input image generation unit that generates an input image set composed of a plurality of input images for the same object by combining the plurality of background images with the extracted object; and
a learning unit that learns the input images included in the input image set.

17. The apparatus of claim 16,
wherein at least one of the plurality of background images generated by the background generation unit is a background image in which a random number is assigned to each pixel by a random generator.

(a) extracting an object from a reference image;
(b) generating a plurality of background images to be applied to the extracted object;
(c) generating an input image set composed of a plurality of input images for the same object by combining the plurality of background images with the extracted object; and
(d) learning the input images included in the input image set.