KR102199912B1

KR102199912B1 - Data Augmentation based Robust Object Recognition Method and System

Info

Publication number: KR102199912B1
Application number: KR1020180070055A
Authority: KR
Inventors: 조충상; 정혜동; 이영한; 고상기; 김보은
Original assignee: 한국전자기술연구원
Priority date: 2018-06-19
Filing date: 2018-06-19
Publication date: 2021-01-08
Also published as: KR20190142856A

Abstract

A data augmentation method and system based on rotating training images, which can contribute substantially to the object recognition performance of a deep learning network, are provided. A training data augmentation method according to an embodiment of the present invention includes: selecting a rotation angle by which to rotate a training image; rotating the training image by the selected rotation angle; and generating a BB (Bounding Box) for the rotated training image within a range determined using the BB of the object to be recognized, as rotated by the selected rotation angle.
As a result, rotation augmentation enlarges the training data, so that the object recognition performance of the deep learning network can be greatly improved with limited training data and without increasing the complexity of the network.

Description

Data Augmentation based Robust Object Recognition Method and System

The present invention relates to object recognition technology, and more particularly, to a method and system capable of performing robust object recognition with a deep learning network through training data augmentation.

For deep learning-based object recognition, which emerged to overcome the limitations of conventional image-based object recognition, complexity is a very important factor: object recognition performance is tied to the complexity of the deep learning network.

Accordingly, the mainstream approach is to improve object recognition performance by increasing the complexity of the deep learning network, but increased complexity causes problems in terms of resources and speed.

Training data augmentation can be considered as a way to improve object recognition performance without increasing the complexity of the deep learning network. The idea is to expand limited training data into a larger training set and train the deep learning network on it.

However, training data augmentation also has its limits. For example, training data augmented by rotating training images does not contribute much to improving the object recognition performance of a deep learning network.

Accordingly, an improved rotation augmentation technique for training images is needed.

The present invention was conceived to solve the above problem, and an object of the present invention is to provide a data augmentation method and system based on rotating training images that can contribute substantially to improving the object recognition performance of a deep learning network.

According to an embodiment of the present invention for achieving the above object, a training data augmentation method includes: selecting a rotation angle by which to rotate a training image; rotating the training image by the rotation angle selected in the selecting step; and generating a BB for the rotated training image within a range determined using the BB (Bounding Box) of the object to be recognized, as rotated by the selected rotation angle.

The generating step may generate, as the BB of the rotated training image, a third box located between a first box inscribed in the BB rotated by the rotation angle and a second box circumscribing the rotated BB.

The horizontal and vertical sides of the first box, the second box, and the third box may be parallel to the horizontal and vertical sides, respectively, of the BB before rotation.

The position of the third box may be determined by the ratio of the distance between the first box and the third box to the distance between the third box and the second box.

The ratio may vary according to the selected rotation angle.

The selecting step may select the rotation angle randomly according to a Gaussian distribution with a mean of 0°.

The rotating step may include: moving the center of the training image to the origin; rotating the training image about the origin by the selected rotation angle; and moving the center of the rotated training image back to its original position.

The training data augmentation method according to the present invention may further include augmenting the training image, and the rotating step may rotate the training image augmented in the augmenting step.

The augmenting step may augment the training image through at least one of zooming, applying noise, and translating the training image.

According to another embodiment of the present invention, a training data augmentation system includes: an input unit that receives a training image; and a processor that selects a rotation angle by which to rotate the training image, rotates the training image by the selected rotation angle, and generates a BB for the rotated training image within a range determined using the BB (Bounding Box) of the object to be recognized, as rotated by the selected rotation angle.

According to another embodiment of the present invention, a training method includes: selecting a rotation angle by which to rotate a training image; rotating the training image by the rotation angle selected in the selecting step; generating a BB for the rotated training image within a range determined using the BB (Bounding Box) of the object to be recognized, as rotated by the selected rotation angle; and training a deep learning network using the rotated training image and the generated BB.

According to another embodiment of the present invention, a training system includes: an input unit that receives a training image; and a processor that selects a rotation angle by which to rotate the training image, rotates the training image by the selected rotation angle, generates a BB for the rotated training image within a range determined using the BB (Bounding Box) of the object to be recognized, as rotated by the selected rotation angle, and trains a deep learning network using the rotated training image and the generated BB.

As described above, according to embodiments of the present invention, rotation augmentation enlarges the training data, so that the object recognition performance of a deep learning network can be greatly improved with limited training data and without increasing the complexity of the network.

FIG. 1 is a diagram provided to explain a training data augmentation method according to an embodiment of the present invention;
FIG. 2 is a diagram provided for a detailed description of the rotation augmentation process shown in FIG. 1;
FIGS. 3 to 5 are diagrams provided for a detailed description of a method of generating a BB for the object to be recognized in a rotation-augmented training image;
FIGS. 6 and 7 are diagrams illustrating results of applying the rotation augmentation method according to an embodiment of the present invention;
FIG. 8 is a table showing VOC2007 test results for the existing method and the method according to an embodiment of the present invention; and
FIG. 9 is a block diagram of a training data augmentation system according to another embodiment of the present invention.

Hereinafter, the present invention is described in more detail with reference to the drawings.

FIG. 1 is a diagram provided to explain a training data augmentation method according to an embodiment of the present invention. The method augments training data and uses it to train a deep learning network for object recognition.

Accordingly, the object recognition performance of the deep learning network can be improved without increasing its complexity.

The training data augmentation method according to an embodiment of the present invention also performs rotation-based augmentation. To increase the training benefit of rotation augmentation, it generates an optimal BB (Bounding Box) for the rotated training image and uses it for training.

To augment the training data, as shown in FIG. 1, the training images are first augmented (S110).

The training image augmentation in step S110 includes augmentation by techniques such as 1) zooming the training image (in/out), 2) applying noise to the training image, and 3) translating the training image up, down, left, or right.

The training image augmentation in step S110 does not include augmentation by rotating the training image; rotation-based augmentation is performed in the steps described below.

Next, for each training image augmented in step S110, it is determined whether to apply rotation augmentation (S120).

A training image for which it is decided in step S120 not to apply rotation augmentation (S120-N) is input to the deep learning network for object recognition and used for training (S160).

In contrast, a training image for which it is decided in step S120 to apply rotation augmentation goes through the rotation augmentation process (steps S130 to S150) and is then input to the deep learning network for object recognition and used for training (S160).

In other words, step S120 selects, from among the training images augmented in step S110, those to which rotation augmentation will be applied. The proportion of training images to which rotation augmentation is applied in step S120 is determined by a setting.

For example, if the proportion of training images to which rotation augmentation is applied is set to "30%", step S120 operates so that the probability of deciding to apply rotation augmentation is 30%.
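The per-image decision of step S120 can be sketched as an independent random draw for each image. This is only an illustration under stated assumptions: the function name `select_for_rotation`, the parameter `rotation_ratio`, and the use of Python's standard `random` module are hypothetical and do not come from the patent text.

```python
import random


def select_for_rotation(images, rotation_ratio=0.3, rng=None):
    """Split images into (to_rotate, passthrough).

    Each image is chosen for rotation independently with probability
    rotation_ratio (hypothetical name for the configured proportion).
    """
    if rng is None:
        rng = random.Random()
    to_rotate, passthrough = [], []
    for image in images:
        # rng.random() is uniform on [0, 1), so this test succeeds
        # with probability rotation_ratio.
        if rng.random() < rotation_ratio:
            to_rotate.append(image)
        else:
            passthrough.append(image)
    return to_rotate, passthrough
```

With this sketch the 30% figure holds only in expectation; a fixed-size selection of exactly 30% of the images would be a different design, and the text does not say which one is intended.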

For a training image for which it is decided in step S120 to apply rotation augmentation, the first step of rotation augmentation is to select a rotation angle (S130).

The rotation angle in step S130 is selected individually for each training image. That is, the rotation angles of the training images subject to rotation augmentation are determined independently of one another.

Then, the training image selected for rotation augmentation in step S120 is rotated by the rotation angle selected in step S130 to generate an additional training image, thereby augmenting the training images (S140).

Next, a new BB for the rotation-augmented training image is generated using the BB (Bounding Box) of the object to be recognized, which was rotated together with the training image in step S140 (S150).

Thereafter, the training image rotation-augmented in step S140 and the BB generated in step S150 are input to the deep learning network for object recognition and used for training (S160).

Hereinafter, the rotation angle selection in step S130, the training image rotation in step S140, and the BB generation in step S150 are described in detail with reference to FIG. 2.

FIG. 2 is a diagram provided for a detailed description of the rotation augmentation process (S130 to S150) shown in FIG. 1. FIG. 2 illustrates the rotation augmentation process and its result for a specific example training image.

As shown in FIG. 2, rotation augmentation of a training image takes as input the training image to which rotation augmentation is to be applied and the BB of the object to be recognized. The image on the left side of FIG. 2 is the training image to which rotation augmentation is to be applied, and the red box drawn on it is the BB of the object to be recognized.

The rotation angle for rotation augmentation is selected in step S130 randomly according to a Gaussian distribution with a mean of 0°. According to the Gaussian distribution graph shown in FIG. 2, the probability that the rotation angle is selected within -σ° to σ° is 68.2%, and the probability that it is selected within -2σ° to 2σ° is 95.4%.

Because the selection probability follows a Gaussian distribution, a rotation angle close to 0° is most likely to be selected.
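A minimal sketch of such an angle selection is shown below, using `random.gauss` from the Python standard library. The function name `sample_rotation_angle` and the parameter `sigma_deg` are assumptions for illustration; the text does not state a value for σ.

```python
import random


def sample_rotation_angle(sigma_deg, rng=None):
    """Draw a rotation angle in degrees from a zero-mean Gaussian.

    sigma_deg is a hypothetical parameter for the standard deviation;
    the source only fixes the mean at 0 degrees.
    """
    if rng is None:
        rng = random.Random()
    return rng.gauss(0.0, sigma_deg)


# Each training image gets its own independent draw.
rng = random.Random(42)
angles = [sample_rotation_angle(15.0, rng) for _ in range(5)]
```

Because the draw is unbounded, very large angles are possible but rare; whether angles should be clipped to some range is not specified in the text.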

The rotation of the training image in step S140 is performed by moving the center (c_x, c_y) of the training image to the origin, rotating the training image by the rotation angle selected in step S130, and then moving the center (c_x, c_y) of the rotated training image back to its original position.

The transformation matrix T for this sequence of translation → rotation → translation is presented in FIG. 2.
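The composition of the three steps can be sketched with 3x3 homogeneous matrices. Since FIG. 2 is not reproduced here, the exact form of T in the patent is not confirmed; the sketch below assumes the standard counterclockwise rotation matrix (in image coordinates with the y axis pointing down, the same matrix appears as a clockwise rotation). The helper names `matmul3`, `rotation_about_center`, and `apply_transform` are hypothetical.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_about_center(cx, cy, angle_deg):
    """Build T = translate(+c) * rotate(angle) * translate(-c)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    to_origin = [[1, 0, -cx], [0, 1, -cy], [0, 0, 1]]  # move center to origin
    rotate = [[c, -s, 0], [s, c, 0], [0, 0, 1]]        # rotate about origin
    back = [[1, 0, cx], [0, 1, cy], [0, 0, 1]]         # move center back
    return matmul3(back, matmul3(rotate, to_origin))


def apply_transform(T, x, y):
    """Apply the homogeneous transform T to the point (x, y)."""
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])
```

Applying the same T to the four corners of the original BB gives the corners of the rotated BB, which is the input to the box generation of step S150.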

The method performed in step S150 for generating a BB for the object to be recognized in the rotation-augmented training image is shown in the lower part of FIG. 2, and is shown more legibly in FIGS. 3 to 5.

The method of generating a BB for the object to be recognized in the rotation-augmented training image is as follows.

First, as shown in FIG. 3, a box (BO) circumscribing the BB (B) of the training image rotated by the transformation matrix T is calculated, and, as shown in FIG. 4, a box (BI) inscribed in the BB (B) of the training image rotated by the transformation matrix T is calculated.

The horizontal/vertical sides of the circumscribed box (BO) and of the inscribed box (BI) are parallel to the horizontal/vertical sides, respectively, of the BB (B) before rotation.
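The two boxes can be sketched as follows for a BB of width `w` and height `h` rotated by `angle_deg` about its own center. The circumscribed box follows directly from the extent of the rotated corners. The text does not say which inscribed box is meant, so the sketch makes a loud assumption: BI is taken to be BO shrunk about the shared center by the largest factor that still fits inside the rotated BB. Other inscribed boxes (for example, the maximum-area one) are equally compatible with the text. The function names and the (cx, cy, w, h) box format are hypothetical.

```python
import math


def circumscribed_box(cx, cy, w, h, angle_deg):
    """Axis-aligned box BO enclosing a w x h box rotated about (cx, cy)."""
    t = math.radians(angle_deg)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    return (cx, cy, w * c + h * s, w * s + h * c)


def inscribed_box(cx, cy, w, h, angle_deg):
    """Axis-aligned box BI inside the rotated box (ASSUMED construction).

    Assumption: BI has the same center and aspect ratio as BO and is
    scaled down until all four of its corners lie inside the rotated box.
    """
    _, _, wo, ho = circumscribed_box(cx, cy, w, h, angle_deg)
    t = math.radians(angle_deg)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    # A centered axis-aligned box with half-sizes (a, b) fits when
    # a*c + b*s <= w/2 and a*s + b*c <= h/2 in the rotated frame.
    scale = min(w / (wo * c + ho * s), h / (wo * s + ho * c))
    return (cx, cy, wo * scale, ho * scale)
```

Under this assumption BI and BO coincide with the rotated BB at 0° and ±90°, and differ most near ±45°, which is consistent with the centers of the boxes coinciding as described for FIG. 5.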

Next, as shown in FIG. 5, the BB (B') for the object to be recognized in the rotation-augmented training image is generated at a position between the inscribed box (BI) and the circumscribed box (BO).

As shown in FIG. 5, the BB (B') of the rotation-augmented training image shares its center with the circumscribed box (BO) and the inscribed box (BI), and its horizontal/vertical sides are longer than those of the inscribed box (BI) but shorter than those of the circumscribed box (BO).

The right side of FIG. 2 shows the BB (B') generated in step S150 added to the rotation-augmented training image.

The position and size of B', which is determined between BI and BO, can be defined by the ratio of the distance between BI and B' to the distance between B' and BO, as follows.

(BI~B') : (B'~BO) = 0.5 : 0.5

(BI~B'): the distance between the horizontal/vertical sides of BI and the horizontal/vertical sides of B'

(B'~BO): the distance between the horizontal/vertical sides of B' and the horizontal/vertical sides of BO

The distance ratio "0.5:0.5" can of course be set to other ratios, such as "0.7:0.3" or "0.3:0.7".
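Given BI and BO with a common center, placing B' by a distance ratio amounts to a linear interpolation of the side lengths. The sketch below is an illustration under assumptions: boxes are represented as hypothetical (cx, cy, w, h) tuples, and the function name `interpolate_box` is not from the patent.

```python
def interpolate_box(bi, bo, ratio=(0.5, 0.5)):
    """Place B' between BI and BO.

    bi, bo: (cx, cy, w, h) tuples that share the same center.
    ratio:  (BI~B') : (B'~BO), e.g. (0.5, 0.5) or (0.7, 0.3).
    """
    first, second = ratio
    total = first + second
    if total <= 0:
        raise ValueError("ratio terms must sum to a positive value")
    t = first / total  # 0 -> B' equals BI, 1 -> B' equals BO
    cx, cy, wi, hi = bi
    _, _, wo, ho = bo
    return (cx, cy, wi + t * (wo - wi), hi + t * (ho - hi))
```

With a ratio of 0.5:0.5 each side of B' lies midway between the corresponding sides of BI and BO; a ratio of 1:0 returns BO and a ratio of 0:1 returns BI.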

Furthermore, the ratio of the distance between BI and B' to the distance between B' and BO may be set to a variable ratio rather than a fixed one. For example, the distance ratio may be determined according to the rotation angle selected in step S130: the closer the rotation angle is to ±45° or ±135°, the closer the distance ratio is to "1:0", and the farther the rotation angle is from ±45° or ±135°, the closer the distance ratio is to "0:1".
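The text fixes only the tendency of the variable ratio, not a formula. One function with the stated behavior is |sin 2θ|, which equals 1 at ±45° and ±135° and 0 at 0°, ±90°, and ±180°. The sketch below uses it purely as an assumed example; the name `angle_dependent_ratio` and the choice of |sin 2θ| are not from the patent.

```python
import math


def angle_dependent_ratio(angle_deg):
    """Return (BI~B') : (B'~BO) as a pair that sums to 1.

    ASSUMPTION: the weight |sin(2 * angle)| is one possible mapping that
    approaches 1:0 near +/-45 and +/-135 degrees and 0:1 away from them.
    """
    t = abs(math.sin(math.radians(2.0 * angle_deg)))
    return (t, 1.0 - t)
```

Any other monotone mapping with the same endpoints (for example, a piecewise-linear one) would match the description equally well.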

FIG. 7 illustrates the results of applying the rotation augmentation method according to an embodiment of the present invention to the training images shown in FIG. 6.

To verify the performance of the method according to an embodiment of the present invention, SSD (Single Shot MultiBox Detector), which has been widely used recently, was trained using training images augmented according to the embodiment.

FIG. 8 compares, on the VOC2007 test, the results of the existing method and of the method according to an embodiment of the present invention. The comparison shows that the method according to the embodiment achieved high performance across the images without increasing the complexity of the SSD, and correctly recognized many objects that the existing method had recognized incorrectly.

FIG. 9 is a block diagram of a training data augmentation system according to another embodiment of the present invention. As shown in FIG. 9, the system may be implemented as a computing system including a communication unit 210, an output unit 220, a processor 230, an input unit 240, and a storage unit 250.

The communication unit 210 is a communication means for receiving training images from an external device and an external network.

The input unit 240 is an input means for receiving user setting commands, and the output unit 220 is a display for showing the training images and the process and results of training image augmentation.

The processor 230 executes the method shown in FIG. 1 to augment the training images and trains the deep learning network for object recognition with the augmented training images. Furthermore, the processor 230 performs object recognition on an input image using the trained deep learning network.

The storage unit 250 provides the storage space the processor 230 needs to operate.

So far, a training data augmentation method and system for training a deep learning network for object recognition have been described in detail with reference to preferred embodiments.

The training data augmentation method and system according to embodiments of the present invention provide a technique for making object recognition robust without increasing the complexity of the deep learning network.

Furthermore, the training data augmentation method and system according to embodiments of the present invention make it possible to train a robust deep learning network for object recognition with limited training data.

The training data augmentation technique according to embodiments of the present invention can be applied in various fields, such as CCTV, security robots, and autonomous vehicles, as well as in various other systems that perform object recognition through image analysis.

Meanwhile, the technical idea of the present invention can of course also be applied to a computer-readable recording medium storing a computer program that performs the functions of the apparatus and method according to the present embodiment. In addition, the technical idea according to various embodiments of the present invention may be implemented in the form of computer-readable code recorded on a computer-readable recording medium. The computer-readable recording medium may be any data storage device that can be read by a computer and can store data. For example, the computer-readable recording medium may be a ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical disk, hard disk drive, or the like. In addition, computer-readable code or programs stored on a computer-readable recording medium may be transmitted over a network connecting computers.

In addition, although preferred embodiments of the present invention have been illustrated and described above, the present invention is not limited to the specific embodiments described. Various modifications can be made by those of ordinary skill in the art to which the invention pertains without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical idea or outlook of the present invention.

B: BB of the training image rotated by the transformation matrix T
BI: box inscribed in B
BO: box circumscribing B
B': BB of the rotation-augmented training image

Claims

A training data augmentation method comprising:
selecting a rotation angle by which to rotate a training image;
rotating the training image by the rotation angle selected in the selecting step; and
generating a BB for the rotated training image within a range determined using a BB (Bounding Box) of an object to be recognized, as rotated by the selected rotation angle,
wherein the generating step generates, as the BB of the rotated training image, a third box located between a first box inscribed in the BB rotated by the rotation angle and a second box circumscribing the rotated BB,
wherein the position of the third box is determined by the ratio of a first distance, which is the distance between the first box and the third box, to a second distance, which is the distance between the third box and the second box, and
wherein the ratio of the first distance to the second distance (first distance : second distance) varies according to the selected rotation angle, becoming closer to 1:0 as the rotation angle approaches ±45° or ±135° and closer to 0:1 as the rotation angle moves away from ±45° or ±135°.

delete

The method according to claim 1,
wherein the horizontal and vertical sides of the first box, the second box, and the third box are parallel to the horizontal and vertical sides, respectively, of the BB before rotation.

delete

The method according to claim 1,
wherein the selecting step selects the rotation angle randomly according to a Gaussian distribution with a mean of 0°.

The method according to claim 1,
wherein the rotating step comprises:
moving the center of the training image to the origin;
rotating the training image about the origin by the selected rotation angle; and
moving the center of the rotated training image back to its original position.

The method according to claim 1,
further comprising, prior to the selecting step, augmenting the training image,
wherein the rotating step rotates the training image augmented in the augmenting step.

The method of claim 8,
wherein the augmenting step augments the training image through at least one of zooming, applying noise, and translating the training image.

A training data augmentation system comprising:
an input unit that receives a training image; and
a processor that selects a rotation angle by which to rotate the training image, rotates the training image by the selected rotation angle, and generates a BB for the rotated training image within a range determined using a BB (Bounding Box) of an object to be recognized, as rotated by the selected rotation angle,
wherein the processor generates, as the BB of the rotated training image, a third box located between a first box inscribed in the BB rotated by the rotation angle and a second box circumscribing the rotated BB,
wherein the position of the third box is determined by the ratio of a first distance, which is the distance between the first box and the third box, to a second distance, which is the distance between the third box and the second box, and
wherein the ratio of the first distance to the second distance (first distance : second distance) varies according to the selected rotation angle, becoming closer to 1:0 as the rotation angle approaches ±45° or ±135° and closer to 0:1 as the rotation angle moves away from ±45° or ±135°.

A training method comprising:
selecting a rotation angle by which to rotate a training image;
rotating the training image by the rotation angle selected in the selecting step;
generating a BB for the rotated training image within a range determined using a BB (Bounding Box) of an object to be recognized, as rotated by the selected rotation angle; and
training a deep learning network using the rotated training image and the generated BB,
wherein the generating step generates, as the BB of the rotated training image, a third box located between a first box inscribed in the BB rotated by the rotation angle and a second box circumscribing the rotated BB,
wherein the position of the third box is determined by the ratio of a first distance, which is the distance between the first box and the third box, to a second distance, which is the distance between the third box and the second box, and
wherein the ratio of the first distance to the second distance (first distance : second distance) varies according to the selected rotation angle, becoming closer to 1:0 as the rotation angle approaches ±45° or ±135° and closer to 0:1 as the rotation angle moves away from ±45° or ±135°.

A training system comprising:
an input unit that receives a training image; and
a processor that selects a rotation angle by which to rotate the training image, rotates the training image by the selected rotation angle, generates a BB for the rotated training image within a range determined using a BB (Bounding Box) of an object to be recognized, as rotated by the selected rotation angle, and trains a deep learning network using the rotated training image and the generated BB,
wherein the processor generates, as the BB of the rotated training image, a third box located between a first box inscribed in the BB rotated by the rotation angle and a second box circumscribing the rotated BB,
wherein the position of the third box is determined by the ratio of a first distance, which is the distance between the first box and the third box, to a second distance, which is the distance between the third box and the second box, and
wherein the ratio of the first distance to the second distance (first distance : second distance) varies according to the selected rotation angle, becoming closer to 1:0 as the rotation angle approaches ±45° or ±135° and closer to 0:1 as the rotation angle moves away from ±45° or ±135°.