KR102583686B1

KR102583686B1 - Method and apparatus for detecting objects

Info

Publication number: KR102583686B1
Application number: KR1020200036066A
Authority: KR
Inventors: 문성원; 이지원
Original assignee: 한국전자통신연구원
Priority date: 2020-03-25
Filing date: 2020-03-25
Publication date: 2023-09-27
Also published as: KR20210119672A

Abstract

An object detection device performs image preprocessing that changes the statistical characteristics of an input image using a preprocessing neural network trained to improve object detection performance, detects an object from the preprocessed image, and feeds the detection result back to the preprocessing neural network.

Description

Object detection method and apparatus {METHOD AND APPARATUS FOR DETECTING OBJECTS}

The present invention relates to an object detection method and device, and more particularly to an object detection method and device that can improve ship detection performance by preprocessing maritime images containing various weather effects into a form suited to machine learning for ship detection rather than to the human eye.

Ships under way, coastal guard posts, control centers, and similar sites need a way to detect nearby ships for purposes such as traffic control and maritime accident support. To this end, communication-based identification methods are used, such as the automatic identification system (AIS) for ships above a certain size and location transmitters (V-pass) for small ships. However, information on civilian ships sometimes cannot be obtained because of equipment failure or deliberate non-use, for example to conceal fishing grounds. It is also difficult to obtain information on warships of hostile or other foreign countries through such communication. Techniques using synthetic aperture radar, satellite imagery, and the like have been proposed for these cases, but acquiring that information takes a long time, so ship detection on maritime optical images is needed.

Unlike on land, the maritime environment has many environmental factors such as sea fog, wakes, and light reflected from the water surface, and data is also difficult to acquire. For maritime images, image quality is typically improved by removing sea fog: the characteristics of the sea fog are analyzed, the sea fog is separated from the cloud light, and brightness is compensated accordingly. However, distortion becomes severe when brightness differs extremely within an image, and because this preprocessing is designed for viewing by the human eye, it is unsuitable as a technique for detecting objects in maritime optical images. Methods that preprocess maritime images with traditional image processing techniques have also been proposed, but they generally focus only on removing weather effects.

Most image preprocessing techniques, such as denoising, blurring, and fog removal, aim at output optimized for the human eye, and they are difficult to use for improving machine learning performance. This is a problem shared by most existing preprocessing techniques. Data augmentation techniques, which apply geometric changes to images or alter their colors, address the shortage of training data, but they also rely on human intuition and are vulnerable to changes that fall in the blind spots of the augmentation techniques (white balance, for example).

In addition, active learning, which selectively trains on the data with the highest learning effect among a large amount of data, has shown meaningful results, confirming that training data that is effective for machine learning exists. However, active learning only selects a subset of existing data for training.

The problem to be solved by the present invention is to provide an object detection method and device that can improve object detection performance on images captured in a maritime environment, where data is difficult to obtain.

According to one embodiment of the present invention, a method is provided by which an object detection device detects an object from an input image. The object detection method includes: performing, on an input image, image preprocessing that changes the statistical characteristics of the image using a preprocessing neural network trained to improve object detection performance; detecting an object from the preprocessed image; and feeding the detection result for the object back to the preprocessing neural network.

The detecting may include detecting the object from the preprocessed image using an object detection neural network trained for object detection.

The feeding back may include training the preprocessing neural network, based on the detection result for the object, so that the statistical characteristics of the image change in a direction that improves the object detection performance.

The detection result may include the class of the detected object and a loss value for the detected object, and the feeding back may include training the preprocessing neural network in a direction that reduces the loss value.

The statistical characteristics may include at least one of a color value, a saturation value, a brightness value, white balance, and the probability distribution itself.

The image may include an image captured at sea, and the object may include a ship.

According to another embodiment of the present invention, an object detection device that detects an object from an input image is provided. The object detection device includes a preprocessing unit, a detection unit, and a feedback processing unit. The preprocessing unit alters an input image and outputs it, using a preprocessing neural network trained to alter images into a form optimized for neural network-based object detection. The detection unit performs the neural network-based object detection on the image output from the preprocessing unit. The feedback processing unit passes the object detection result of the detection unit to the preprocessing unit.

The preprocessing unit may retrain the preprocessing neural network, using the input image and the object detection result obtained from that image, in a direction that improves the neural network-based object detection performance.

The preprocessing neural network may change the statistical characteristics of the image in order to alter the image.

The detection result may include the class of the detected object and a loss value for the detected object, and the preprocessing unit may retrain the preprocessing neural network in a direction that reduces the loss value.

According to an embodiment of the present invention, ship detection performance can be improved by performing image preprocessing that changes the statistical characteristics of an input maritime optical image using a neural network trained to improve ship detection performance, and then attempting ship detection on the preprocessed image.

That is, according to an embodiment of the present invention, the applied image preprocessing technique aims solely at improving ship detection performance, without relying on human intuition or on judgments of qualitative results visible to the naked eye, so a preprocessing technique optimized for the deployment site can be derived automatically as data accumulates. The technique can be used to detect, through optical images, ships that could not be detected or identified with existing communication or radar means, and can be used in maritime surveillance applications.

FIG. 1 is a diagram showing an object detection device according to an embodiment of the present invention.
FIG. 2 is a diagram showing a method of training a preprocessing neural network according to an embodiment of the present invention.
FIG. 3 is a flowchart showing an object detection method according to an embodiment of the present invention.
FIG. 4 is a diagram showing an object detection device according to another embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can easily practice them. However, the present invention may be implemented in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted in order to describe the present invention clearly, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification and claims, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

An object detection method and device according to an embodiment of the present invention will now be described in detail with reference to the drawings.

FIG. 1 is a diagram showing an object detection device according to an embodiment of the present invention.

Referring to FIG. 1, the object detection device 100 includes a preprocessing unit 110, a detection unit 120, and a feedback processing unit 130. The object detection device 100 may further include a storage unit 140.

The preprocessing unit 110 preprocesses an input image and passes the preprocessed image to the detection unit 120. The object according to an embodiment of the present invention may be a ship, but is not limited thereto and may be, for example, a person or a structure. The input image may be a maritime image captured in a maritime environment, but is not limited thereto. Existing image preprocessing techniques applied to images captured in a maritime environment, such as denoising, blurring, and fog removal, are mostly methods for producing image output optimized for the human eye, and they are difficult to use for improving machine learning performance.

The preprocessing unit 110 according to an embodiment of the present invention performs an image preprocessing process that changes the statistical characteristics of an image so that it is optimized for object detection using a neural network, that is, a machine learning model. The statistical characteristics of an image may include, for example, pixel color values, pixel brightness values, pixel saturation values, the mean of pixel characteristic values, the color distribution, the white balance, and the probability distribution of characteristic values. The preprocessing unit 110 changes at least one of these for the input image to optimize it for neural network-based object detection. To do so, the preprocessing unit 110 uses a neural network that has itself been trained to perform this image preprocessing. Hereinafter, the neural network trained for image preprocessing is referred to as the preprocessing neural network. Whereas the generative adversarial networks (GANs) used in existing techniques such as style transfer employ a neural network that produces a distribution similar to a target distribution, the preprocessing unit 110 uses a preprocessing neural network that changes the statistical characteristics of the input image into a form favorable to improving neural network-based object detection performance. Unlike existing preprocessing techniques that find feature points in the input image, it therefore alters the input image so that it is optimized for neural network-based object detection and outputs the result.
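As an illustration of what "statistical characteristics" can mean in practice, the following minimal Python sketch computes a few of them and applies a simple per-channel change. It is not the preprocessing neural network of the embodiment: the representation of an image as a list of (r, g, b) values in [0, 1] and the function names `image_statistics` and `apply_gain_bias` are assumptions made only for this example.

```python
import colorsys
from statistics import mean


def image_statistics(pixels):
    """Summarize an image given as a list of (r, g, b) floats in [0, 1]."""
    hsv = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in pixels]
    return {
        "mean_saturation": mean(s for _, s, _ in hsv),
        "mean_brightness": mean(v for _, _, v in hsv),
        # Unequal channel means are a rough indicator of a white-balance cast.
        "channel_means": tuple(mean(p[c] for p in pixels) for c in range(3)),
    }


def apply_gain_bias(pixels, gain, bias):
    """Apply a per-channel affine change (hypothetical stand-in for a learned
    transform) and clip each value to [0, 1]."""
    def clip(x):
        return min(1.0, max(0.0, x))

    return [
        tuple(clip(g * value + b) for value, g, b in zip(pixel, gain, bias))
        for pixel in pixels
    ]


pixels = [(0.2, 0.4, 0.6), (0.4, 0.4, 0.4)]
before = image_statistics(pixels)
after = image_statistics(apply_gain_bias(pixels, (1.5, 1.0, 1.0), (0.0, 0.0, 0.1)))
```

In a learned setting, the gain and bias values would be produced or replaced by the preprocessing neural network rather than set by hand.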

In other words, changing the statistical characteristics of the input image into a form favorable to improving neural network-based object detection performance does not mean changing the image into a form suited to viewing by the human eye. It means preprocessing the input image into a form suited to the neural network used for object detection, and it may include altering the input image into a completely different image.

The preprocessing unit 110 uses training images and the neural network-based object detection results for those images to train the preprocessing neural network so that the statistical characteristics of an image change in a way that is optimized for neural network-based object detection, thereby producing a preprocessing neural network specialized for the object detection neural network. The preprocessing unit 110 may also retrain the preprocessing neural network, using the detection results of the detection unit 120 fed back through the feedback processing unit 130, in a direction that improves the object detection performance of the object detection neural network. The preprocessing unit 110 may determine the learning direction of the preprocessing neural network based on the detection result of the detection unit 120 and update the parameters of the preprocessing neural network accordingly. The preprocessing unit 110 may determine the learning direction according to whether the detection result of the detection unit 120 is positive or negative. The preprocessing unit 110 may train the preprocessing neural network in a direction that minimizes the loss value for the object detected by the detection unit 120.
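The idea of updating only the preprocessing parameters so that the detector's loss goes down can be sketched as follows. This is a toy under stated assumptions, not the training procedure of the embodiment: the "image" is a list of brightness values, the preprocessing is a two-parameter gain and bias instead of a neural network, the frozen detector is a fixed logistic function of mean brightness, and the update uses a finite-difference gradient. All names (`preprocess`, `detector_loss`, `train_step`) are hypothetical.

```python
import math


def preprocess(image, params):
    """Toy preprocessing: brightness gain and bias, clipped to [0, 1]."""
    gain, bias = params
    return [min(1.0, max(0.0, gain * v + bias)) for v in image]


def detector_loss(image, label):
    """Frozen toy detector: logistic score on mean brightness, scored with
    cross-entropy against the ground-truth label (1 = object present)."""
    score = 1.0 / (1.0 + math.exp(-10.0 * (sum(image) / len(image) - 0.5)))
    score = min(1.0 - 1e-9, max(1e-9, score))
    return -(label * math.log(score) + (1 - label) * math.log(1.0 - score))


def train_step(params, image, label, lr=0.05, eps=1e-4):
    """One finite-difference gradient step on the preprocessing parameters.
    The detector itself is never modified."""
    base = detector_loss(preprocess(image, params), label)
    grads = []
    for i in range(len(params)):
        shifted = list(params)
        shifted[i] += eps
        grads.append((detector_loss(preprocess(image, shifted), label) - base) / eps)
    return [p - lr * g for p, g in zip(params, grads)], base


params = [1.0, 0.0]                  # identity preprocessing to start
dark_image = [0.2, 0.3, 0.25, 0.35]  # contains an object, but too dark for the toy detector
for _ in range(20):
    params, loss = train_step(params, dark_image, label=1)
```

With a real detector and a real preprocessing network, the same loop would typically use backpropagation through the detector instead of finite differences; that choice is outside what the text specifies.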

In this way, the object detection performance of the detection unit 120 can be improved through image preprocessing that uses a preprocessing neural network specialized for the ship detection neural network.

The detection unit 120 performs ship detection by applying a neural network-based ship detection technique to the image preprocessed by the preprocessing unit 110, and measures ship detection performance. The detection unit 120 may perform ship detection using a neural network such as Cascade R-CNN (Region-based Convolutional Neural Network), and may also apply various other machine learning-based ship detection techniques. During ship detection, the detection unit 120 may measure the object class and the loss value for the detected object. The object class refers to the type of object to be identified; it may distinguish, for example, people, ships, and structures, and may be subdivided further as needed. For ships, for example, the object classes may be set to merchant ship, warship, fishing boat, and so on. The loss value for the detected object represents the difference between the output value of the neural network and the actual object class value. The detection unit 120 may measure the object class and the loss value for the detected object both with and without the preprocessing neural network of the preprocessing unit 110 applied, and can thereby determine whether the image preprocessing technique using the applied preprocessing neural network is effective.
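The comparison of loss values with and without preprocessing can be sketched as below. The text does not name a specific loss function; cross-entropy over class probabilities is used here purely as an assumption, and `CLASSES`, `classification_loss`, and `preprocessing_is_effective` are hypothetical names.

```python
import math

# Example ship classes taken from the description above.
CLASSES = ("merchant ship", "warship", "fishing boat")


def classification_loss(probabilities, true_class):
    """Cross-entropy between the detector's class probabilities and the true class
    (an assumed choice of loss for this sketch)."""
    p = max(probabilities[CLASSES.index(true_class)], 1e-12)
    return -math.log(p)


def preprocessing_is_effective(probs_without, probs_with, true_class):
    """Return True when the loss on the preprocessed image is lower than the loss
    on the unprocessed image."""
    return classification_loss(probs_with, true_class) < classification_loss(
        probs_without, true_class
    )


# Hypothetical detector outputs for the same warship image, before and after preprocessing.
effective = preprocessing_is_effective((0.5, 0.3, 0.2), (0.2, 0.7, 0.1), "warship")
```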

The feedback processing unit 130 passes the detection result of the detection unit 120 to the preprocessing unit 110. The feedback processing unit 130 may also store the detection result of the detection unit 120 in the storage unit 140 in association with the input image. The data stored in this way can later be used as training data to update the neural network of the detection unit 120.

FIG. 2 is a diagram showing a method of training a preprocessing neural network according to an embodiment of the present invention.

Referring to FIG. 2, the preprocessing unit 110 uses input images and the actual object detection results for those images as training data for the preprocessing neural network.

The preprocessing unit 110 receives an input image and the object detection result for the input image (S210).

The preprocessing unit 110 determines the learning direction of the preprocessing neural network based on the input image and the object detection result for that image (S220). The preprocessing unit 110 may determine the learning direction according to whether the actual object detection result is positive or negative. A positive result indicates that the object was detected correctly, and a negative result indicates that it was not.

Based on the determined learning direction, the preprocessing unit 110 trains the preprocessing neural network, which converts an input image into a form optimized for neural network-based object detection (S230). For example, if the loss value of the object detection result has become smaller than before, training has moved in a direction that improves object detection performance; if the loss value has become larger than before, training has moved in a direction that degrades it. The preprocessing unit 110 may train the preprocessing neural network in a direction that improves object detection performance based on the input image and the actual object detection result for that image. To do so, the preprocessing unit 110 may train the preprocessing neural network in a direction that minimizes the loss between the ground-truth bounding box of the object coordinates in the image and the predicted bounding box, in a direction that minimizes the loss of the object's center coordinates, in a direction that maximizes the IoU (Intersection over Union) of the segmentation values, and so on.
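The training criteria named above can be made concrete with a short sketch. It assumes axis-aligned boxes given as (x1, y1, x2, y2) tuples and computes IoU on boxes rather than on segmentation masks, which is a simplification; the L1 box loss and squared-distance center loss are likewise assumed forms, since the text does not fix them. The function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def box_l1_loss(predicted, truth):
    """Sum of absolute coordinate differences between predicted and ground-truth boxes."""
    return sum(abs(p - t) for p, t in zip(predicted, truth))


def center_loss(predicted, truth):
    """Squared distance between the centers of the predicted and ground-truth boxes."""
    pcx, pcy = (predicted[0] + predicted[2]) / 2, (predicted[1] + predicted[3]) / 2
    tcx, tcy = (truth[0] + truth[2]) / 2, (truth[1] + truth[3]) / 2
    return (pcx - tcx) ** 2 + (pcy - tcy) ** 2


truth = (0.0, 0.0, 2.0, 2.0)
predicted = (1.0, 1.0, 3.0, 3.0)
overlap = iou(predicted, truth)  # 1/7: intersection area 1, union area 7
```

A training signal for the preprocessing neural network would push `box_l1_loss` and `center_loss` down and `iou` up.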

When an image is input, the preprocessing neural network trained in this way alters the input image so that the performance of object detection by the detection unit 120 on the preprocessed image is ultimately improved.

FIG. 3 is a flowchart showing an object detection method according to an embodiment of the present invention.

Referring to FIG. 3, when the preprocessing unit 110 receives an input image in which an object is to be detected (S310), it changes the statistical characteristics of the input image using the image preprocessing technique of the trained preprocessing neural network and outputs the result (S320).

The detection unit 120 detects an object from the image output by the preprocessing unit 110 using a neural network-based object detection technique (S330).

The detection unit 120 passes the object detection result to the preprocessing unit 110 through the feedback processing unit 130. The detection unit 120 may pass the detected object class, the loss value for the detected object, and the like to the preprocessing unit 110 through the feedback processing unit 130 as the object detection result.

The preprocessing unit 110 may train the preprocessing neural network based on the input image and the object detection result for that image (S340). That is, the preprocessing unit 110 may update the preprocessing neural network based on the input image and its object detection result so that neural network-based object detection performance improves.
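The flow of steps S310 to S340 can be sketched as a small class that wires the three units together. This is only a structural illustration: the class and attribute names are hypothetical, and the preprocessing, detection, and update functions are passed in as plain callables rather than being real neural networks.

```python
from collections import namedtuple

# Detection result carrying the two items named in the text.
DetectionResult = namedtuple("DetectionResult", ["object_class", "loss"])


class ObjectDetectionDevice:
    """Structural sketch of the device 100; all callables are supplied by the caller."""

    def __init__(self, preprocess, detect, update_preprocessor):
        self.preprocess = preprocess                    # role of preprocessing unit 110
        self.detect = detect                            # role of detection unit 120
        self.update_preprocessor = update_preprocessor  # retraining hook used in S340
        self.history = []                               # role of storage unit 140

    def process(self, image):
        preprocessed = self.preprocess(image)    # S320: change statistical characteristics
        result = self.detect(preprocessed)       # S330: neural network-based detection
        self.history.append((image, result))     # feedback unit 130 stores the result
        self.update_preprocessor(image, result)  # S340: feed the result back
        return result


# Stub components, for illustration only.
received_losses = []
device = ObjectDetectionDevice(
    preprocess=lambda image: [2.0 * v for v in image],
    detect=lambda image: DetectionResult("ship", 1.0 - sum(image) / len(image)),
    update_preprocessor=lambda image, result: received_losses.append(result.loss),
)
result = device.process([0.1, 0.3])  # S310: an input image arrives
```

Keeping the units as injected callables mirrors the separation in FIG. 1 and lets the detector be swapped without touching the feedback path.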

FIG. 4 is a diagram showing an object detection device according to another embodiment of the present invention, and shows a system that can be used to perform at least part of the object detection device and method described with reference to FIGS. 1 to 3.

Referring to FIG. 4, the object detection device 400 includes a processor 410, a memory 420, a storage device 430, and an input/output (I/O) interface 440.

The processor 410 may be implemented as a central processing unit (CPU), another chipset, a microprocessor, or the like.

The memory 420 may be implemented as a medium such as RAM, including dynamic random access memory (DRAM), Rambus DRAM (RDRAM), synchronous DRAM (SDRAM), and static RAM (SRAM).

The storage device 430 may be implemented as a persistent or volatile storage device such as a hard disk; an optical disk such as a CD-ROM (compact disk read only memory), CD-RW (CD rewritable), DVD-ROM (digital video disk ROM), DVD-RAM, DVD-RW disk, or Blu-ray disk; flash memory; or various forms of RAM.

The I/O interface 440 allows the processor 410 and/or the memory 420 to access the storage device 430. The I/O interface 440 may also provide an interface to the outside, for example to a user.

The memory 420 or the storage device 430 may include the storage unit 140.

The processor 410 may perform the object detection functions described with reference to FIGS. 1 to 3. It may load program instructions for implementing at least some of the functions of the preprocessing unit 110, the detection unit 120, and the feedback processing unit 130 into the memory 420 and control execution so that the operations described with reference to FIGS. 1 to 3 are performed. These program instructions may be stored in the storage device 430, or in another system connected over a network.

Although embodiments of the present invention have been described in detail above, the scope of rights of the present invention is not limited thereto. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention defined in the following claims also fall within the scope of rights of the present invention.

Claims

An object detection method by which an object detection device detects an object from an input image, the method comprising:
performing, on the input image, image preprocessing that changes the statistical characteristics of the image using a preprocessing neural network trained to improve object detection performance;
detecting an object from the preprocessed image; and
feeding the detection result for the object back to the preprocessing neural network,
wherein the detection result includes a class of the detected object and a loss value for the detected object,
the class of the object represents the type of object that is the subject of identification,
the loss value represents the difference between the output value of the neural network and the actual object class,
the feeding back includes training the preprocessing neural network in a direction that reduces the loss value, and
the statistical characteristics include a color value, a saturation value, a brightness value, white balance, and a probability distribution.

The object detection method of claim 1, wherein the detecting includes detecting the object from the preprocessed image using an object detection neural network trained for object detection.

The object detection method of claim 1, wherein the feeding back includes training the preprocessing neural network, based on the detection result for the object, so that the statistical characteristics of the image change in a direction that improves object detection performance.

(Deleted)

The object detection method of claim 1, wherein the image includes an image captured at sea, and the object includes a ship.

An object detection device that detects an object from an input image, the device comprising:
a preprocessing unit that performs, on the input image, image preprocessing that changes the statistical characteristics of the image using a preprocessing neural network trained to improve object detection performance;
a detection unit that detects an object from the preprocessed image; and
a feedback processing unit that feeds the detection result for the object back to the preprocessing neural network,
wherein the detection result includes a class of the detected object and a loss value for the detected object,
the class of the object represents the type of object that is the subject of identification,
the loss value represents the difference between the output value of the neural network and the actual object class,
the feedback processing unit trains the preprocessing neural network in a direction that reduces the loss value, and
the statistical characteristics include a color value, a saturation value, a brightness value, white balance, and a probability distribution.

The object detection device of claim 7, wherein the detection unit detects the object from the preprocessed image using an object detection neural network trained for object detection.

The object detection device of claim 7, wherein the feedback processing unit trains the preprocessing neural network, based on the detection result for the object, so that the statistical characteristics of the image change in a direction that improves object detection performance.

(Deleted)