KR102341471B1

KR102341471B1 - Method and apparatus for object recognition using thermal imaging sensor and imaging sensor

Info

Publication number: KR102341471B1
Application number: KR1020190151848A
Authority: KR
Inventors: 김종원
Original assignee: 김종원
Priority date: 2019-11-25
Filing date: 2019-11-25
Publication date: 2021-12-21
Also published as: KR20210063569A

Abstract

Various embodiments of the present invention relate to a night lifesaving management system based on object recognition using a thermal image sensor and an image sensor. According to various embodiments, the system includes: an image acquisition unit that includes the thermal image sensor and the image sensor and uses them to acquire a thermal image and a captured image; and an object analysis apparatus that detects an object using the thermal image and the captured image. The image acquisition unit transmits the thermal image and the captured image to the object analysis apparatus. The object analysis apparatus preprocesses the received thermal image and captured image, inputs the preprocessed thermal image and the preprocessed captured image to a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, obtains a first object detection result and a second object detection result through those networks, and inputs the first and second object detection results to a pre-trained third deep neural network to obtain a final object detection result. Other embodiments are also possible.

Description

Object analysis method and apparatus using a thermal image sensor and an image sensor

The present invention relates to a method and apparatus for analyzing an object using a thermal image sensor and an image sensor, and more particularly to a method and apparatus that improve the accuracy of object recognition by using both a thermal image and a captured image obtained through an image acquisition unit.

As people's range of activity expands, lifesaving operations are needed at a wide variety of times and places. In particular, marine lifesaving operations at night rely on visual search using only searchlights or flares, which demands a great deal of manpower and time.

Meanwhile, advances in unmanned aerial vehicle (drone) technology make it possible to obtain information about areas that are difficult for people to reach from images captured by the drone. However, in areas such as the sea at night, it is hard to distinguish anything with the naked eye even when images of the area are available, so it remains difficult to know exactly where a person in distress is located.

KR 10-2003187

The present invention was devised to solve the above problems. Its object is to provide an object analysis method and apparatus using a thermal image sensor and an image sensor that can readily detect a person in distress even in areas where such a person is hard to find, such as the sea at night.

According to various embodiments of the present invention, an object analysis apparatus for night lifesaving includes a communication module that receives images from an image acquisition unit, which acquires a thermal image and a captured image using a thermal image sensor and an image sensor, and a controller that detects an object using the received thermal image and captured image. The controller preprocesses the received thermal image and captured image, inputs the preprocessed thermal image and captured image to a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, obtains a first object detection result and a second object detection result through those networks, and inputs the first and second object detection results to a pre-trained third deep neural network to obtain a final object detection result.

According to various embodiments of the present invention, when the controller determines, based on the final object detection result, that the detected object is a person in distress, it may transmit location information and image information for the point where the person was detected to a manager device.
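The notification step above could be sketched roughly as follows. This is only an illustration: the patent does not specify a message format, so the JSON layout, the field names, the class label "person", and the 0.5 score threshold are all assumptions.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class DistressAlert:
    """Hypothetical message sent to the manager device."""
    latitude: float
    longitude: float
    score: float
    image_id: str  # identifier of the frame in which the person was detected


def build_alert(final_class: str, final_score: float,
                latitude: float, longitude: float, image_id: str,
                distress_class: str = "person",
                threshold: float = 0.5) -> Optional[str]:
    """Return a JSON alert if the final detection looks like a person in distress.

    Returns None when the class does not match or the score is below the
    (assumed) threshold, in which case nothing is sent.
    """
    if final_class != distress_class or final_score < threshold:
        return None
    alert = DistressAlert(latitude, longitude, final_score, image_id)
    return json.dumps(asdict(alert))


# Example: a confident "person" detection produces a message to transmit.
message = build_alert("person", 0.9, 35.1, 129.0, "frame-1")
```

How the message is actually delivered (cellular network, Wi-Fi, and so on) is left to the communication module described in the text.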

According to various embodiments of the present invention, the first deep neural network may be a depth map (dense map) based YOLO (You Only Look Once) network, the second deep neural network may be a color map based YOLO network, and the third deep neural network may be a multi-layer perceptron (MLP).

According to various embodiments of the present invention, the first object detection result may include the position, class, and detection score (confidence score) of an object detected in the thermal image, and the second object detection result may include the position, class, and detection score of an object detected in the captured image. The final object detection result may include the final position, final class, and final detection score of an object detected in both the thermal image and the captured image.
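One way to picture such a detection result is as a small record holding the three fields. The sketch below is an assumption for illustration: the class name `Detection`, the corner-coordinate box convention, and the requirement that scores lie in [0, 1] are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Detection:
    """One detected object: position, class, and detection (confidence) score."""
    box: Tuple[float, float, float, float]  # assumed (x_min, y_min, x_max, y_max) in pixels
    label: str                              # object class, e.g. "person"
    score: float                            # assumed to lie in [0, 1]

    def __post_init__(self) -> None:
        # Reject obviously malformed records early.
        if not 0.0 <= self.score <= 1.0:
            raise ValueError("score must lie in [0, 1]")
        x_min, y_min, x_max, y_max = self.box
        if x_max < x_min or y_max < y_min:
            raise ValueError("box corners are out of order")


# A detector's output for one frame is then simply a list of such records.
thermal_result = [Detection((10, 20, 50, 80), "person", 0.91)]
visible_result = [Detection((12, 22, 52, 79), "person", 0.64)]
```

The same record shape can serve for the first, second, and final detection results, since all three carry a position, a class, and a score.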

According to various embodiments of the present invention, the controller may time-synchronize the captured image and the thermal image, and may scale the time-synchronized captured image and thermal image so that their resolutions are the same.
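The scaling step can be illustrated with a nearest-neighbor resize. This is a minimal sketch under stated assumptions: frames are represented as plain nested lists, the thermal frame is the one rescaled, and nearest-neighbor interpolation is used. The patent does not say which image is scaled or which interpolation method is applied.

```python
from typing import List, Tuple

Frame = List[List[float]]  # one channel, indexed as frame[row][col]


def resize_nearest(image: Frame, out_h: int, out_w: int) -> Frame:
    """Resize a single-channel frame with nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def match_resolution(thermal: Frame, visible: Frame) -> Tuple[Frame, Frame]:
    """Rescale the thermal frame to the resolution of the visible frame."""
    return resize_nearest(thermal, len(visible), len(visible[0])), visible


# Example: a 2x2 thermal frame brought to the 4x4 resolution of the camera frame.
thermal = [[1, 2], [3, 4]]
visible = [[0] * 4 for _ in range(4)]
scaled_thermal, _ = match_resolution(thermal, visible)
```

Thermal sensors often have a lower native resolution than visible-light cameras, which is why this sketch rescales the thermal frame; the opposite direction works the same way.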

According to various embodiments of the present invention, the object analysis apparatus may further include a GPS module that acquires location information of the image acquisition unit, and the communication module may connect the apparatus to the manager device.

According to various embodiments of the present invention, an object analysis apparatus for night lifesaving includes a communication module that receives images from an image acquisition unit, which acquires a thermal image and a captured image using a thermal image sensor and an image sensor, and a controller that detects an object using the received thermal image and captured image. The controller may preprocess the received thermal image and captured image, input the preprocessed thermal image and the preprocessed captured image to a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, obtain a first object detection result and a second object detection result through those networks, fuse the first and second object detection results to generate a prediction map (probability map) of the object, and input the generated prediction map to a pre-trained third deep neural network to obtain a final object detection result.
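The patent does not describe how the two detection results are fused. Purely as an illustration, the sketch below pairs boxes of the same class by intersection-over-union (IoU) and averages their coordinates and scores to form preliminary entries. The IoU matching rule, the 0.5 threshold, the averaging, and the tuple layout `(box, label, score)` are all assumptions, not the patented method.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # assumed (x_min, y_min, x_max, y_max)
Det = Tuple[Box, str, float]              # (box, class label, detection score)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(thermal: List[Det], visible: List[Det], iou_threshold: float = 0.5) -> List[Det]:
    """Keep objects seen in both images; average their boxes and scores."""
    fused = []
    for t_box, t_label, t_score in thermal:
        best, best_iou = None, iou_threshold
        for v_box, v_label, v_score in visible:
            if v_label != t_label:
                continue
            overlap = iou(t_box, v_box)
            if overlap >= best_iou:
                best, best_iou = (v_box, v_score), overlap
        if best is not None:
            v_box, v_score = best
            box = tuple((t + v) / 2 for t, v in zip(t_box, v_box))
            fused.append((box, t_label, (t_score + v_score) / 2))
    return fused


thermal = [((0, 0, 10, 10), "person", 0.5)]
visible = [((0, 0, 10, 10), "person", 0.75), ((100, 100, 110, 110), "person", 0.9)]
preliminary = fuse(thermal, visible)  # one entry: the overlapping pair
```

In this reading, the fused entries play the role of the preliminary position, class, and score that the third network then refines.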

According to various embodiments of the present invention, the prediction map of the object may include the preliminary position, preliminary class, and preliminary detection score of an object detected in both the thermal image and the captured image, and the final object detection result may include the final position, final class, and final detection score of that object.

According to various embodiments of the present invention, an object analysis method for night lifesaving includes: acquiring a thermal image and a captured image; preprocessing the thermal image and the captured image; inputting the preprocessed thermal image and the preprocessed captured image to a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, and obtaining a first object detection result and a second object detection result through those networks; and inputting the first and second object detection results to a pre-trained third deep neural network and obtaining a final object detection result through the third deep neural network.

The present invention as described above has the following effects.

The present invention can quickly and accurately detect a person in distress who needs rescue, even at night when visibility is poor.

In addition, the present invention can increase the accuracy of object detection by using both the thermal image and the captured image in the deep neural networks.

FIG. 1 is a block diagram illustrating a night lifesaving management system according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a night lifesaving management method according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a method for detecting an object according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating a first deep neural network and a second deep neural network according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating a third deep neural network according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating a night lifesaving management method according to another embodiment of the present invention.

Hereinafter, various embodiments of the present invention are described in detail with reference to the accompanying drawings. In describing the embodiments, detailed descriptions of related known functions or configurations are omitted where they could unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intention or custom of the user or operator. Their definitions should therefore be based on the content of this specification as a whole.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art are fully informed of the scope of the invention.

Various embodiments of this document may be implemented as software (e.g., a program) including instructions stored in a storage medium readable by a machine (e.g., a computer). The machine is a device that can call stored instructions from the storage medium and operate according to the called instructions, and may include an electronic device (e.g., a server) according to the disclosed embodiments. The instructions may include code generated or executed by a compiler or interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" means only that the storage medium does not include a signal and is tangible; it does not distinguish between data being stored semi-permanently and data being stored temporarily.

According to an embodiment, the methods according to the various embodiments disclosed in this document may be provided as part of a computer program product. A computer program product may be traded between a seller and a buyer as a commodity. It may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)) or online through an application store (e.g., Play Store™). In the case of online distribution, at least a portion of the computer program product may be temporarily stored in, or temporarily generated in, a storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

Each component (e.g., a module or a program) according to various embodiments may consist of a single entity or multiple entities, and some of the sub-components described above may be omitted, or other sub-components may be added, in various embodiments. Alternatively or additionally, some components (e.g., modules or programs) may be integrated into a single entity that performs the same or similar functions as each component performed before integration. Operations performed by a module, program, or other component according to various embodiments may be executed sequentially, in parallel, repetitively, or heuristically; at least some operations may be executed in a different order or omitted; or other operations may be added.

Unless otherwise defined, all terms used in this specification (including technical and scientific terms) have the meanings commonly understood by those of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries are not to be interpreted in an idealized or excessive way unless they are clearly and specifically defined.

The terminology used in this specification is for describing the embodiments and is not intended to limit the present invention. In this specification, the singular form also includes the plural unless the text specifically states otherwise. As used in this specification, "comprises" and/or "comprising" does not exclude the presence or addition of one or more components other than those stated.

FIG. 1 is a block diagram illustrating a night lifesaving management system according to an embodiment of the present invention.

Referring to FIG. 1, a night lifesaving management system 10 according to an embodiment of the present invention may include an image acquisition unit 200 for acquiring a thermal image and a captured image of a specific area (e.g., the sea or an accident-prone area) at night, and an object analysis apparatus 100 that detects objects based on deep learning. For example, the system 10 may apply pre-trained first to third deep neural networks to the thermal image and captured image from the image acquisition unit 200 to detect a person in distress among the detected objects. The image acquisition unit 200 may be, for example, a drone. However, the present invention is not limited to this, and any of various means capable of acquiring a thermal image and a captured image may be used as the image acquisition unit 200.

In an embodiment, the image acquisition unit 200 may acquire a thermal image and a captured image of an area while flying over it. For example, the image acquisition unit 200 may continuously photograph the flight area at a fixed time interval or at a fixed distance interval, and may acquire a thermal image and a captured image of the same area together. The flight area is not limited to any one region; examples include areas with a high probability of accidents at night and sea areas. To this end, the image acquisition unit 200 may include a thermal image sensor 210, an image sensor 220, and a communication module 230.

In an embodiment, the image acquisition unit 200 may use the communication module 230 to transmit the acquired thermal image and captured image to the object analysis apparatus 100 in real time over a wireless communication network. In addition, although not shown in the drawings, the image acquisition unit 200 may include a GPS module so that the location of a person in distress can be determined, and may use it to transmit the location information of the person in distress to the object analysis apparatus 100.

In an embodiment, the thermal image sensor 210 may photograph the flight area from the image acquisition unit 200 to acquire a thermal image. The thermal image sensor 210 may be a thermal imaging camera that captures images by tracking and detecting heat.

In an embodiment, the image sensor 220 may be a camera and may photograph the flight area from the image acquisition unit 200. The image sensor 220 may acquire captured images including still images, video, and the like.

In an embodiment, the communication module 230 transmits and receives wireless signals to and from the object analysis apparatus 100 over a mobile communication network built according to technical standards or communication schemes for mobile communication. The communication module 230 may also support short-range communication such as Bluetooth and Wi-Fi.

In an embodiment, the object analysis apparatus 100 may control the flight of the image acquisition unit 200 and may detect objects by analyzing the thermal image and the captured image using deep learning techniques. The objects may be of various kinds, such as people, animals, and cars. The object analysis apparatus 100 may include a controller 110, a database 120, a GPS module 130, and a communication module 140. The communication module 140 transmits and receives wireless signals to and from the image acquisition unit 200. The communication module 140 may also support short-range communication such as Bluetooth and Wi-Fi.

In an embodiment, the database 120 may store the thermal images and captured images received from the image acquisition unit 200, deep learning training results, object detection results, and the like as big data. The database 120 may also store, in advance and as big data, terrain information for sea areas, accident-prone areas, and the like. The database 120 may be included in the object analysis apparatus 100 or in an external server (not shown).

In an embodiment, the GPS (Global Positioning System) module 130 may collect location information for the place where a person in distress is detected. That is, the GPS module 130 is a module for acquiring the location of the person in distress or the (current) location of the image acquisition unit 200, and may obtain that location using signals transmitted from GPS satellites.

In an embodiment, the controller 110 may control the overall operation of each component included in the object analysis apparatus 100. To this end, the controller 110 may exchange data with each component through an internal network.

In an embodiment, the controller 110 may act as a remote controller for the image acquisition unit 200. It may transmit a preset flight command signal to the image acquisition unit 200 through the communication module 140, and may transmit a capture signal to the thermal image sensor 210 and the image sensor 220.

In an embodiment, the controller 110 may control the communication module 140 to receive the thermal images and captured images of the flight area taken by the image acquisition unit 200. It may use all of the received thermal images and captured images to train the deep-learning-based first to third deep neural networks, or it may arbitrarily extract a subset of the images and use that subset for training.

In an embodiment, the controller 110 may detect an object from the thermal image and the captured image using the first to third deep neural networks. For example, the controller 110 may act as a classifier for detecting objects from the thermal image and the captured image, and may use a convolutional neural network (CNN) and a multi-layer perceptron (MLP).
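To make the role of the MLP concrete, the following is a rough sketch of a forward pass through a one-hidden-layer perceptron that takes features from the two detectors and outputs a single fused score. The input layout, layer sizes, ReLU and sigmoid activations, and the parameter values are all hypothetical; in practice the weights would come from training, and the patent does not specify the MLP's architecture.

```python
import math
from typing import Dict, List


def dense(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v: List[float]) -> List[float]:
    return [max(0.0, x) for x in v]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def mlp_fuse(thermal_vec: List[float], visible_vec: List[float],
             params: Dict[str, list]) -> float:
    """Concatenate the two detectors' feature vectors and map them to one score in (0, 1)."""
    x = thermal_vec + visible_vec                      # simple concatenation
    hidden = relu(dense(x, params["w1"], params["b1"]))
    out = dense(hidden, params["w2"], params["b2"])
    return sigmoid(out[0])


# Placeholder (untrained) parameters: 2 inputs -> 1 hidden unit -> 1 output.
params = {"w1": [[1.0, 1.0]], "b1": [0.0], "w2": [[1.0]], "b2": [0.0]}
final_score = mlp_fuse([0.5], [0.5], params)  # each input here is just a detection score
```

A fuller version would also feed box coordinates and class scores into the input vector and emit a final box and class alongside the score.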

In an embodiment, the controller 110 may detect objects by inputting a plurality of thermal images and captured images to the first to third deep neural networks. Specifically, the weights of each deep neural network may be trained, where each weight takes a value between 0 and 1. By applying the pre-trained first to third deep neural networks, the controller 110 may output, for each pixel or each bounding box (unit region) in the thermal image and the captured image, the probability of belonging to each class (class probability), where the classes distinguish which object is present or whether no object is present.
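A common way to turn a network's raw per-class scores for one bounding box into class probabilities is the softmax function, sketched below. The patent does not say which function its networks use, and the class names here (including a "background" class for "no object") are hypothetical.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores to probabilities that are positive and sum to 1."""
    peak = max(scores)                          # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(raw_scores: Dict[str, float]) -> Dict[str, float]:
    """Attach class names to the softmax output for one bounding box."""
    names = list(raw_scores)
    probs = softmax([raw_scores[n] for n in names])
    return dict(zip(names, probs))


# Raw scores for one bounding box (illustrative values only).
box_probs = class_probabilities({"person": 2.0, "animal": 0.0, "background": 0.0})
best_class = max(box_probs, key=box_probs.get)
```

The class with the highest probability is then taken as the box's predicted class, and that probability can serve as a per-box confidence value.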

In an embodiment, the object analysis apparatus 100 may be communicatively connected to a manager device 20 through a network. The network may include wireless networks and wired networks. For example, the network may be a short-range communication network (e.g., Bluetooth, WiFi direct, or IrDA (infrared data association)) or a long-range communication network (e.g., a cellular network, the Internet, or a computer network such as a LAN or WAN).

In an embodiment, the manager device 20 may be a device used by a manager at an organization capable of rescuing a person in distress. The manager device 20 may include, for example, at least one of a smartphone, a server, a tablet PC (tablet personal computer), a desktop PC, a laptop PC, a netbook computer, a workstation, a PDA (personal digital assistant), a PMP (portable multimedia player), an MP3 player, a mobile medical device, or a camera.

FIG. 2 is a flowchart illustrating a night lifesaving management method according to an embodiment of the present invention. FIG. 3 is a flowchart illustrating a method for detecting an object according to an embodiment of the present invention. FIG. 4 is a flowchart illustrating a first deep neural network and a second deep neural network according to an embodiment of the present invention. FIG. 5 is a flowchart illustrating a third deep neural network according to an embodiment of the present invention. The operations of FIG. 2 may be performed by the object analysis apparatus 100 and the image acquisition unit 200 of FIG. 1.

Referring to FIGS. 2 to 5, in an embodiment, the image acquisition unit 200 may, in operation 21, acquire a thermal image and a captured image using the thermal image sensor 210 and the image sensor 220. For example, the thermal image and the captured image may be acquired at night.

In an embodiment, the image acquisition unit 200 may, in operation 22, transmit the thermal image and the captured image to the object analysis device 100. For example, the image acquisition unit 200 may perform the transmission each time an image is acquired or at a predetermined interval.

In an embodiment, the control unit 110 of the object analysis device 100 may, in operation 23, pre-process the received thermal image and captured image as shown in FIG. 3. For example, the object analysis device 100 may time-synchronize the captured image and the thermal image. Here, time synchronization may be an operation of matching images captured at the same point in time. Because the captured image and the thermal image are acquired by different sensors, time synchronization may be required. In addition, the object analysis device 100 may scale the time-synchronized captured image and thermal image so that their resolutions are the same. Beyond resolution, additional scaling may be performed so that other properties, such as size, are also the same.
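The pre-processing of operation 23 can be sketched as follows. This is only an illustration under stated assumptions: the function names, the nearest-timestamp pairing rule, the 0.05-second tolerance `max_gap`, and the list-of-rows image layout are hypothetical and are not specified in this document.

```python
from bisect import bisect_left

def pair_by_timestamp(thermal, visible, max_gap=0.05):
    """Pair each thermal frame with the visible frame closest in time.

    thermal, visible: lists of (timestamp_seconds, frame), sorted by timestamp.
    max_gap: hypothetical tolerance; pairs further apart than this are dropped.
    """
    times = [t for t, _ in visible]
    pairs = []
    for t, frame in thermal:
        k = bisect_left(times, t)
        # Candidates are the neighbours on either side of the insertion point.
        candidates = [c for c in (k - 1, k) if 0 <= c < len(times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda c: abs(times[c] - t))
        if abs(times[best] - t) <= max_gap:
            pairs.append((frame, visible[best][1]))
    return pairs

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2-D image stored as a list of rows."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]
```

In this sketch, a lower-resolution thermal frame would be resized to the resolution of its paired captured frame (or both to a common network input size) before detection.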

In an embodiment, the control unit 110 of the object analysis device 100 may, in operation 24, input the pre-processed thermal image and the pre-processed captured image into a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, as shown in FIG. 3, and obtain a first object detection result and a second object detection result through the first and second deep neural networks.

In an embodiment, the first deep neural network may be a depth map (dense map) based YOLO (You Only Look Once), and the second deep neural network may be a color map based YOLO. That is, the first and second deep neural networks may share the same basic structure, but their weights may differ because they are trained on different data: thermal images and captured images, respectively. Here, the depth map refers to the network that uses the thermal image, and the color map refers to the network that uses the captured image.

In an embodiment, YOLO (You Only Look Once) treats the bounding boxes and class probabilities in an image as a single regression problem, so that the type and location of an object can be predicted by looking at the image once. As shown in FIG. 4, class probabilities for multiple bounding boxes are computed through a single convolutional network. For example, YOLO may divide the input image into an S x S grid, and each grid cell may have B bounding boxes and a detection score (confidence score) for each bounding box. If no object exists in a cell, the detection score is 0. Each grid cell may have C conditional class probabilities, and each bounding box may consist of x, y, w, h, and confidence. Here, (x, y) is the center point of the bounding box, given relative to the bounds of the grid cell, and (w, h) is given relative to the width and height of the entire image. (Example 1: if x is at the far left of the grid cell, x = 0; if y is in the middle of the grid cell, y = 0.5. Example 2: if the width of the bounding box is half the image width, w = 0.5.) During training with YOLO, the conditional class probability is multiplied by the confidence score of the bounding box to obtain a confidence score for a specific class. This detection score carries information on whether the bounding box represents the class and on how well the bounding box fits the object.
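The relative coordinates and the class-specific score described above can be illustrated with a short sketch. The function names and the (row, col) cell indexing are hypothetical; only the conventions for (x, y), (w, h), and the score product come from the description.

```python
def decode_box(row, col, x, y, w, h, S, img_w, img_h):
    """Convert YOLO-style relative box values to pixel coordinates.

    (x, y) are offsets within grid cell (row, col), each in [0, 1];
    (w, h) are fractions of the whole image width and height.
    Returns (x_min, y_min, x_max, y_max) in pixels.
    """
    cell_w, cell_h = img_w / S, img_h / S
    cx = (col + x) * cell_w          # box centre, x in pixels
    cy = (row + y) * cell_h          # box centre, y in pixels
    bw, bh = w * img_w, h * img_h    # box size in pixels
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)

def class_scores(box_confidence, class_probs):
    """Class-specific score = conditional class probability x box confidence."""
    return [p * box_confidence for p in class_probs]
```

For instance, with S = 7 and an assumed 448 x 448 input, a box in cell (3, 3) with x = 0, y = 0.5, w = 0.5, h = 0.5 has its center at pixel (192, 224) and spans 224 pixels in each direction.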

For example, the structure of YOLO shown in FIG. 4 may be based on the GoogLeNet model for image classification (24 convolutional layers and 2 fully connected layers). Here, 7x7 means 49 grid cells. Each grid cell has B bounding boxes (here, B = 2), and the first five values of a cell's output hold the values for the first bounding box of that grid cell. For example, the initial convolutional layers of the network extract features from the image, and the fully connected layers may predict class probabilities and coordinates. The final output of the network may be a 7x7x30 tensor. The detection network may have 24 convolutional layers and 2 fully connected layers, with 1x1 convolutional layers used alternately so that the feature space is reduced from layer to layer.
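The per-cell layout of the 7x7x30 output can be made concrete with a small sketch. The class count C = 20 is not stated above; it is inferred here from 30 = B x 5 + C with B = 2. Placing the class probabilities after the box values is also an assumption, as the description only fixes the first five values.

```python
S, B = 7, 2          # grid size and boxes per cell, as given in the text
DEPTH = 30           # per-cell vector length, from the 7x7x30 output
C = DEPTH - B * 5    # class count implied by the arithmetic (20)

def split_cell(vector):
    """Split one grid cell's output vector into B boxes and C class probabilities."""
    assert len(vector) == B * 5 + C
    boxes = [
        dict(zip(("x", "y", "w", "h", "confidence"), vector[b * 5:(b + 1) * 5]))
        for b in range(B)
    ]
    # Assumed layout: class probabilities follow the B box entries.
    return boxes, vector[B * 5:]
```

The full network output would then hold S x S = 49 such vectors, one per grid cell.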

In an embodiment, the multi-part loss function of Equation 1 below may be applied when training the first deep neural network and the second deep neural network, which are configured as YOLO.

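[Equation 1]

$$
\begin{aligned}
&\lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] && (1) \\
&+ \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] && (2) \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left( C_i - \hat{C}_i \right)^2 && (3) \\
&+ \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} \left( C_i - \hat{C}_i \right)^2 && (4) \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj} \sum_{c \in classes} \left( p_i(c) - \hat{p}_i(c) \right)^2 && (5)
\end{aligned}
$$

Here, among the bounding boxes of a grid cell, the bounding box with the highest IOU with the ground-truth box may be set as the predictor,

$\mathbb{1}_{ij}^{obj}$ may indicate the predictor bounding box j of grid cell i in which an object exists,

$\mathbb{1}_{ij}^{noobj}$ may indicate bounding box j of grid cell i in which no object exists, and

$\mathbb{1}_{i}^{obj}$ may indicate grid cell i in which an object exists.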

In addition, (1) computes the loss of x and y for the predictor bounding box j of grid cell i in which an object exists, and (2) computes the loss of w and h for the predictor bounding box j of grid cell i in which an object exists. That is, to account for small deviations in large boxes, the square root is taken before computing the sum-squared error (for the same error, the IOU of a larger box is affected relatively less). (3) computes the loss of the confidence score for the predictor bounding box j of grid cell i in which an object exists (C_i = 1). (4) computes the loss of the confidence score for bounding box j of grid cell i in which no object exists (C_i = 0). (5) computes the loss of the conditional class probability for grid cell i in which an object exists (correct class c: p_i(c) = 1; otherwise: p_i(c) = 0).

Here, λ_coord is a balancing parameter that balances the loss on the coordinates (x, y, w, h) against the other losses, and λ_noobj is a balancing parameter that balances boxes with an object against boxes without one (in general, an image contains far more cells without an object than cells with one).
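The five loss terms described above can be sketched for a single image as follows. The data layout (one dict per grid cell) and the function name are hypothetical. The default values 5.0 and 0.5 for the two balancing parameters are those of the original YOLO paper and are not stated in this document. The confidence targets follow the description: 1 for the predictor box of an object cell and 0 for boxes of empty cells.

```python
from math import sqrt

def yolo_loss(cells, lambda_coord=5.0, lambda_noobj=0.5):
    """Sum-squared multi-part loss over the grid cells of one image.

    Each cell is a dict with:
      "boxes":   list of B predicted (x, y, w, h, confidence) tuples
      "classes": list of C predicted conditional class probabilities
      "truth":   None if the cell has no object, else a dict with
                 "box" (x, y, w, h), "classes" (one-hot list) and
                 "predictor" (index j of the box with the highest IOU)
    """
    loss = 0.0
    for cell in cells:
        truth = cell["truth"]
        if truth is None:
            # Term (4): confidence target is 0 for every box in an empty cell.
            loss += lambda_noobj * sum(b[4] ** 2 for b in cell["boxes"])
            continue
        x, y, w, h, conf = cell["boxes"][truth["predictor"]]
        tx, ty, tw, th = truth["box"]
        # Term (1): centre-point error.
        loss += lambda_coord * ((x - tx) ** 2 + (y - ty) ** 2)
        # Term (2): size error on square roots, so large boxes weigh less.
        loss += lambda_coord * ((sqrt(w) - sqrt(tw)) ** 2 + (sqrt(h) - sqrt(th)) ** 2)
        # Term (3): confidence target is 1 for the predictor box.
        loss += (conf - 1.0) ** 2
        # Term (5): class-probability error against the one-hot target.
        loss += sum((p - t) ** 2 for p, t in zip(cell["classes"], truth["classes"]))
    return loss
```

A perfect prediction for an object cell contributes zero loss, while an empty cell whose box reports a confidence of 0.5 contributes 0.5 x 0.25 = 0.125 under the assumed default λ_noobj.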

For example, the first object detection result obtained through the first deep neural network may include the location, class, and detection score (confidence score) of an object detected in the thermal image, and the second object detection result obtained through the second deep neural network may include the location, class, and detection score of an object detected in the captured image.

In an embodiment, the control unit 110 of the object analysis device 100 may, in operation 25, input the first object detection result and the second object detection result into a pre-trained third deep neural network and obtain a final object detection result through the third deep neural network. For example, the third deep neural network may be a multi-layer perceptron (MLP).

For example, as shown in FIG. 5, the multi-layer perceptron may include an input layer, a hidden layer, and an output layer. Each layer includes a plurality of nodes, and each layer is connected to the next layer. Nodes in adjacent layers may be connected to each other with weights.

In an embodiment, each node may operate based on a pre-trained model, and the output value corresponding to an input value may be determined by the trained model. The output value of any node, for example the output value corresponding to the first object detection result, may be input to the nodes of the next layer connected to that node. The input value of each node may be the output value of a node in the previous layer with a weight applied. The weight may represent the connection strength between nodes. The multi-layer perceptron may recognize an object using the trained layers.
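The layer-to-layer propagation described above can be sketched as a plain forward pass. This is only an illustration: the ReLU activation, the bias terms, and the idea of feeding the two detection results as one concatenated feature list are assumptions, since the description does not specify them.

```python
def relu(values):
    """Element-wise ReLU (assumed activation; not specified in the text)."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer: out[k] = sum_i inputs[i] * weights[k][i] + biases[k]."""
    return [sum(w * a for w, a in zip(row, inputs)) + b for row, b in zip(weights, biases)]

def mlp_forward(features, layers):
    """Run features through (weights, biases) pairs, with ReLU between layers."""
    out = features
    for idx, (weights, biases) in enumerate(layers):
        out = dense(out, weights, biases)
        if idx < len(layers) - 1:   # no activation after the output layer
            out = relu(out)
    return out
```

In this sketch, `features` would hold the position, class, and detection score values from the first and second object detection results, and the trained `layers` would map them to the final position, class, and detection score.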

For example, the final object detection result may include the final location, final class, and final detection score of an object detected in both the thermal image and the captured image.

In an embodiment, the object analysis device 100 may, in operation 26, determine whether the detected object is a person in distress.

For example, to determine whether the object is a person in distress, the object analysis device 100 may check the class of the object, the final detection score of the object, and the location of the object at the time of capture. That is, the object analysis device 100 may check the class of the object to determine whether it is a person; if it is a person, check whether the final detection score is a highly reliable value; and if the value is highly reliable, check whether the location of the object at the time of capture is a danger zone. Here, a danger zone may be a maritime area where people are generally unlikely to be present at night, or an area with a high probability of accidents involving casualties. If the location is found to be a danger zone, the object analysis device 100 may determine that the detected object is a person in distress.
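The three checks of operation 26 can be sketched as a short decision function. The function name, the "person" label, the 0.5 score threshold, and the representation of danger zones as latitude/longitude rectangles are all hypothetical choices made for illustration.

```python
def is_person_in_distress(detection, danger_zones, min_score=0.5):
    """Apply the three checks in order: class, detection score, location.

    detection: dict with "cls" (str), "score" (float) and "position" (lat, lon).
    danger_zones: list of (lat_min, lat_max, lon_min, lon_max) rectangles.
    min_score: assumed reliability threshold; the text gives no value.
    """
    if detection["cls"] != "person":
        return False
    if detection["score"] < min_score:
        return False
    lat, lon = detection["position"]
    return any(
        lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
        for lat_min, lat_max, lon_min, lon_max in danger_zones
    )
```

Only a detection that passes all three checks would trigger the transmission to the manager device 20 in operation 27.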

In an embodiment, when the object analysis device 100 determines, based on the final object detection result, that the detected object is a person in distress, it may, in operation 27, transmit location information and image information for the point where the person in distress was detected to the manager device 20. For example, the object analysis device 100 may transmit, to the manager device 20, the location information acquired through the GPS module 130 and image information including the thermal image and the captured image.

On the other hand, when it is determined, based on the final object detection result, that the detected object is not a person in distress, operations 21 to 26 may be performed again.

As described above, the system 10 of the present invention can perform night lifesaving quickly and accurately by using deep learning techniques.

Meanwhile, although not shown in the drawings, the object analysis device 100 may check whether a detected object is a core object; if it is a core object, check whether its detection score is less than or equal to a reference value; and if the detection score of the core object is less than or equal to the reference value, move the image acquisition unit 200 to the location where the image was captured, capture the image again, and analyze it again. Here, the core object may be a person. When the detection score is at or below the reference value and is therefore not highly reliable, the object analysis device 100 may verify once more whether the object is a person in distress instead of immediately transmitting the location information and image information to the manager device 20. That is, when an image captured at night contains an object suspected of being a person, the object needs to be checked again even if its detection score is low, so the image may be captured once more and the object detection operation repeated.

FIG. 6 is a flowchart illustrating a night lifesaving management method according to another embodiment of the present invention. The operations of FIG. 6 may be performed by the object analysis device 100 and the image acquisition unit 200 of FIG. 1. For brevity, descriptions of operations 61, 62, 63, 66, and 67, which are the same as in FIG. 2, are omitted.

Referring to FIG. 6, in an embodiment, the object analysis device 100 may, in operation 64, generate a prediction map (probability map) of an object by fusing the first object detection result and the second object detection result. For example, unlike what is shown in FIGS. 2 and 3, the object analysis device 100 may fuse the first object detection result and the second object detection result. For example, when the probability of belonging to each class is output for each bounding box in the first object detection result and the second object detection result, the control unit 110 may perform weighted fusion of the output probability values. That is, the two results may be fused by summing, for each bounding box or each pixel, only those values whose class probability is 0.5 or more, and computing their average. Alternatively, the two results may be fused by max probability voting, which selects the value with the highest probability for each pixel or each bounding box.
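The two fusion options of operation 64 can be sketched for a single pair of matched bounding boxes. The function names are hypothetical. The sketch also assumes that the boxes from the two networks have already been matched and that a class with no probability at or above the threshold fuses to 0.0; neither point is specified in the description.

```python
def fuse_average(thermal_probs, visible_probs, threshold=0.5):
    """Per class, average only the probabilities that are >= threshold.

    Returns 0.0 for a class when neither network reaches the threshold
    (an assumed convention).
    """
    fused = []
    for pair in zip(thermal_probs, visible_probs):
        kept = [p for p in pair if p >= threshold]
        fused.append(sum(kept) / len(kept) if kept else 0.0)
    return fused

def fuse_max(thermal_probs, visible_probs):
    """Max probability voting: per class, keep the higher of the two probabilities."""
    return [max(pair) for pair in zip(thermal_probs, visible_probs)]
```

The fused per-class values for each bounding box (or pixel) would then form the prediction map that is passed to the third deep neural network in operation 65.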

For example, the prediction map of the object may include the preliminary location, preliminary class, and preliminary detection score of an object detected in both the thermal image and the captured image.

In an embodiment, the object analysis device 100 may, in operation 65, input the generated prediction map of the object into the pre-trained third deep neural network and obtain a final object detection result through the third deep neural network. For example, the final object detection result may include the final location, final class, and final detection score of an object detected in both the thermal image and the captured image.

According to various embodiments of the present invention, an object recognition-based night lifesaving management system using a thermal image sensor and an image sensor includes: an image acquisition unit that includes the thermal image sensor and the image sensor and acquires a thermal image and a captured image using the thermal image sensor and the image sensor; and an object analysis device that detects an object using the thermal image and the captured image. The image acquisition unit transmits the thermal image and the captured image to the object analysis device. The object analysis device may pre-process the received thermal image and captured image; input the pre-processed thermal image and the pre-processed captured image into a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, to obtain a first object detection result and a second object detection result through the first and second deep neural networks; and input the first object detection result and the second object detection result into a pre-trained third deep neural network to obtain a final object detection result through the third deep neural network.

According to various embodiments, when it is determined, based on the final object detection result, that the detected object is a person in distress, the object analysis device may transmit location information and image information for the point where the person in distress was detected to the manager device.

According to various embodiments, the first deep neural network may be a depth map (dense map) based YOLO (You Only Look Once), the second deep neural network may be a color map based YOLO, and the third deep neural network may be a multi-layer perceptron (MLP).

According to various embodiments, the first object detection result may include the location, class, and detection score (confidence score) of an object detected in the thermal image; the second object detection result may include the location, class, and detection score of an object detected in the captured image; and the final object detection result may include the final location, final class, and final detection score of an object detected in both the thermal image and the captured image.

According to various embodiments, the object analysis device may time-synchronize the captured image and the thermal image, and scale the time-synchronized captured image and thermal image so that their resolutions are the same.

According to various embodiments, the object analysis device may include a GPS sensor that acquires location information of the image acquisition unit and a communication module that connects to the manager device.

According to various embodiments of the present invention, an object recognition-based night lifesaving management system using a thermal image sensor and an image sensor includes: an image acquisition unit that includes the thermal image sensor and the image sensor and acquires a thermal image and a captured image using the thermal image sensor and the image sensor; and an object analysis device that detects an object using the thermal image and the captured image. The image acquisition unit transmits the thermal image and the captured image to the object analysis device. The object analysis device may pre-process the received thermal image and captured image; input the pre-processed thermal image and the pre-processed captured image into a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, to obtain a first object detection result and a second object detection result through the first and second deep neural networks; generate a prediction map (probability map) of the object by fusing the first object detection result and the second object detection result; and input the generated prediction map into a pre-trained third deep neural network to obtain a final object detection result through the third deep neural network.

According to various embodiments, the prediction map of the object may include the preliminary location, preliminary class, and preliminary detection score of an object detected in both the thermal image and the captured image, and the final object detection result may include the final location, final class, and final detection score of an object detected in both the thermal image and the captured image.

According to various embodiments of the present invention, an object recognition-based night lifesaving management method using a thermal image sensor and an image sensor may include: acquiring a thermal image and a captured image; pre-processing the thermal image and the captured image; inputting the pre-processed thermal image and the pre-processed captured image into a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, to obtain a first object detection result and a second object detection result through the first and second deep neural networks; and inputting the first object detection result and the second object detection result into a pre-trained third deep neural network to obtain a final object detection result through the third deep neural network.

Therefore, the spirit of the present invention should not be limited to the embodiments described above, and not only the claims set forth below but also everything modified equally or equivalently to those claims falls within the scope of the spirit of the present invention.

10: Night lifesaving management system
20: Manager device
100: Object analysis device 200: Image acquisition unit
110: Control unit 120: Database
130: GPS module 140: Communication module
210: Thermal image sensor 220: Image sensor
230: Communication module

Claims

An object analysis device for night lifesaving, comprising:
a communication module that receives images from an image acquisition unit that acquires a thermal image and a captured image using a thermal image sensor and an image sensor; and
a control unit that detects an object using the received thermal image and captured image,
wherein the control unit:
pre-processes the received thermal image and captured image;
inputs the pre-processed thermal image and the pre-processed captured image into a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, and obtains a first object detection result and a second object detection result through the first deep neural network and the second deep neural network; and
inputs the first object detection result and the second object detection result into a pre-trained third deep neural network and obtains a final object detection result through the third deep neural network,
wherein the first deep neural network is a depth map-based YOLO (You Only Look Once), the second deep neural network is a color map-based YOLO, and the third deep neural network is a multi-layer perceptron (MLP), and
wherein the multi-part loss function of Equation 1 below is applied when training the first deep neural network and the second deep neural network configured as the YOLO.
Equation 1

$$
\begin{aligned}
&\lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] && (1) \\
&+ \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] && (2) \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left( C_i - \hat{C}_i \right)^2 && (3) \\
&+ \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} \left( C_i - \hat{C}_i \right)^2 && (4) \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj} \sum_{c \in classes} \left( p_i(c) - \hat{p}_i(c) \right)^2 && (5)
\end{aligned}
$$

where i is the grid cell, j is the bounding box, S x S is the number of grid cells into which YOLO divides the input image, B is the number of bounding boxes of each grid cell, C is the number of conditional class probabilities of each grid cell, (x, y) is the center point of a bounding box, given relative to the bounds of the grid cell, and (w, h) is given relative to the width and height of the entire image,

$\mathbb{1}_{ij}^{obj}$ is the predictor bounding box j of grid cell i in which an object exists,

$\mathbb{1}_{ij}^{noobj}$ is bounding box j of grid cell i in which no object exists,

$\mathbb{1}_{i}^{obj}$ is grid cell i in which an object exists, (1) computes the loss of x and y for the predictor bounding box j of grid cell i in which an object exists, (2) computes the loss of w and h for the predictor bounding box j of grid cell i in which an object exists, (3) computes the loss of the confidence score for the predictor bounding box j of grid cell i in which an object exists, (4) computes the loss of the confidence score for bounding box j of grid cell i in which no object exists, (5) computes the loss of the conditional class probability for grid cell i in which an object exists, λ_coord is a balancing parameter that balances the loss on the coordinates (x, y, w, h) against the other losses, and λ_noobj is a balancing parameter that balances boxes with an object against boxes without one.

The object analysis apparatus of claim 1, wherein the control unit, when the detected object is determined to be a person in distress based on the final object detection result, transmits location information and image information of the point where the person in distress was detected to the manager device.

(Deleted)

The object analysis apparatus of claim 1, wherein the first object detection result includes a location of an object detected in the thermal image, a class of the object, and a confidence score of the object,
The second object detection result includes a location of an object detected in the captured image, a class of the object, and a detection score of the object,
and the final object detection result includes a final position of an object detected together in the thermal image and the captured image, a final class of the object, and a final detection score of the object.
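
The relationship between the three detection results can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Detection` record, the `encode` layout, `NUM_CLASSES`, and the randomly initialised `TinyMLP` (standing in for the pre-trained third network) are all hypothetical names and choices introduced here.

```python
from dataclasses import dataclass
import numpy as np

NUM_CLASSES = 2  # hypothetical class count, chosen only for the example


@dataclass
class Detection:
    box: tuple      # (x, y, w, h), assumed normalised to [0, 1]
    class_id: int   # index of the detected class
    score: float    # confidence / detection score


def encode(det):
    """Flatten one detection into [x, y, w, h, one-hot class, score]."""
    one_hot = np.zeros(NUM_CLASSES)
    one_hot[det.class_id] = 1.0
    return np.concatenate([det.box, one_hot, [det.score]])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class TinyMLP:
    """One hidden layer with random weights; a stand-in for a trained MLP."""

    def __init__(self, in_dim, hidden, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, out_dim))
        self.b2 = np.zeros(out_dim)

    def __call__(self, x):
        hidden = np.maximum(0.0, x @ self.w1 + self.b1)  # ReLU
        return hidden @ self.w2 + self.b2


def fuse(thermal_det, visible_det, mlp):
    """Map a (thermal, visible) detection pair to one final detection."""
    x = np.concatenate([encode(thermal_det), encode(visible_det)])
    out = mlp(x)
    box = tuple(float(v) for v in sigmoid(out[:4]))      # final position
    class_id = int(np.argmax(out[4:4 + NUM_CLASSES]))    # final class
    score = float(sigmoid(out[-1]))                      # final score
    return Detection(box, class_id, score)


feature_dim = 4 + NUM_CLASSES + 1
mlp = TinyMLP(2 * feature_dim, 16, feature_dim)
final = fuse(Detection((0.5, 0.5, 0.2, 0.4), 0, 0.9),
             Detection((0.52, 0.49, 0.22, 0.41), 0, 0.6), mlp)
```

With untrained weights the output values are meaningless; the point is only the data flow from two per-sensor results to one final position, class, and score.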

The object analysis apparatus of claim 1, wherein the control unit
time-synchronizes the captured image and the thermal image, and
scales the time-synchronized captured image and thermal image so that their resolutions are the same.
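
The two pre-processing steps above (time synchronization, then scaling to a common resolution) can be sketched as follows. This is a minimal sketch under stated assumptions: frames carry timestamps in seconds, pairing uses the nearest timestamp within a hypothetical `max_gap` tolerance, and the thermal frame is the one rescaled using nearest-neighbour sampling. None of these choices is specified by the claims.

```python
import numpy as np


def nearest_frame_pairs(thermal_ts, visible_ts, max_gap=0.05):
    """Pair each thermal frame with the visible frame closest in time.

    Returns (thermal_index, visible_index) pairs whose timestamps differ
    by at most max_gap seconds; thermal frames with no close match are dropped.
    """
    visible_ts = np.asarray(visible_ts, dtype=float)
    pairs = []
    for ti, t in enumerate(thermal_ts):
        vi = int(np.argmin(np.abs(visible_ts - t)))
        if abs(visible_ts[vi] - t) <= max_gap:
            pairs.append((ti, vi))
    return pairs


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return image[rows[:, None], cols]


def match_resolution(thermal, visible):
    """Scale the thermal frame to the visible frame's height and width."""
    h, w = visible.shape[:2]
    return resize_nearest(thermal, h, w), visible


pairs = nearest_frame_pairs([0.00, 0.10, 0.20], [0.01, 0.055, 0.12, 0.30])
thermal_scaled, visible = match_resolution(np.arange(4).reshape(2, 2),
                                           np.zeros((4, 4, 3)))
```

In practice a library resize with interpolation would likely replace `resize_nearest`; it is written out here only to keep the example self-contained.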

The object analysis apparatus of claim 2, further comprising a GPS module for acquiring the location information of the image acquisition unit,
wherein the communication module is connected to the manager device.

An object analysis apparatus for nighttime lifesaving, comprising:
a communication module for receiving images from an image acquisition unit that acquires a thermal image and a captured image using a thermal image sensor and an image sensor; and
a control unit for detecting an object using the received thermal image and the captured image,
wherein the control unit
pre-processes the received thermal image and the captured image,
inputs the pre-processed thermal image and the pre-processed captured image to a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, to obtain a first object detection result and a second object detection result through the first deep neural network and the second deep neural network,
generates a prediction map of the object by fusing the first object detection result and the second object detection result, and
inputs the generated prediction map into a pre-trained third deep neural network to obtain a final object detection result through the third deep neural network,
wherein the first deep neural network is a depth map-based YOLO (You Only Look Once), the second deep neural network is a color map-based YOLO, and the third deep neural network is a multi-layer perceptron (MLP),
and wherein the multi-part loss function of Equation 1 below is applied when training the first deep neural network and the second deep neural network, each of which is a YOLO network.
Equation 1

where i is the index of the grid cell and j is the index of the bounding box;
$\mathbb{1}_{ij}^{obj}$ denotes the predictor bounding box j of grid cell i in which an object exists, $\mathbb{1}_{ij}^{noobj}$ denotes the bounding box j of grid cell i in which no object exists, and $\mathbb{1}_{i}^{obj}$ denotes the grid cell i in which an object exists;
(1) is the term that calculates the loss of x and y for the predictor bounding box j of grid cell i in which an object exists;
(2) is the term that calculates the loss of w and h for the predictor bounding box j of grid cell i in which an object exists;
(3) is the term that calculates the loss of the confidence score for the predictor bounding box j of grid cell i in which an object exists;
(4) is the term that calculates the loss of the confidence score for bounding box j of grid cell i in which no object exists;
(5) is the term that calculates the loss of the conditional class probability for grid cell i in which an object exists;
$\lambda_{coord}$ is a balancing parameter that weights the loss on the coordinates (x, y, w, h) against the other losses, and $\lambda_{noobj}$ is a balancing parameter that weights boxes containing no object against boxes containing an object.
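
The five loss terms described above can be sketched in NumPy as follows. This is a simplified illustration under several assumptions not stated in the claims: each grid cell row is laid out as B boxes of `[x, y, w, h, confidence]` followed by the class probabilities, the target's confidence entry (1 or 0) marks which box is responsible for an object, and the default weights 5.0 and 0.5 are the values used in the original YOLO paper. The function name `yolo_loss` is hypothetical.

```python
import numpy as np


def yolo_loss(pred, target, B=2, lambda_coord=5.0, lambda_noobj=0.5):
    """Multi-part loss over arrays of shape (S*S, B*5 + C).

    Simplification: the target confidence doubles as the indicator
    1_ij^obj, so box responsibility is decided by the caller.
    """
    pb = pred[:, :B * 5].reshape(-1, B, 5)
    tb = target[:, :B * 5].reshape(-1, B, 5)

    obj_ij = tb[..., 4]           # 1_ij^obj: box j of cell i holds an object
    noobj_ij = 1.0 - obj_ij       # 1_ij^noobj
    obj_i = obj_ij.max(axis=1)    # 1_i^obj: cell i holds an object

    # (1) center-point loss for responsible boxes
    xy = np.sum(obj_ij * np.sum((pb[..., :2] - tb[..., :2]) ** 2, axis=-1))
    # (2) width/height loss on square roots (clipped to avoid sqrt of negatives)
    wh_diff = (np.sqrt(np.clip(pb[..., 2:4], 0.0, None))
               - np.sqrt(np.clip(tb[..., 2:4], 0.0, None)))
    wh = np.sum(obj_ij * np.sum(wh_diff ** 2, axis=-1))
    # (3) confidence loss where an object exists
    conf_obj = np.sum(obj_ij * (pb[..., 4] - tb[..., 4]) ** 2)
    # (4) confidence loss where no object exists
    conf_noobj = np.sum(noobj_ij * (pb[..., 4] - tb[..., 4]) ** 2)
    # (5) class-probability loss for cells that contain an object
    cls_diff = pred[:, B * 5:] - target[:, B * 5:]
    cls = np.sum(obj_i * np.sum(cls_diff ** 2, axis=-1))

    return (lambda_coord * (xy + wh) + conf_obj
            + lambda_noobj * conf_noobj + cls)


# Tiny example: a 2x2 grid (4 cells), B=1 box per cell, C=2 classes.
target = np.zeros((4, 7))
target[0] = [0.5, 0.5, 0.25, 0.25, 1.0, 1.0, 0.0]
shifted = target.copy()
shifted[0, 0] = 0.6                      # move x of the responsible box by 0.1
loss_xy = yolo_loss(shifted, target, B=1)
```

Here `loss_xy` is approximately 0.05, that is, `lambda_coord` times the squared x error of 0.01, with every other term equal to zero.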

(Deleted)

The object analysis apparatus of claim 7, wherein the prediction map of the object includes a prior location of an object detected together in the thermal image and the captured image, a prior class of the object, and a prior detection score of the object,
and the final object detection result includes a final position of an object detected together in the thermal image and the captured image, a final class of the object, and a final detection score of the object.
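
One way to build such a prediction map is sketched below. The fusion rule is an assumption made for illustration only, since the claims do not state how the two results are fused: detections from the two images are matched by intersection over union (IoU) with a hypothetical threshold `iou_thr`, matched boxes and scores are averaged, and the class of the higher-scoring detection is kept. The dictionary keys and function names are likewise hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def build_prediction_map(thermal_dets, visible_dets, iou_thr=0.5):
    """Keep only objects detected in both images and merge each pair.

    Each detection is a dict with keys 'box', 'cls', and 'score'.
    Returns a list of entries holding the prior location, class, and score.
    """
    entries = []
    for t in thermal_dets:
        # Best-overlapping visible detection, or None if there are none.
        best = max(visible_dets, key=lambda v: iou(t["box"], v["box"]),
                   default=None)
        if best is None or iou(t["box"], best["box"]) < iou_thr:
            continue  # not detected together in both images
        stronger = t if t["score"] >= best["score"] else best
        entries.append({
            "box": tuple((p + q) / 2 for p, q in zip(t["box"], best["box"])),
            "cls": stronger["cls"],
            "score": (t["score"] + best["score"]) / 2,
        })
    return entries


thermal = [{"box": (0, 0, 10, 10), "cls": "person", "score": 0.9},
           {"box": (50, 50, 60, 60), "cls": "person", "score": 0.8}]
visible = [{"box": (1, 1, 11, 11), "cls": "person", "score": 0.7}]
prediction_map = build_prediction_map(thermal, visible)
```

In this example only the first thermal detection overlaps a visible detection, so the map holds a single entry; the third network would then refine such entries into the final position, class, and score.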

An object analysis method for nighttime lifesaving, comprising:
acquiring a thermal image and a captured image;
pre-processing the thermal image and the captured image;
inputting the pre-processed thermal image and the pre-processed captured image to a pre-trained first deep neural network and a pre-trained second deep neural network, respectively, and obtaining a first object detection result and a second object detection result through the first deep neural network and the second deep neural network; and
inputting the first object detection result and the second object detection result into a pre-trained third deep neural network, and obtaining a final object detection result through the third deep neural network,
wherein the first deep neural network is a depth map-based YOLO (You Only Look Once), the second deep neural network is a color map-based YOLO, and the third deep neural network is a multi-layer perceptron (MLP),
and wherein the multi-part loss function of Equation 1 below is applied when training the first deep neural network and the second deep neural network, each of which is a YOLO network.
Equation 1

where i is the index of the grid cell and j is the index of the bounding box,
$\mathbb{1}_{ij}^{obj}$ is the predictor bounding box j of grid cell i where the object exists,
$\mathbb{1}_{ij}^{noobj}$ is the bounding box j of grid cell i where no object exists,