KR102199627B1

KR102199627B1 - Apparatus for recognizing approaching vessel considering distance objects based on Deep Neural Networks, method therefor, and computer recordable medium storing program to perform the method

Info

Publication number: KR102199627B1
Application number: KR1020200059221A
Authority: KR
Inventors: 임태호; 송현학; 이효찬
Original assignee: 호서대학교 산학협력단
Priority date: 2020-05-18
Filing date: 2020-05-18
Publication date: 2021-01-07

Abstract

The present invention provides an apparatus for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network, comprising: a camera unit that captures a surveillance image; a preprocessing unit that detects an object image by specifying, through feature point detection, an area box indicating the area occupied by an object included in the captured surveillance image; a deep neural network that outputs, as a probability, whether the object in the object image is a vessel; and a control unit that determines whether the object is a vessel in accordance with the probability, calculates, if the object is determined to be a vessel, a degree of risk indicating the probability of a collision with the vessel, and warns of the risk of collision if the calculated degree of risk is greater than or equal to a threshold value.

Description

Apparatus for recognizing approaching vessel considering distance objects based on Deep Neural Networks, method therefor, and computer recordable medium storing program to perform the method

The present invention relates to approaching-vessel recognition technology and, more particularly, to an apparatus for recognizing an approaching vessel in consideration of the distance of a marine object based on deep neural networks (DNN), a method therefor, and a computer-readable recording medium on which a program for performing the method is recorded.

To this day, ship accidents such as fires, flooding, and capsizing occur frequently at sea. According to a survey by the Korea Coast Guard, maritime distress accidents occurred on 3,434 vessels in 2018; 49 of those vessels sank, and 89 people were killed or went missing. Such accidents are increasing every year, and their causes include the person in charge of the vessel falling asleep or failing to keep a proper lookout. One example is the collision of a Russian cargo ship with the Gwangan Bridge in Busan in February of last year.

Much research is under way on ways to reduce ship accidents caused by the carelessness of the person in charge of a vessel at sea, where a minor incident is likely to escalate into a major accident. Among these, research on recognizing vessels in images using deep neural networks can help prevent accidents. However, when the image is enlarged so that distant vessels can also be recognized, the computational load of the deep neural network model becomes so large that real-time processing is impossible.

Korean Patent Application Publication No. 2019-0024400, published March 8, 2019 (Title: Object recognition device and control method thereof)

The present invention was conceived to solve the problems described above. An object of the present invention is to provide an apparatus that, in order to prevent accidents at sea caused by the carelessness of the person in charge of a vessel, uses image processing to detect object areas in an image that are expected to be vessels so that real-time processing is possible, inputs the detected object images into a deep neural network for recognition, and analyzes the recognized object areas to warn the person in charge of the vessel of danger, as well as a method therefor and a computer-readable recording medium on which a program for performing the method is recorded.

An apparatus for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network, according to a preferred embodiment of the present invention for achieving the above object, includes: a camera unit for capturing a surveillance image; a preprocessing unit that detects an object image by specifying, through feature point detection, an area box indicating the area occupied by an object included in the captured surveillance image; a deep neural network that outputs, as a probability, whether the object in the object image is a vessel; and a control unit that determines whether the object is a vessel according to the probability, calculates, if the object is determined to be a vessel, a risk indicating the probability that a collision with the vessel will occur, and warns of the risk of collision if the calculated risk is greater than or equal to a threshold.

When a feature point is detected in the surveillance image, the preprocessing unit detects a plurality of corner points in the surveillance image using the Harris corner algorithm, finds a region where the detected corner points are densely clustered to specify an area box indicating the area occupied by the object, and detects the object image through the specified area box.
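
The step of turning detected corner points into an area box can be illustrated with a minimal sketch. It assumes the corner points have already been produced by a Harris detector and are given as a list of (x, y) pixel coordinates. The grid-based density search, the function name `dense_region_box`, and the parameters `cell` and `min_points` are hypothetical choices for illustration; the specification does not state how the dense region is found.

```python
from collections import Counter

def dense_region_box(corners, cell=16, min_points=3):
    """Return (x_min, y_min, x_max, y_max) around the densest cluster of corners.

    corners: list of (x, y) integer pixel coordinates, e.g. from a Harris detector.
    cell, min_points: hypothetical tuning parameters, not taken from the patent.
    """
    # Count how many corner points fall into each grid cell.
    counts = Counter((x // cell, y // cell) for x, y in corners)
    if not counts:
        return None
    seed, seed_count = counts.most_common(1)[0]
    if seed_count < min_points:
        return None  # no cell is dense enough to suggest an object

    # Grow the region from the densest cell through 8-connected dense cells.
    dense = {c for c, n in counts.items() if n >= min_points}
    region, stack = {seed}, [seed]
    while stack:
        cx, cy = stack.pop()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nb = (cx + dx, cy + dy)
                if nb in dense and nb not in region:
                    region.add(nb)
                    stack.append(nb)

    # The area box is the bounding box of the corners inside the grown region.
    pts = [(x, y) for x, y in corners if (x // cell, y // cell) in region]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))
```

Isolated corners (for example, from wave crests) fall into sparse cells and are ignored, so only the cluster that plausibly belongs to one object contributes to the box.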

When no feature point is detected in the surveillance image, the preprocessing unit detects the horizon in the surveillance image, enlarges the portion of the image above the detected horizon to a predetermined size, detects a plurality of corner points in the enlarged portion using the Harris corner algorithm, finds a region where the detected corner points are densely clustered to specify an area box indicating the area occupied by the object, and detects the object image through the specified area box.
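
The distant-object branch (find the horizon, then enlarge only the part of the image above it) can be sketched as follows. The specification does not say how the horizon is detected or how the image is enlarged, so both choices here are assumptions: the horizon is taken as the row boundary with the largest mean vertical intensity change, and enlargement is nearest-neighbour repetition. The image is a grayscale 2-D list, and the function names and the `scale` parameter are hypothetical.

```python
def detect_horizon_row(gray):
    """Return the row index where the mean vertical intensity change is largest.

    Assumption: the sky/sea boundary is the strongest horizontal edge in the frame.
    """
    best_row, best_score = 0, -1.0
    for r in range(1, len(gray)):
        diff = sum(abs(a - b) for a, b in zip(gray[r], gray[r - 1])) / len(gray[r])
        if diff > best_score:
            best_row, best_score = r, diff
    return best_row

def enlarge_above_horizon(gray, scale=2):
    """Crop the rows above the horizon and upscale them by pixel repetition."""
    horizon = detect_horizon_row(gray)
    top = gray[:horizon]
    enlarged = []
    for row in top:
        wide = [v for v in row for _ in range(scale)]          # repeat columns
        enlarged.extend(list(wide) for _ in range(scale))      # repeat rows
    return horizon, enlarged
```

Enlarging only the cropped strip rather than the whole frame keeps the amount of data passed to the corner detector small, which is consistent with the real-time goal stated in the specification.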

The control unit calculates the area of the object image from the coordinates of the area box of the object and, if the ratio of the area of the surveillance image to the area of the object image is greater than or equal to a preset threshold, determines that the risk is greater than or equal to the threshold.
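
The area-ratio test can be written out as a short sketch. It assumes the ratio is computed as object-image area divided by surveillance-image area, so that a vessel filling more of the frame (and therefore presumably closer) yields a larger value; the specification's wording leaves the direction of the ratio open. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the default `ratio_threshold` are hypothetical.

```python
def box_area(box):
    """Area in pixels of an axis-aligned box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return max(0, x_max - x_min) * max(0, y_max - y_min)

def risk_exceeds_threshold(box, frame_width, frame_height, ratio_threshold=0.2):
    """True if the object occupies at least ratio_threshold of the frame.

    Assumption: ratio = object area / frame area; the threshold value is illustrative.
    """
    ratio = box_area(box) / (frame_width * frame_height)
    return ratio >= ratio_threshold
```

For a 640 x 480 frame, a 320 x 240 box covers one quarter of the image and would trigger the warning at the illustrative threshold of 0.2, while a 64 x 48 box covers one percent and would not.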

The deep neural network includes: an input layer into which the object image is input; one or more convolutional layers, each including at least one feature image derived by a convolution operation on the object image or on a feature image; one or more pooling layers, each deriving at least one feature image through a pooling operation on a feature image generated by the convolution operation; one or more fully connected layers, each including a plurality of operation nodes that receive a feature image or the node values of the previous layer and calculate node values through an operation using an activation function; and an output layer including a plurality of output nodes that receive the node values of the fully connected layer and calculate output values through an operation using an activation function.
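
The layer sequence described above can be traced with a minimal forward-pass sketch in plain Python. It is a simplification under stated assumptions, not the network of the specification: it uses one convolution and one pooling layer, one fully connected layer, and a single sigmoid output node standing in for the plurality of output nodes. All function names, the ReLU and sigmoid activation choices, and the weight arguments are hypothetical.

```python
import math

def conv2d(image, kernel):
    """Valid (unpadded) 2-D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def dense(values, weights, bias, activation):
    """Fully connected layer: one weight row and one bias per node."""
    return [activation(sum(v * w for v, w in zip(values, ws)) + b)
            for ws, b in zip(weights, bias)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def vessel_probability(image, kernel, fc_weights, fc_bias, out_weights, out_bias):
    """Input -> convolution -> pooling -> fully connected -> output probability."""
    fmap = max_pool(conv2d(image, kernel))
    flat = [v for row in fmap for v in row]            # flatten the feature image
    hidden = dense(flat, fc_weights, fc_bias, lambda x: max(0.0, x))
    return dense(hidden, [out_weights], [out_bias], sigmoid)[0]
```

Because the network only sees the cropped object image rather than the full surveillance frame, the input to the first convolution stays small regardless of camera resolution.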

An apparatus for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network, according to a preferred embodiment of the present invention for achieving the above object, includes: a camera unit for capturing a surveillance image including a plurality of frames; a preprocessing unit that detects a plurality of object images through area boxes indicating the area occupied by an object in each of the plurality of frames in sequence; a deep neural network that generates a plurality of navigation vectors arranged in chronological order, one for each of the plurality of object images, and performs a plurality of weighted operations on the navigation vectors to calculate a navigation prediction vector that predicts the navigation state of the object after a predetermined time; and a control unit that derives, from the navigation prediction vector, the navigation direction of the object after the predetermined time and the area box indicating the area the object will occupy in the surveillance image, and warns of the risk of collision if the derived navigation direction is a direction in which a collision is possible and the derived area box at least partially overlaps a preset warning area in the surveillance image.
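
The warning condition of this embodiment combines two tests: the predicted direction must be one in which a collision is possible, and the predicted area box must overlap the preset warning area. A minimal sketch follows, assuming boxes are `(x_min, y_min, x_max, y_max)` tuples and directions are represented by labels; the function names and the label set are hypothetical.

```python
def boxes_overlap(a, b):
    """True if two axis-aligned boxes (x_min, y_min, x_max, y_max) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def should_warn(predicted_direction, predicted_box, warning_area, collision_directions):
    """Warn only if both the direction test and the overlap test are satisfied."""
    return (predicted_direction in collision_directions
            and boxes_overlap(predicted_box, warning_area))
```

Requiring both conditions avoids warning about a vessel that is inside the warning area but heading away, or one that is heading toward the camera but is predicted to remain outside the area.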

The deep neural network includes: a direction identification network that derives a plurality of direction vectors indicating the navigation direction of the object in each of the plurality of object images; and an adder that generates the plurality of navigation vectors by combining each derived direction vector with an area vector indicating the coordinates of the area box of the corresponding object image on the surveillance image and a time vector indicating the time at which that object image was generated.
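
One way to picture the adder is as a step that joins the three pieces of information for each frame into a single vector and orders the results by time. The sketch below assumes that "combining" means concatenation and that each observation is a dictionary with `direction`, `box`, and `time` keys; both the combination rule and the data layout are hypothetical, since the specification does not define them.

```python
def build_navigation_sequence(observations):
    """Build time-ordered navigation vectors from per-frame observations.

    Each observation is a dict with:
      "direction": direction vector from the direction identification network
      "box":       area box coordinates (x_min, y_min, x_max, y_max)
      "time":      capture time of the object image
    Assumption: the three parts are combined by concatenation.
    """
    ordered = sorted(observations, key=lambda o: o["time"])
    return [list(o["direction"]) + list(o["box"]) + [o["time"]] for o in ordered]
```

Sorting before concatenation ensures that the sequence handed to the prediction network is chronological even if frames are processed out of order.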

The deep neural network further includes a navigation prediction network that consists of a plurality of sequentially arranged stages and includes: a first hidden cell group including a plurality of hidden cells, each of which calculates the state value of the current stage by performing an operation in which a state weight and an input weight are applied to the state value of the previous stage and to the navigation vector that is the input value of the current stage, and then passes the calculated state value to the next stage; a second hidden cell group including a plurality of hidden cells, each of which calculates the state value of the current stage by performing an operation in which a state weight is applied to the state value of the previous stage, and then passes the calculated state value to the next stage; and a third hidden cell group including a hidden cell that calculates the state value of the current stage by performing an operation in which a state weight is applied to the state value of the previous stage, and then calculates the navigation prediction vector as an output value by applying an output weight to the calculated state value of the current stage.
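
The three hidden cell groups can be read as three phases of a simple recurrent network, sketched below in plain Python. The sketch assumes a tanh activation, a zero initial state, and weights shared across stages; none of these are stated in the specification. The names `predict_navigation`, `w_state`, `w_input`, `w_output`, and `idle_stages` are hypothetical.

```python
import math

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def predict_navigation(nav_vectors, w_state, w_input, w_output, idle_stages=1):
    """Run the three hidden cell groups in order and return the prediction vector."""
    state = [0.0] * len(w_state)  # assumption: initial state is all zeros

    # First group: one stage per navigation vector; state and input weights apply.
    for x in nav_vectors:
        s, i = matvec(w_state, state), matvec(w_input, x)
        state = [math.tanh(a + b) for a, b in zip(s, i)]

    # Second group: stages with no new input; only the state weight applies.
    for _ in range(idle_stages):
        state = [math.tanh(a) for a in matvec(w_state, state)]

    # Third group: final state update, then the output weight gives the prediction.
    state = [math.tanh(a) for a in matvec(w_state, state)]
    return matvec(w_output, state)
```

Under this reading, the input-free stages of the second group carry the state forward in time beyond the last observed frame, which matches the stated aim of predicting the navigation state after a predetermined time.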

The navigation prediction network includes: a recurrent input layer that receives the plurality of navigation vectors arranged in chronological order; a recurrent hidden layer including a plurality of hidden cells that calculate the navigation prediction vector by performing, for each of the plurality of navigation vectors, one or more weighted operations in the order of the plurality of stages; and a recurrent output layer that outputs the calculated risk.

When a feature point is detected in the surveillance image, the preprocessing unit detects a plurality of corner points in the surveillance image using the Harris corner algorithm, finds a region where the detected corner points are densely clustered to specify an area box indicating the area occupied by the object, and detects the object image through the specified area box. When no feature point is detected in the surveillance image, the preprocessing unit detects the horizon in the surveillance image, enlarges the portion of the image above the detected horizon to a predetermined size, detects a plurality of corner points in the enlarged portion using the Harris corner algorithm, finds a region where the detected corner points are densely clustered to specify an area box indicating the area occupied by the object, and detects the object image through the specified area box.

A method for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network, according to a preferred embodiment of the present invention for achieving the above object, includes: capturing, by a camera unit, a surveillance image; detecting, by a preprocessing unit, an object image by specifying, through feature point detection, an area box indicating the area occupied by an object included in the captured surveillance image; outputting, by a deep neural network, as a probability, whether the object in the object image is a vessel; determining, by a control unit, whether the object is a vessel according to the probability; calculating, by the control unit, if the object is determined to be a vessel, a risk indicating the probability that a collision with the vessel will occur; and warning, by the control unit, of the risk of collision if the calculated risk is greater than or equal to a threshold.

The detecting of the object image includes: detecting, by the preprocessing unit, when a feature point is detected in the surveillance image, a plurality of corner points in the surveillance image using the Harris corner algorithm; specifying, by the preprocessing unit, an area box indicating the area occupied by the object by finding a region where the detected corner points are densely clustered; and detecting, by the preprocessing unit, the object image through the specified area box.

The detecting of the object image includes: detecting, by the preprocessing unit, when no feature point is detected in the surveillance image, the horizon in the surveillance image; enlarging, by the preprocessing unit, the portion of the image above the detected horizon to a predetermined size; detecting, by the preprocessing unit, a plurality of corner points in the enlarged portion using the Harris corner algorithm; specifying, by the preprocessing unit, an area box indicating the area occupied by the object by finding a region where the detected corner points are densely clustered; and detecting, by the preprocessing unit, the object image through the specified area box.

The calculating of the risk includes: calculating, by the control unit, the area of the object image from the coordinates of the area box of the object; and determining, by the control unit, that the risk is greater than or equal to the threshold if the ratio of the area of the surveillance image to the area of the object image is greater than or equal to a preset threshold.

The outputting, by the deep neural network, as a probability, of whether the object in the object image is a vessel includes: receiving, by the input layer of the deep neural network, the object image; deriving, by a first convolutional layer of the deep neural network, at least one feature image by performing a convolution operation using a filter on the object image; deriving, by a first pooling layer of the deep neural network, at least one feature image by performing a pooling operation using a filter on the feature image of the first convolutional layer; deriving, by a second convolutional layer of the deep neural network, at least one feature image by performing a convolution operation using a filter on the feature image of the first pooling layer; deriving, by a second pooling layer of the deep neural network, at least one feature image by performing a pooling operation using a filter on the feature image of the second convolutional layer; calculating, by a plurality of operation nodes of a first fully connected layer of the deep neural network, node values through an operation using an activation function on the feature image of the second pooling layer; calculating, by a plurality of operation nodes of a second fully connected layer of the deep neural network, node values through an operation using an activation function on the node values of the first fully connected layer; and calculating, by a plurality of output nodes of the output layer of the deep neural network, through an operation using an activation function on the node values of the second fully connected layer, an output value that is the probability of whether the object in the object image is a vessel.

According to another aspect of the present invention, there is provided a computer-readable recording medium on which a program for performing the method for recognizing an approaching vessel according to the embodiments of the present invention described above is recorded.

According to the present invention, other vessels around the vessel are located through images and, in a dangerous situation, the person in charge of the vessel is notified, which makes it possible to prevent situations that lead to major accidents such as ship collisions caused by that person's carelessness. In addition, installation costs can be reduced by using a vessel recognition algorithm that combines image processing specialized for vessel recognition with a deep learning model, rather than a deep learning model that demands high computing power and cost.

FIG. 1 is a diagram illustrating the configuration of an apparatus for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network according to an embodiment of the present invention.
FIGS. 2 to 4 are example screens illustrating the operation of the apparatus for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating the configuration of a deep neural network according to a first embodiment of the present invention.
FIGS. 6 to 9 are diagrams illustrating the configuration of a deep neural network according to a second embodiment of the present invention.
FIG. 10 is a flowchart illustrating a method for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network according to the first embodiment of the present invention.
FIG. 11 is a flowchart illustrating a method of detecting an object image according to an embodiment of the present invention.
FIG. 12 is a diagram illustrating a method of detecting an object image corresponding to a nearby object according to an embodiment of the present invention.
FIG. 13 is a diagram illustrating a method of detecting an object image corresponding to a distant object according to an embodiment of the present invention.
FIG. 14 is a flowchart illustrating a method for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network according to the second embodiment of the present invention.

Before the detailed description of the present invention, it should be noted that the terms and words used in this specification and the claims are not to be construed as limited to their ordinary or dictionary meanings. They should be interpreted with meanings and concepts consistent with the technical idea of the present invention, based on the principle that an inventor may appropriately define the concepts of terms in order to describe his or her own invention in the best way. Accordingly, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent all of its technical ideas, so it should be understood that various equivalents and modifications capable of replacing them may exist at the time of filing this application.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. Note that the same components are denoted by the same reference numerals throughout the drawings wherever possible. Detailed descriptions of well-known functions and configurations that may obscure the gist of the present invention are omitted. For the same reason, some components in the accompanying drawings are exaggerated, omitted, or shown schematically, and the size of each component does not fully reflect its actual size.

First, an apparatus for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network according to an embodiment of the present invention will be described. FIG. 1 is a diagram illustrating the configuration of the apparatus, and FIGS. 2 to 4 are example screens illustrating its operation.

Referring to FIG. 1, an apparatus 10 for recognizing an approaching vessel according to an embodiment of the present invention (hereinafter abbreviated as the "control device") includes a communication unit 11, an input unit 13, a display unit 14, a storage unit 15, and a controller 16.

The communication unit 11 is for communicating with, for example, an administrator's user device such as a mobile phone, smartphone, or mobile communication terminal, various alarms installed on the vessel, and a control server for the navigation and other management of the vessel. The communication unit 11 may include a radio frequency (RF) transmitter (Tx) that up-converts and amplifies the frequency of a transmitted signal, and an RF receiver (Rx) that low-noise amplifies a received signal and down-converts its frequency. The communication unit 11 may also include a modem that modulates the transmitted signal and demodulates the received signal.

The camera unit 12 is for capturing images. In particular, the camera unit 12 according to an embodiment of the present invention can capture all directions, 360 degrees horizontally and vertically. Accordingly, the camera unit 12 can capture the entire 360-degree surroundings of the vessel. To this end, the camera unit 12 may include a plurality of lenses and a plurality of image sensors. An image sensor receives light reflected from a subject and converts it into an electrical signal, and may be implemented based on a charge-coupled device (CCD), a complementary metal-oxide semiconductor (CMOS), or the like. The camera unit 12 may further include one or more analog-to-digital converters, and may convert the electrical signal output from the image sensor into a digital sequence and output it to the controller 16.

The input unit 13 may receive a user's key operation for controlling the control device 10, generate an input signal, and transmit it to the controller 16. The input unit 13 includes various keys for controlling the control device 10. When the display unit 14 is a touch screen, the functions of the various keys may be performed on the display unit 14, and if all functions can be performed with the touch screen alone, the input unit 13 may be omitted.

The display unit 14 is for screen display and may visually provide the user with the menus of the control device 10, input data, function setting information, and various other information. The display unit 14 also outputs screens of the control device 10 such as a boot screen, a standby screen, and a menu screen. The display unit 14 may be formed of a liquid crystal display (LCD), organic light-emitting diodes (OLED), active-matrix organic light-emitting diodes (AMOLED), or the like. The display unit 14 may also be implemented as a touch screen, in which case it includes a touch sensor that detects the user's touch input. The touch sensor may be a touch-sensing sensor of the capacitive overlay, pressure, resistive overlay, or infrared beam type, or may be a pressure sensor. In addition to these, any kind of sensor device capable of sensing the contact or pressure of an object may be used as the touch sensor of the present invention. The touch sensor may detect the user's touch input, generate a detection signal including input coordinates indicating the touched position, and transmit it to the controller 16. In particular, when the display unit 14 is a touch screen, some or all of the functions of the input unit 13 may be performed through the display unit 14.

The storage unit 15 stores programs and data necessary for the operation of the control device 10. The storage unit 15 may store, for a predetermined period, the various videos and images used for object recognition according to an embodiment of the present invention. The data stored in the storage unit 15 may be deleted, changed, or added according to the user's operation.

The controller 16 controls the overall operation of the control device 10 and the signal flow between the internal blocks of the control device 10, and may perform a data processing function. The controller 16 also basically controls the various functions of the control device 10. Examples of the controller 16 include a central processing unit (CPU) and a digital signal processor (DSP). The controller 16 includes a preprocessing unit 100, a deep neural network 200, a learning unit 300, and a control unit 400.

The present invention includes first and second embodiments, and in each of them the controller 16 can recognize an approaching vessel in consideration of the distance of a marine object.

Specifically, the first embodiment of the present invention will be described with reference to FIG. 2. According to the first embodiment, the preprocessing unit 100 of the controller 16 photographs the surroundings of the vessel through the camera unit 12 to generate a surveillance image SV of a predetermined area (SW×SH). The preprocessing unit 100 then detects, by means of an area box B, the area occupied by an object obj included in the captured image, and generates an object image OV. The area box B represents the area occupied by the object obj in the surveillance image SV as a rectangle, namely the smallest rectangle that contains the whole object obj. The area box B has center coordinates (x, y), a width (w), and a height (h).

The object image OV detected by the preprocessing unit 100 is input to the deep neural network 200 according to the first embodiment. The deep neural network 200 then calculates, as a probability, whether the object obj of the object image OV is a vessel. The control unit 400 determines whether the object obj is a vessel according to the probability calculated by the deep neural network 200 and, if the object obj is determined to be a vessel, calculates a risk level indicating the probability that a collision with that vessel will occur.

The control unit 400 calculates the area (w×h) of the object image from the coordinates of the area box B of the object obj in the surveillance image SV, and may then calculate the risk level from the ratio between the area (SW×SH) of the surveillance image SV and the area (w×h) of the object image OV. If the calculated risk level is greater than or equal to a preset threshold, the control unit 400 determines that there is a risk of collision. Accordingly, the control unit 400 transmits an alarm message notifying the risk of collision to a user device, an alarm, a control server, or the like through the communication unit 11.
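The area-ratio test described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the exact formula, so the risk level is assumed here to be the fraction of the surveillance image covered by the area box, and the function names and the threshold value 0.25 are hypothetical.

```python
def collision_risk(box_w, box_h, frame_w, frame_h):
    """Assumed risk measure: share of the surveillance image (SW x SH)
    covered by the area box (w x h). A larger share suggests a closer object."""
    if frame_w <= 0 or frame_h <= 0:
        raise ValueError("frame dimensions must be positive")
    return (box_w * box_h) / (frame_w * frame_h)


def should_warn(risk, threshold=0.25):
    """True when the risk reaches the preset threshold (0.25 is a made-up value)."""
    return risk >= threshold


risk = collision_risk(box_w=640, box_h=360, frame_w=1280, frame_h=720)
print(risk, should_warn(risk))
```

Because the measure depends only on box and frame sizes, it works on a single frame, which fits the first embodiment's use of one surveillance image.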

The second embodiment of the present invention will now be described with reference to FIGS. 2, 3, and 4. According to the second embodiment, the preprocessing unit 100 of the controller 16 photographs the surroundings of the vessel through the camera unit 12 and generates a plurality of surveillance images SVt arranged in time order. The surveillance image SV includes a plurality of frames captured as a video. In particular, according to one embodiment, when the frames consist of I-frames (intra-coded frames), P-frames (predictive-coded frames), and B-frames (bidirectional-coded frames), only the I-frames may be provided as the surveillance images SVt. According to another embodiment, specific frames may be extracted and provided at a predetermined period; for example, the surveillance images SVt may be provided at one frame per second. The preprocessing unit 100 also detects, by means of an area box Bt, the area occupied by the object obj in each of the surveillance images SVt, and generates a plurality of object images OVt arranged in time order.

The plurality of object images OVt arranged in time order are input to the deep neural network 200 according to the second embodiment. The deep neural network 200 then determines whether the object included in each of the object images OVt is a vessel and, if it is, derives a plurality of direction vectors Dt indicating the navigation direction of the object obj in each of the object images OVt. As shown in FIG. 3, the direction in which the vessel is currently sailing can be detected from the object image OVt. The values of the detected direction vector Dt are as shown in Table 1 below.

Direction (degrees)    Value
0 (360)                0 0 0 0 0 0 0 1
45                     0 0 0 0 0 0 1 0
90                     0 0 0 0 0 1 0 0
135                    0 0 0 0 1 0 0 0
180                    0 0 0 1 0 0 0 0
225                    0 0 1 0 0 0 0 0
270                    0 1 0 0 0 0 0 0
315                    1 0 0 0 0 0 0 0
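The one-hot values of Table 1 can be generated and decoded mechanically. The sketch below is illustrative only; the names `DIRECTIONS`, `encode_direction`, and `decode_direction` are hypothetical and do not appear in the specification.

```python
DIRECTIONS = (0, 45, 90, 135, 180, 225, 270, 315)


def encode_direction(degrees):
    """Return the 8-element one-hot vector that Table 1 assigns to a heading."""
    index = DIRECTIONS.index(degrees % 360)  # 360 is treated as 0
    vector = [0] * len(DIRECTIONS)
    # In Table 1 the 1 sits in the last position for 0 degrees and
    # moves one position to the left for every additional 45 degrees.
    vector[len(DIRECTIONS) - 1 - index] = 1
    return vector


def decode_direction(vector):
    """Inverse of encode_direction for a valid one-hot vector."""
    return DIRECTIONS[len(DIRECTIONS) - 1 - vector.index(1)]


print(encode_direction(180))
print(decode_direction([1, 0, 0, 0, 0, 0, 0, 0]))
```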

As described above, in this embodiment the direction vectors Dt distinguish eight directions, and the value of each direction vector Dt is described as a one-hot encoded value. The present invention is not limited to this, however, and those of ordinary skill in the art will understand that the number of directions and the encoding scheme of the direction vector Dt may be changed and modified in various ways. The deep neural network 200 also generates a plurality of navigation vectors Xt by adding, to each of the direction vectors Dt, the corresponding region vector Bt and time vector Tt. Here, the region vector Bt represents the area occupied by each of the object images OVt in the surveillance image SV, and can be expressed by the coordinates (x, y, w, h) of the area box B. The time vector Tt represents the time at which each of the object images OVt was generated. From the plurality of navigation vectors Xt, each of which includes a direction vector Dt, a region vector Bt, and a time vector Tt, the deep neural network 200 calculates a navigation prediction vector Yk representing the navigation state of the object obj after a predetermined time, that is, its navigation direction, navigation area, and time.

The navigation prediction vector Yk includes a predicted direction vector Dk, region vector Bk, and time vector Tk. The control unit 400 issues an alarm when the navigation direction derived from the direction vector Dk of the navigation prediction vector Yk points toward a collision-possible area and the area occupied by the area box B derived from the region vector Bk overlaps, at least in part, a preset warning area in the surveillance image SV. For example, the surveillance image SV of FIG. 4 is divided into 12 cells in 3 rows and 4 columns, the cells at row 2, column 3 and row 3, column 3 [(2, 3), (3, 3)] are the warning area, and a direction pointing toward those cells may be set as a collision-possible direction. In the illustrated example, the direction vector Dk of the navigation prediction vector Yk is 180 degrees and points toward cell (3, 3), a collision-possible direction, and the area box B of the region vector Bk of the navigation prediction vector Yk partly overlaps cell (3, 3) of the surveillance image SV. Accordingly, the control unit 400 determines that the risk level is greater than or equal to the preset threshold and transmits an alarm message notifying the risk of collision to a user device, an alarm, a control server, or the like through the communication unit 11.
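The grid test in the FIG. 4 example can be sketched as below. This is a hedged illustration: the 3×4 grid and the warning cells (2, 3) and (3, 3) come from the text, while the function names, the frame size used in the example, and the treatment of the direction check as a ready-made boolean (`heads_toward_warning`) are assumptions, since the text does not say how a heading is matched to a cell.

```python
WARNING_CELLS = {(2, 3), (3, 3)}  # 1-based (row, column), as in the FIG. 4 example


def cells_overlapped(box, frame_w, frame_h, rows=3, cols=4):
    """Return the 1-based (row, col) cells whose interior the area box overlaps."""
    x, y, w, h = box  # centre coordinates, width, height
    left, right = x - w / 2, x + w / 2
    top, bottom = y - h / 2, y + h / 2
    cell_w, cell_h = frame_w / cols, frame_h / rows
    cells = set()
    for r in range(rows):
        for c in range(cols):
            c_left, c_top = c * cell_w, r * cell_h
            # Strict comparisons: rectangles that only share an edge do not overlap.
            if (left < c_left + cell_w and right > c_left
                    and top < c_top + cell_h and bottom > c_top):
                cells.add((r + 1, c + 1))
    return cells


def should_alarm(predicted_box, heads_toward_warning, frame_w, frame_h):
    """Alarm only if both conditions of the text hold for the predicted state."""
    overlapped = cells_overlapped(predicted_box, frame_w, frame_h)
    return heads_toward_warning and bool(overlapped & WARNING_CELLS)


print(should_alarm((750, 750, 100, 100), True, frame_w=1200, frame_h=900))
```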

Next, the deep neural network according to embodiments of the present invention will be described in more detail, beginning with the deep neural network according to the first embodiment. FIG. 5 is a diagram for explaining the configuration of the deep neural network according to the first embodiment of the present invention.

Referring to FIG. 5, the deep neural network 200 according to the first embodiment includes an object identification network 210. The object identification network 210 includes a plurality of layers: an input layer (INL), hidden layers, and an output layer (OUL). The hidden layers include one or more convolution layers (CVL), one or more pooling layers (POL), and one or more fully-connected layers (FCL). As shown in FIG. 5, the object identification network 210 according to the first embodiment includes, in order, an input layer INL, a first convolution layer CVL1, a first pooling layer POL1, a second convolution layer CVL2, a second pooling layer POL2, a first fully-connected layer FCL1, a second fully-connected layer FCL2, and an output layer OUL.

The input layer INL consists of an input matrix of a predetermined size. The elements of the input matrix correspond to the pixels of the object image OV, so the pixel values of the object image OV are entered as the values of the respective elements of the input matrix of the input layer INL.

The first and second convolution layers CVL1 and CVL2 and the first and second pooling layers POL1 and POL2 each consist of at least one feature map (FM). A feature map FM is a matrix of a predetermined size made up of pixel values, and is generated as the result of a weighted operation on the values of the previous layer. The weights are applied through a filter W. Here, the filter is a matrix of a predetermined size (for example, N×N, where N is a natural number), and the value of each element of that matrix is a weight (w). Accordingly, each element of the matrix forming a feature map FM stores a value computed by applying the weight filter W to the previous layer.

The first convolution layer CVL1 includes at least one feature map FM. Each feature map FM of the first convolution layer CVL1 is generated by performing a convolution operation on the object image OV of the input layer INL using a filter W, which is a matrix whose elements are weights.
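A minimal sketch of how one feature map could be produced from an image and an N×N weight filter is shown below. It assumes stride 1, no padding, and no kernel flipping (the cross-correlation form commonly used in CNN implementations); none of these details are stated in the text, and the function name is hypothetical.

```python
def convolve2d(image, kernel):
    """Slide an N x N weight filter over the image and store the weighted
    sum at each position (stride 1, no padding, no kernel flipping)."""
    n = len(kernel)
    out_rows = len(image) - n + 1
    out_cols = len(image[0]) - n + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(n) for j in range(n))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
print(convolve2d(image, kernel))  # 2 x 2 feature map
```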

The first pooling layer POL1 includes at least one feature map FM. Each feature map FM of the first pooling layer POL1 is generated by performing a pooling (or subsampling) operation on a feature map FM of the first convolution layer CVL1 using a filter W, which is a matrix whose elements are weights.
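The text does not say which pooling operation is used. Purely as an illustration of subsampling, the sketch below assumes non-overlapping 2×2 max pooling, which is one common choice; the function name and window size are hypothetical.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window.
    Rows or columns that do not fill a whole window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


fm = [[1, 2, 3, 4],
      [5, 6, 7, 8],
      [9, 10, 11, 12],
      [13, 14, 15, 16]]
print(max_pool2d(fm))  # halves each dimension
```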

The second convolution layer CVL2 includes at least one feature map FM. Each feature map FM of the second convolution layer CVL2 is generated by performing a convolution operation on a feature map of the first pooling layer POL1 using a filter W, which is a matrix whose elements are weights.

The second pooling layer POL2 includes at least one feature map FM. Each feature map FM of the second pooling layer POL2 is generated by performing a pooling (or subsampling) operation on a feature map FM of the second convolution layer CVL2 using a filter W, which is a matrix whose elements are weights.

The first fully-connected layer FCL1 includes a plurality of operation nodes (f1, f2, ..., fn). The value of each operation node (f1, f2, ..., fn) of the first fully-connected layer FCL1 is calculated by performing a weighted operation with an activation function on the values of the feature maps FM of the second pooling layer POL2. For example, when the second pooling layer POL2 has two feature maps FM in the form 2×8×8 (channel × row × column), these values are rearranged into the form 1×128 (channel × column) and supplied as input to each of the operation nodes (f1, f2, ..., fn) of the first fully-connected layer FCL1. Each operation node (f1, f2, ..., fn) then calculates its node value by performing a weighted operation with the activation function. Examples of the activation function include sigmoid, hyperbolic tangent (tanh), ELU (Exponential Linear Unit), ReLU (Rectified Linear Unit), Leaky ReLU, Maxout, Minout, and Softmax.
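The 2×8×8 to 1×128 rearrangement and the per-node weighted operation can be sketched as follows. This is an illustration under assumptions: sigmoid is picked from the activation functions listed in the text, the bias term is an addition not mentioned in the text, and all names are hypothetical.

```python
import math


def flatten(feature_maps):
    """Rearrange channel x row x column values into one flat list."""
    return [value for fm in feature_maps for row in fm for value in row]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fully_connected(inputs, weights, biases, activation=sigmoid):
    """Each node applies its own weights to every input, adds a bias
    (assumed here), and passes the sum through the activation function."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


maps = [[[0.0] * 8 for _ in range(8)] for _ in range(2)]  # 2 x 8 x 8
inputs = flatten(maps)
print(len(inputs))  # 128 inputs, matching the 1 x 128 form in the text
nodes = fully_connected(inputs, [[0.0] * 128 for _ in range(3)], [0.0] * 3)
print(nodes)
```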

The second fully-connected layer FCL2 includes a plurality of operation nodes (g1, g2, ..., gn). The node value of each operation node (g1, g2, ..., gn) of the second fully-connected layer FCL2 is calculated by performing a weighted operation with an activation function on the node values of the operation nodes (f1, f2, ..., fn) of the first fully-connected layer FCL1.

The output layer OUL includes two output nodes (O1, O2). The node values of the two output nodes (O1, O2), that is, the output values, are calculated by performing a weighted operation with an activation function on the node values of the operation nodes (g1, g2, ..., gn) of the second fully-connected layer FCL2.

The output values of the two output nodes (O1, O2) of the output layer OUL are the output values of the object identification network 210. The two output nodes (O1, O2) correspond to vessel and non-vessel, respectively, and their output values are the probability that the object obj of the input object image OV is a vessel and the probability that it is not. For example, if the output value of the first output node O1 is 0.846 and the output value of the second output node O2 is 0.154, the probability that the object obj is a vessel is about 85% and the probability that it is not a vessel is about 15%.
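One way to obtain two output values that behave as complementary probabilities is a softmax over the two output nodes, which is among the activation functions the text lists. The sketch below assumes that choice; the 0.5 decision threshold and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Map raw node scores to probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def is_vessel(probabilities, threshold=0.5):
    """probabilities = (P(vessel), P(not vessel)); the threshold is assumed."""
    return probabilities[0] >= threshold


probs = softmax([1.7, 0.0])
print([round(p, 3) for p in probs], is_vessel(probs))
```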

A method of training the object identification network 210 described above will now be explained. The learning unit 300 may obtain training images for which it is known whether the object obj included in the image is a vessel. The learning unit 300 may then set an expected value according to whether the object obj included in the training image is a vessel. For example, when the object obj included in the training image is a vessel, the learning unit 300 may set the expected values of the first and second output nodes as "(O1, O2) = [1, 0]". Conversely, for a control-group training image in which the object obj is not a vessel, the learning unit 300 may set the expected values of the first output node O1 and the second output node O2 as "(O1, O2) = [0, 1]".

After setting the expected value, the learning unit 300 inputs the training image to the object identification network 210. The object identification network 210 then calculates and outputs an output value through a plurality of operations in which the weights of the plurality of layers are applied to the training image. Here, the output value of the object identification network 210 includes the probability that the object is a vessel, which is the output value of the first output node O1, and the probability that it is not a vessel, which is the output value of the second output node O2. The learning unit 300 calculates the binary cross entropy loss, which measures the difference between the output value of the object identification network 210 and the previously set expected value, and optimizes the weights of the object identification network 210 with an optimization algorithm so that the binary cross entropy loss is minimized.
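For reference, the cross entropy between an expected value such as [1, 0] and the network's two output probabilities can be computed as in the sketch below. The clipping constant `eps` and the function name are assumptions added for numerical safety and illustration; the optimization step itself is not shown because the text does not name the algorithm.

```python
import math


def cross_entropy(expected, predicted, eps=1e-12):
    """-sum(t * log(p)) over the output nodes; p is clipped away from 0."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(expected, predicted))


# Vessel training image, expected (O1, O2) = [1, 0]
print(cross_entropy([1, 0], [0.846, 0.154]))
# The same output scored against a non-vessel label gives a larger loss
print(cross_entropy([0, 1], [0.846, 0.154]))
```

The loss shrinks toward 0 as the probability assigned to the correct node approaches 1, which is why minimizing it pushes the output toward the expected value.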

Next, the configuration of the deep neural network 200 according to the second embodiment of the present invention will be described. FIGS. 6 to 9 are diagrams for explaining the configuration of the deep neural network according to the second embodiment. Referring first to FIG. 6, the deep neural network 200 according to the second embodiment includes a direction identification network 220, an adder 230, and a navigation prediction network 240.

The deep neural network 200 according to the second embodiment receives from the preprocessing unit 100 a plurality of object images OVt, a plurality of region vectors Bt representing the coordinates of the area box B of each object image OVt on the surveillance image SV, and time vectors Tt representing the time at which each object image OVt was generated from the surveillance image SV. The direction identification network 220 then calculates a direction vector Dt indicating the navigation direction of the object obj included in each object image OVt. The adder 230 adds the direction vector Dt, the region vector Bt, and the time vector Tt to generate a plurality of navigation vectors Xt arranged in time order (Xt = Dt + Bt + Tt). The navigation prediction network 240 then performs operations in which weights are applied sequentially to each of the time-ordered navigation vectors Xt, and calculates a navigation prediction vector Yk that predicts the navigation state of the object obj after a predetermined time (k-t+0, k>t). The navigation prediction vector Yk includes a direction prediction vector Dk representing the predicted navigation direction of the object obj after the predetermined time, a region prediction vector Bk predicting, as coordinates of the area box B, the area the object obj will occupy in the surveillance image SV after the predetermined time, and a time vector Tk representing the predetermined time later (k-t+0, k>t).
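Since Dt, Bt, and Tt have different lengths, the sketch below reads the "+" in Xt = Dt + Bt + Tt as concatenation, so that each navigation vector contains all three parts. That reading is an assumption, as are the function names, the 8-element direction encoding taken from Table 1, and the sample values.

```python
def navigation_vector(direction, box, timestamp):
    """Xt: one-hot direction Dt, then the area box (x, y, w, h) as Bt,
    then the capture time Tt, joined into a single flat vector."""
    if len(box) != 4:
        raise ValueError("box must be (x, y, w, h)")
    return list(direction) + list(box) + [timestamp]


def navigation_sequence(samples):
    """Build the vectors and sort them by capture time (the last element)."""
    vectors = [navigation_vector(d, b, t) for d, b, t in samples]
    return sorted(vectors, key=lambda v: v[-1])


samples = [
    ([0, 0, 0, 1, 0, 0, 0, 0], (410, 300, 60, 40), 2.0),
    ([0, 0, 0, 1, 0, 0, 0, 0], (400, 280, 50, 30), 1.0),
]
for x_t in navigation_sequence(samples):
    print(x_t)
```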

The direction identification network 220 and the navigation prediction network 240 described above will now each be explained in detail, beginning with the configuration of the direction identification network 220 according to an embodiment of the present invention. Referring to FIG. 7, the direction identification network 220 differs from the object identification network 210 described above only in the number of nodes in the output layer OUL; the other layers have the same structure. That is, the direction identification network 220 includes, in order, an input layer INL, a first convolution layer CVL1, a first pooling layer POL1, a second convolution layer CVL2, a second pooling layer POL2, a first fully-connected layer FCL1, a second fully-connected layer FCL2, and an output layer OUL.

The first fully-connected layer FCL1 includes a plurality of operation nodes (f1, f2, ..., fn). The node value of each operation node (f1, f2, ..., fn) of the first fully-connected layer FCL1 is calculated by performing a weighted operation with an activation function on the values of the feature maps FM of the second pooling layer POL2. For example, when the second pooling layer POL2 has two feature maps FM in the form 2×8×8 (channel × row × column), these values are rearranged into the form 1×128 (channel × column) and supplied as input to each of the operation nodes (f1, f2, ..., fn) of the first fully-connected layer FCL1. Each operation node (f1, f2, ..., fn) then calculates its node value by performing a weighted operation with the activation function. Examples of the activation function include sigmoid, hyperbolic tangent (tanh), ELU (Exponential Linear Unit), ReLU (Rectified Linear Unit), Leaky ReLU, Maxout, Minout, and Softmax.

The output layer OUL includes nine output nodes (a1 to a9). The node values of the nine output nodes (a1 to a9), that is, the output values, are calculated by performing a weighted operation with an activation function on the node values of the operation nodes (g1, g2, ..., gn) of the second fully-connected layer FCL2.

The output values of the nine output nodes (a1 to a9) of the output layer OUL are the output values of the direction identification network 220. Of the nine output nodes, the first to eighth output nodes (a1 to a8) each correspond to one of the navigation directions of a vessel shown in FIG. 3 (0, 45, 90, 135, 180, 225, 270, and 315 degrees), and the ninth output node a9 corresponds to a non-vessel. The output value of each of the first to eighth output nodes (a1 to a8) represents the probability that the navigation direction of the object obj in the object image OV is the direction corresponding to that node. The output value of the ninth output node a9 corresponds to the probability that the object obj in the object image OV is not a vessel. For example, if the output values of the first to ninth output nodes are (a1 a2 a3 a4 a5 a6 a7 a8 a9) = [0 0 0 1 0 0 0 0 0], the probability that the navigation direction of the object obj is 180 degrees is 100% and all other probabilities are 0%.
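Reading the nine outputs can be sketched as picking the most probable node and mapping it to a heading or to "non-vessel". The node-to-heading order below follows the expected values of Table 2 and the 180-degree example above (a4 = 1 means 180 degrees); the function name and the use of a simple argmax are assumptions.

```python
# Heading assigned to a1..a8, following the expected values in Table 2
HEADINGS_BY_NODE = (315, 270, 225, 180, 135, 90, 45, 0)


def interpret_output(outputs):
    """Return (heading in degrees or None for non-vessel, probability)."""
    if len(outputs) != 9:
        raise ValueError("expected nine output values a1..a9")
    best = max(range(9), key=lambda i: outputs[i])
    if best == 8:  # a9 is the non-vessel node
        return None, outputs[8]
    return HEADINGS_BY_NODE[best], outputs[best]


print(interpret_output([0, 0, 0, 1, 0, 0, 0, 0, 0]))
print(interpret_output([0.05, 0.05, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.8]))
```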

A method of training the direction identification network 220 will now be described. The learning unit 300 may obtain training images for which it is known whether the object obj included in the image is a vessel and, if it is a vessel, its navigation direction (0, 45, 90, 135, 180, 225, 270, or 315 degrees) is also known. The learning unit 300 may then set expected values for the training images as shown in Table 2 below.

Non-vessel / vessel                   Expected value
navigation direction (degrees)        a1  a2  a3  a4  a5  a6  a7  a8  a9
Non-vessel                            0   0   0   0   0   0   0   0   1
Navigation direction 0 (360)          0   0   0   0   0   0   0   1   0
45                                    0   0   0   0   0   0   1   0   0
90                                    0   0   0   0   0   1   0   0   0
135                                   0   0   0   0   1   0   0   0   0
180                                   0   0   0   1   0   0   0   0   0
225                                   0   0   1   0   0   0   0   0   0
270                                   0   1   0   0   0   0   0   0   0
315                                   1   0   0   0   0   0   0   0   0
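The expected values of Table 2 are one-hot vectors over the nine output nodes, so both building an expected value and reading off a prediction reduce to an index lookup. The following is a minimal sketch under that reading; the names `HEADINGS_DEG`, `expected_value`, and `decode_output` are hypothetical and do not appear in the specification, and picking the strongest node is an assumed decision rule.

```python
# Node-to-class mapping copied from Table 2:
# a1 -> 315 degrees, a2 -> 270, ..., a8 -> 0 (360) degrees, a9 -> non-vessel.
HEADINGS_DEG = [315, 270, 225, 180, 135, 90, 45, 0]  # nodes a1..a8


def expected_value(heading_deg):
    """Return the one-hot expected value [a1..a9].

    heading_deg is one of the eight headings in degrees, or None for a non-vessel.
    """
    target = [0] * 9
    if heading_deg is None:
        target[8] = 1  # a9: non-vessel
    else:
        # 360 is treated as 0; an unsupported heading raises ValueError.
        target[HEADINGS_DEG.index(heading_deg % 360)] = 1
    return target


def decode_output(outputs):
    """Return (heading_deg or None, probability) for the strongest output node."""
    if len(outputs) != 9:
        raise ValueError("expected nine output values (a1..a9)")
    best = max(range(9), key=lambda i: outputs[i])
    heading = None if best == 8 else HEADINGS_DEG[best]
    return heading, outputs[best]
```

With the example output [0 0 0 1 0 0 0 0 0] given in the description, `decode_output` returns a heading of 180 degrees with probability 1.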

After setting the expected values, the learning unit 300 inputs a training image into the direction identification network 220. The direction identification network 220 then computes and outputs its output values through a plurality of operations in which the weights of a plurality of layers are applied to the training image. Here, the output values of the direction identification network 220 comprise the output values of the first to ninth output nodes (a1 to a9). These output values represent the probability that the object (obj) in the object image (OV) is a vessel and, if it is a vessel, the probability that its navigation direction is the direction (0, 45, 90, 135, 180, 225, 270, or 315 degrees) corresponding to each of the first to eighth output nodes (a1 to a8). The learning unit 300 computes the cross-entropy loss, which measures the difference between the output values of the direction identification network 220 and the expected values set as in Table 2, and optimizes the weights of the direction identification network 220 with an optimization algorithm so that the cross-entropy loss is minimized.
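For a one-hot expected value, the cross-entropy loss reduces to the negative logarithm of the probability the network assigns to the correct node. The sketch below shows only the loss computation; the function name `cross_entropy` and the clamping constant `eps` are illustrative assumptions, and the optimization algorithm is left out because the specification does not name one.

```python
import math


def cross_entropy(outputs, expected, eps=1e-12):
    """Cross-entropy between output probabilities and a one-hot expected value.

    outputs  -- probabilities of the nine output nodes (a1..a9)
    expected -- one-hot expected value set as in Table 2
    eps      -- lower clamp that avoids log(0); an implementation detail assumed here
    """
    if len(outputs) != len(expected):
        raise ValueError("outputs and expected must have the same length")
    return -sum(t * math.log(max(p, eps)) for p, t in zip(outputs, expected))
```

For a training image of a vessel heading 180 degrees (expected value [0 0 0 1 0 0 0 0 0]), an output of 0.7 at node a4 gives a loss of -ln(0.7), and the loss approaches zero as that output approaches 1.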

Next, the navigation prediction network 240 according to the second embodiment of the present invention is described in more detail. Referring to FIGS. 8 and 9, the navigation prediction network 240 may receive a plurality of navigation vectors (Xt) arranged in chronological order, each formed by combining a direction vector (Dt), an area vector (Bt), and a time vector (Tt) by vector sum (Xt = Dt + Bt + Tt). The navigation prediction network 240 then performs operations in which weights are applied sequentially to each of the plurality (for example, n, where n is a positive integer) of chronologically ordered navigation vectors (Xt: X1 to Xn), and computes a navigation prediction vector (Yk) representing the navigation state of the object (obj) after a predetermined time (k-n, k>n). Examples of the navigation prediction network 240 include a recurrent neural network (RNN), a long short-term memory (LSTM) network, and a gated recurrent unit (GRU).

The navigation prediction network 240 consists of a plurality of stages (St: S1 to Sk) and includes a recurrent input layer (RIL), a recurrent hidden layer (RHL), and a recurrent output layer (ROL).

The recurrent input layer (RIL) consists of a plurality (n) of inputs that capture the navigation state observed up to the present, which are used to estimate the navigation state of the object (obj) after a predetermined time (k-n, k>n); the chronologically ordered navigation vectors (Xt: X1 to Xn) serve as these inputs. The recurrent hidden layer (RHL) includes a plurality of hidden cells (HC), one for each of the k stages, and the hidden cells compute the navigation prediction vector (Yk) by recurrently performing one or more weighted operations on the navigation vectors (Xt: X1 to Xn). The hidden cells (HC) are divided into first to third hidden cell groups (HCG1, HCG2, HCG3). The recurrent output layer (ROL) outputs the navigation prediction vector (Yk) computed by the hidden cell (HC) of the last stage (Sk).

Meanwhile, referring to FIG. 13, a hidden cell (HC) consists of one or more operations to which weights (Wx, Wh, Wy) are applied. Here, an operation means an operation that applies an activation function. Examples of the activation function include sigmoid, hyperbolic tangent (tanh), exponential linear unit (ELU), rectified linear unit (ReLU), leaky ReLU, Maxout, Minout, and Softmax. The weights in one hidden cell (HC) comprise an input weight Wx applied to the input navigation vector (Xt), a state weight Wh applied to the state value Ht-1 of the previous stage, and an output weight Wy applied for the output value Yt. For example, the weighted operations of a hidden cell (HC) may be as given in Equation 1 below.

Here, b is a threshold or bias. The tanh and ReLU functions may be replaced with other activation functions. Referring to Equation 1 and FIG. 9, each of the hidden cells (HCt) can compute the state value (Ht) and the output value (Yt) of the current stage (St) by performing weighted operations (W: Wh, Wx, Wy) on the state value (Ht-1) of the previous stage (St-1), computed by the hidden cell (HCt-1) of that stage, and on the input value (Xt) of its own stage (St).
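Equation 1 itself is not reproduced in this text. One form that is consistent with the surrounding definitions (input weight Wx, state weight Wh, output weight Wy, bias b, and the tanh and ReLU activations) is the standard recurrent cell below. This is an assumed reconstruction rather than the equation as filed; in particular, whether the same b is used in both operations is not stated.

```latex
H_t = \tanh\left(W_h H_{t-1} + W_x X_t + b\right), \qquad
Y_t = \mathrm{ReLU}\left(W_y H_t + b\right)
```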

The recurrent input layer (RIL) of the navigation prediction network 240 inputs the chronologically ordered first to fourth navigation vectors (X1, X2, X3, X4) into the hidden cells of the first hidden cell group (HCG1) of the recurrent hidden layer (RHL), that is, into the first to fourth hidden cells (HC1, HC2, HC3, HC4), respectively. Each of the first to fourth hidden cells then performs a weighted operation on the state value (Ht-1) of the previous stage (St-1), computed by the hidden cell (HC) of that stage, and on the input value (Xt) of its own stage (St) to compute the state value (Ht) of the current stage (St), and passes the computed state value (Ht) to the hidden cell (HCt+1) of the next stage (St+1). For example, the first hidden cell (HC1) applies the state weight (Wh) to the initial state value (H0) and the input weight (Wx) to the first navigation vector (X1) to compute the state value (H1) of the first stage (S1). Because the first hidden cell (HC1) has no previous stage, it uses the initial state value (H0). Next, the second hidden cell (HC2) applies the state weight (Wh) to the state value (H1) of the first stage (S1) and the input weight (Wx) to the second navigation vector (X2) to compute the state value (H2) of the second stage (S2). The third hidden cell (HC3) applies the state weight (Wh) to the state value (H2) of the second stage (S2) and the input weight (Wx) to the third navigation vector (X3) to compute the state value (H3) of the third stage (S3). The fourth hidden cell (HC4) then applies the state weight (Wh) to the state value (H3) of the third stage (S3) and the input weight (Wx) to the fourth navigation vector (X4) to compute the state value (H4) of the fourth stage (S4). In this way, each hidden cell (HCt) of the first hidden cell group (HCG1) applies the state and input weights (Wh, Wx) to the state value (Ht-1) of the previous stage and to the navigation vector (Xt) input at the current stage, computes the state value (Ht) of the current stage, and passes it to the next stage (St+1).

Each of the hidden cells (HC5 to HCk-1) of the second hidden cell group (HCG2) of the recurrent hidden layer (RHL) then applies the state weight (Wh) to the state value (Ht-1) of the previous stage (St-1), computed by the hidden cell (HC) of that stage, to compute the state value (Ht) of the current stage (St), and passes the computed state value (Ht) to the hidden cell (HCt+1) of the next stage (St+1). For example, the fifth hidden cell (HC5) applies the state weight (Wh) to the state value (H4) of the fourth stage (S4) to compute the state value (H5) of the fifth stage (S5). The sixth hidden cell (HC6) then applies the state weight (Wh) to the state value (H5) of the fifth stage (S5) to compute the state value (H6) of the sixth stage (S6). The remaining hidden cells of the second hidden cell group (HCG2), up to HCk-1, compute their state values (Ht) in the same way and provide them to the next stage. The last hidden cell of the second hidden cell group (HCG2), the (k-1)-th hidden cell (HCk-1), applies the state weight (Wh) to the state value (Hk-2) of the (k-2)-th stage (Sk-2) to compute the state value (Hk-1) of the current (k-1)-th stage (Sk-1).

Next, the k-th hidden cell (HCk) of the third hidden cell group (HCG3) applies the state weight (Wh) to the state value (Hk-1) of the previous (k-1)-th stage (Sk-1) to compute the state value (Hk) of the current k-th stage (Sk). The k-th hidden cell (HCk) then applies the output weight (Wy) to the state value (Hk) of the k-th stage (Sk) to compute the output value, that is, the navigation prediction vector (Yk).

In summary, in the navigation prediction network 240 according to an embodiment of the present invention, each hidden cell of the first hidden cell group (HCG1) applies the state and input weights (Wh, Wx) to the state value (Ht-1) of the previous stage and the input value (Xt) of the current stage to compute the state value (Ht) of the current stage, and passes the computed state value (Ht) to the next stage. Each hidden cell of the second hidden cell group (HCG2) applies the state weight (Wh) to the state value (Ht-1) of the previous stage to compute the state value (Ht) of the current stage, and passes it to the next stage. The hidden cell of the third hidden cell group (HCG3), which is the last hidden cell, applies the state weight (Wh) to the state value (Ht-1) of the previous stage to compute the state value (Ht) of the current stage, and then applies the output weight (Wy) to that state value to compute the output value, that is, the navigation prediction vector (Yk). The navigation prediction vector (Yk) is therefore the final value obtained after a state value that reflects both the inputs and the earlier states has been carried through to the last hidden cell. Accordingly, the navigation prediction network 240 can compute, from the plurality of navigation vectors (Xt) representing the navigation state measurable up to the present (n), a navigation prediction vector (Yk) that predicts the navigation state of the object (obj) after a predetermined time (k-n, k>n).
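The three hidden cell groups described above can be sketched as a single loop over k stages: the first n stages take both the previous state and an input, the later stages propagate the state only, and the output weight is applied once at stage k. The following minimal sketch assumes the tanh state update and ReLU output that the text around Equation 1 suggests; the function names, the separate biases `b_h` and `b_y`, and the plain-list matrix representation are all illustrative assumptions, not the patented implementation.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def predict_navigation(inputs, k, Wx, Wh, Wy, b_h, b_y, h0):
    """Return the navigation prediction vector Yk.

    inputs -- navigation vectors X1..Xn in chronological order
    k      -- total number of stages, with k > n
    Wx, Wh, Wy -- input, state, and output weight matrices
    b_h, b_y   -- state and output biases (assumed to be separate here)
    h0     -- initial state value H0
    """
    n = len(inputs)
    if k <= n:
        raise ValueError("k must be greater than the number of inputs n")
    h = list(h0)
    for t in range(1, k + 1):
        pre = matvec(Wh, h)  # state weight applied to H(t-1) in every group
        if t <= n:
            # First hidden cell group: the input weight is applied to Xt as well.
            pre = [p + x for p, x in zip(pre, matvec(Wx, inputs[t - 1]))]
        # Stages n+1..k (second and third groups) receive no input.
        h = [math.tanh(p + b) for p, b in zip(pre, b_h)]
    # Third hidden cell group: output weight applied to Hk, with ReLU.
    return [max(0.0, y + b) for y, b in zip(matvec(Wy, h), b_y)]
```

With n = 4 inputs and k = 7, stages 1 to 4 correspond to HCG1, stages 5 and 6 to HCG2, and stage 7 to HCG3.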

Next, a method for recognizing an approaching vessel in consideration of marine object distance based on a deep neural network according to the first embodiment of the present invention is described. FIG. 10 is a flowchart illustrating this method according to the first embodiment.

Referring to FIG. 10, in step S110 the preprocessing unit 100 of the controller 16 continuously photographs the surroundings of the vessel through the camera unit 12, as shown in FIG. 2, to generate a surveillance image (SV) of a predetermined size (SW×SH). The surveillance image (SV) consists of a plurality of frames (Ft) arranged in chronological order. The surveillance image (SV) may be displayed through the display unit 14.

Next, in step S120, as shown in FIG. 2, the preprocessing unit 100 detects an object image (OVt) for each frame (Ft) of the captured surveillance image (SV) by specifying, through feature point detection, an area box (B) indicating the area occupied by an object (obj) in the surveillance image (SV). The area box (B) represents the area occupied by the object (obj) in the surveillance image (SV) as a rectangle, namely the smallest rectangle that contains the entire object (obj). The area box (B) has center coordinates (x, y), a width (w), and a height (h). The object image (OVt) detected by the preprocessing unit 100 is input to the object identification network 210 of the deep neural network 200.

In step S130, the object identification network 210 then performs a plurality of weighted operations to compute, as a probability, whether the object (obj) in the object image (OVt) is a vessel. That is, the output values of the two output nodes (O1, O2) of the object identification network 210 represent, respectively, the probability that the object (obj) in the input object image (OV) is a vessel and the probability that it is not a vessel.

Accordingly, in step S140 the control unit 400 determines whether the object (obj) is a vessel according to the probability computed by the object identification network 210. For example, if the output value of the first output node (O1) is 0.846 and the output value of the second output node (O2) is 0.154, the probability that the object (obj) is a vessel is about 85% and the probability that it is not a vessel is about 15%. In this case, the control unit 400 may determine that the object (obj) is a vessel. Conversely, if the output value of the first output node (O1) is 0.444 and the output value of the second output node (O2) is 0.556, the probability that the object (obj) is a vessel is about 44% and the probability that it is not a vessel is about 56%. In this case, the control unit 400 may determine that the object (obj) is not a vessel.
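The two examples above are consistent with a rule that picks whichever output node is larger. The sketch below assumes exactly that rule; the specification only says that the determination follows the probability, so the comparison and the function name `is_vessel` are assumptions.

```python
def is_vessel(o1, o2):
    """Decide vessel / non-vessel from the two output nodes.

    o1 -- output of node O1, the probability that the object is a vessel
    o2 -- output of node O2, the probability that the object is not a vessel
    Assumed rule: the object is a vessel when O1 exceeds O2.
    """
    return o1 > o2
```

With the values from the description, `is_vessel(0.846, 0.154)` is true and `is_vessel(0.444, 0.556)` is false.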

If the determination in step S140 is that the object (obj) is a vessel, the control unit 400 computes, in step S150, a risk indicating the probability that a collision with that object (obj) will occur. To do so, the control unit 400 computes the area (w×h) of the object image from the coordinates of the area box (B) of the object (obj) in the surveillance image (SV), and then computes the risk from the ratio between the area of the surveillance image (SW×SH) and the area of the object image (w×h). That is, the larger the share of the surveillance image (SV) occupied by the object image (OV), the higher the risk the control unit 400 assigns.

In step S160, the control unit 400 determines whether the computed risk is greater than or equal to a preset threshold. If the risk is greater than or equal to the threshold, the control unit 400 determines that there is a risk of collision. Accordingly, in step S170 the control unit 400 transmits, through the communication unit 11, a warning message notifying the risk of collision to a user device, an alarm, a control server, or the like. The warning message includes the image and location information of the object concerned.
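Steps S150 and S160 can be sketched as an area ratio followed by a threshold comparison. The sketch assumes the risk is the fraction of the surveillance image covered by the area box, which matches the statement that a larger share means a higher risk; the exact formula, the function names, and any particular threshold value are assumptions not given in the specification.

```python
def collision_risk(box_w, box_h, frame_w, frame_h):
    """Risk as the share of the surveillance image (SW x SH) covered by the box (w x h)."""
    if frame_w <= 0 or frame_h <= 0:
        raise ValueError("surveillance image dimensions must be positive")
    return (box_w * box_h) / (frame_w * frame_h)


def should_warn(risk, threshold):
    """Step S160: a collision risk exists when the risk is at or above the threshold."""
    return risk >= threshold
```

For a 480×270 area box in a 1920×1080 surveillance image the risk is 0.0625, so a warning would be sent for any threshold at or below that value.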

Meanwhile, according to an embodiment of the present invention, step S120 described above detects the object image in a different manner depending on the distance to the object (obj). Step S120 is now described in more detail. FIG. 11 is a flowchart illustrating a method of detecting an object image according to an embodiment of the present invention. FIG. 12 is a diagram illustrating a method of detecting an object image corresponding to a near object according to an embodiment of the present invention. FIG. 13 is a diagram illustrating a method of detecting an object image corresponding to a distant object according to an embodiment of the present invention.

Referring to FIG. 11, when a surveillance image (SV) is input in step S210, the preprocessing unit 100 attempts to extract feature points with a predetermined algorithm in step S220. Examples of feature point extraction algorithms include Harris Corner, Shi-Tomasi, SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features), BRIEF (Binary Robust Independent Elementary Features), ORB (Oriented FAST and Rotated BRIEF), FAST (Features from Accelerated Segment Test), and AGAST; preferably, FAST is used.

After attempting to extract feature points as described above, the preprocessing unit 100 determines in step S230 whether feature points have been extracted. If feature points have been extracted, the preprocessing unit 100 proceeds to step S240; otherwise it proceeds to step S250. When feature points are extracted, the preprocessing unit 100 judges that the object in the surveillance image (SV) is a near object located closer than a predetermined reference and, in step S240, performs the object image detection procedure for near objects. When no feature points are extracted, the preprocessing unit 100 judges that the object in the surveillance image (SV) is a distant object located farther than the predetermined reference and, in step S250, performs the object image detection procedure for distant objects.
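The branch in steps S220 to S250 can be sketched as a small dispatcher. The feature extractor and the two detection procedures are passed in as callables because their implementations are described separately; the function name `detect_object_image` and the returned `"near"` / `"far"` tags are hypothetical.

```python
def detect_object_image(frame, extract_feature_points, detect_near, detect_far):
    """Choose the detection procedure according to whether feature points are found.

    extract_feature_points -- callable returning a list of feature points (step S220)
    detect_near            -- near-object detection procedure (step S240)
    detect_far             -- distant-object detection procedure (step S250)
    """
    points = extract_feature_points(frame)   # S220: attempt extraction
    if points:                               # S230: were any points extracted?
        return "near", detect_near(frame)    # S240
    return "far", detect_far(frame)          # S250
```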

Steps S240 and S250 are now described in more detail. First, the object image detection method for near objects is described with reference to FIG. 12, which corresponds to step S240 and illustrates it in more detail. Referring to FIG. 12, when a surveillance image (SV) such as that of FIG. 12 is input in step S241, the preprocessing unit 100 blurs the input surveillance image (SV) in step S242 so that the image as a whole is softened. Then, in step S243, the preprocessing unit 100 applies grayscale conversion to the surveillance image (SV), changing the three-channel RGB image into a single-channel image. Next, in step S244, the preprocessing unit 100 detects a plurality of corner points in the surveillance image (SV) using the Harris corner algorithm. The preprocessing unit 100 then finds the dense region of the corner points detected in step S244 and detects the object image (OV) through a rectangular area box (B) indicating the area occupied by the object.
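The specification does not say how the dense region of corner points is found. One simple possibility, shown here purely as an assumption, is to keep the corner points that have enough neighbors within a fixed radius and to take the smallest rectangle around them, reported as center coordinates, width, and height like the area box (B). The function name and the `radius` and `min_neighbors` parameters are hypothetical.

```python
def dense_area_box(corners, radius=20.0, min_neighbors=3):
    """Return the area box (x, y, w, h) around densely clustered corner points.

    corners       -- list of (x, y) corner coordinates
    radius        -- neighborhood radius in pixels (assumed parameter)
    min_neighbors -- neighbors required for a point to count as dense (assumed)
    Returns None when no point is dense. (x, y) is the box center.
    """
    r2 = radius * radius
    dense = []
    for px, py in corners:
        # Count points within the radius, excluding the point itself.
        close = sum(1 for qx, qy in corners
                    if (qx - px) ** 2 + (qy - py) ** 2 <= r2) - 1
        if close >= min_neighbors:
            dense.append((px, py))
    if not dense:
        return None
    xs = [p[0] for p in dense]
    ys = [p[1] for p in dense]
    w = max(xs) - min(xs)
    h = max(ys) - min(ys)
    return (min(xs) + w / 2, min(ys) + h / 2, w, h)
```

Isolated corner points, such as a stray response on a wave crest, fall outside the box because they lack neighbors.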

Next, the object image detection method for distant objects is described with reference to FIG. 13, which corresponds to step S250 and illustrates it in more detail. Referring to FIG. 13, when a surveillance image (SV) such as that of FIG. 13 is input in step S251, the preprocessing unit 100 blurs the input surveillance image (SV) in step S252 so that the image as a whole is softened. Then, in step S253, the preprocessing unit 100 applies grayscale conversion to the surveillance image (SV), changing the three-channel RGB image into a single-channel image.

Next, in step S254, the preprocessing unit 100 applies Canny edge detection to find lines of high-frequency components and obtain binarized lines. Then, in step S255, the preprocessing unit 100 detects the horizon through the Hough transform. The Hough transform yields straight lines ranging from long to short. Because the longest straight line at sea is the horizon, the preprocessing unit 100 computes the length of each line from the coordinates of its two endpoints using the Pythagorean theorem and takes the longest line as the horizon.
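The horizon selection in step S255 amounts to computing each segment's length from its endpoints and keeping the maximum. A minimal sketch, assuming the Hough transform has already produced segments as (x1, y1, x2, y2) tuples; the function name and tuple layout are illustrative.

```python
import math


def longest_segment(segments):
    """Return the longest line segment, taken to be the horizon.

    segments -- iterable of (x1, y1, x2, y2) endpoint tuples
    The length is sqrt((x2 - x1)^2 + (y2 - y1)^2), i.e. the Pythagorean theorem.
    """
    return max(segments, key=lambda s: math.hypot(s[2] - s[0], s[3] - s[1]))
```

Calling it with an empty collection raises `ValueError`, which a caller could treat as "no horizon found".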

After detecting the horizon, in step S256 the preprocessing unit 100 removes the part below the horizon from the surveillance image (SV) originally input in step S251, leaving only the image above the horizon. Next, in step S257, the preprocessing unit 100 enlarges the image above the horizon to a predetermined size. Then, in step S258, the preprocessing unit 100 detects a plurality of corner points in the enlarged image above the horizon using the Harris corner algorithm. Finally, in step S259, the preprocessing unit 100 finds the dense region of the detected corner points and detects the object image (OV) through a rectangular area box (B) indicating the area occupied by the object.
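Steps S256 and S257 can be sketched on an image stored as a list of pixel rows, with row 0 at the top. The sketch assumes a level horizon at a single row index and an integer enlargement factor with nearest-neighbor sampling; the specification fixes neither the interpolation method nor the target size, and both function names are hypothetical.

```python
def crop_above_horizon(image, horizon_row):
    """Keep only the rows above the horizon (step S256).

    image       -- list of rows, each a list of pixel values; row 0 is the top
    horizon_row -- row index of the horizon (assumed level)
    """
    return [row[:] for row in image[:horizon_row]]


def enlarge(image, factor):
    """Enlarge by an integer factor with nearest-neighbor sampling (step S257)."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out
```

Enlarging the cropped strip makes a distant vessel span more pixels before the corner detector runs in step S258.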

Next, a method for recognizing an approaching vessel in consideration of marine object distance based on a deep neural network according to the second embodiment of the present invention is described. FIG. 14 is a flowchart illustrating this method according to the second embodiment.

Referring to FIG. 14, in step S310 the preprocessing unit 100 of the controller 16 continuously photographs the surroundings of the vessel through the camera unit 12, as shown in FIG. 2, to generate a surveillance image (SV) of a predetermined size (SW×SH). The surveillance image (SV) consists of a plurality of frames (Ft) arranged in chronological order. The surveillance image (SV) may be displayed through the display unit 14.

Next, in step S320, as shown in FIG. 2, the preprocessing unit 100 generates a plurality of object images (OVt) by detecting, through an area box (B), the area occupied by the object (obj) in each frame (Ft) of the captured surveillance image (SV). The area box (B) represents the area occupied by the object (obj) in the surveillance image (SV) as a rectangle, namely the smallest rectangle that contains the entire object (obj). The area box (B) has center coordinates (x, y), a width (w), and a height (h). The method of detecting the object image (OVt) in step S320 is the same as in the first embodiment. That is, as described with reference to FIGS. 11 to 13, the preprocessing unit 100 attempts to extract feature points. If feature points are extracted, it judges that the object in the surveillance image (SV) is a near object located closer than the reference and performs the object image detection procedure for near objects (S240). If no feature points are extracted, it judges that the object in the surveillance image (SV) is a distant object located farther than the reference and performs the object image detection procedure for distant objects (S250).

In step S330, the preprocessing unit 100 inputs to the deep neural network 200 according to the second embodiment the plurality of object images OVt, a plurality of area vectors Bt representing the coordinates of the area box B of each object image OVt on the surveillance image SV, and time vectors Tt representing the time at which each object image OVt was generated from the surveillance image SV. The direction identification network 220 of the deep neural network 200 then receives the object images OVt one after another and calculates the probability that the object obj in each object image OVt is a vessel and, if it is a vessel, the probability that its navigation direction corresponds to each of a plurality of directions (0, 45, 90, 135, 180, 225, 270 and 315 degrees). Here it is assumed that the object obj in every one of the object images OVt is a vessel. Accordingly, in step S340 the direction identification network 220 calculates, for the object obj of each object image OVt, the probability that its navigation direction corresponds to each of the plurality of directions, and these probabilities form the direction vector Dt. The direction identification network 220 can generate the direction vector Dt by outputting only the values of the first to eighth output nodes a1 to a8 among its first to ninth output nodes a1 to a9. For example, when the first to fourth object images OV1, OV2, OV3 and OV4 of FIG. 4 are input to the direction identification network 220, the resulting first to fourth direction vectors D1, D2, D3 and D4 may be D1=[0.01, 0.77, 0.02, 0.11, 0.09, 0.06, 0.02, 0.03], D2=[0.01, 0.81, 0.02, 0.11, 0.09, 0.06, 0.02, 0.03], D3=[0.01, 0.01, 0.79, 0.11, 0.03, 0.04, 0.12, 0.03] and D4=[0.01, 0.02, 0.11, 0.83, 0.09, 0.06, 0.02, 0.03]. That is, the navigation direction of the vessel changes through 90, 90, 135 and 180 degrees.
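The reading of the example direction vectors can be reproduced by picking the most probable direction from each vector. The sketch below assumes that output node a1 corresponds to 45 degrees, a2 to 90 degrees and so on, with a8 wrapping around to 0 degrees; the source does not state this mapping, but it is the one under which the example vectors read as 90, 90, 135 and 180 degrees. The names `HEADINGS` and `most_likely_direction` are hypothetical.

```python
# Assumed mapping from output nodes a1..a8 to directions in degrees.
HEADINGS = [45, 90, 135, 180, 225, 270, 315, 0]


def most_likely_direction(direction_vector):
    """Return the direction (degrees) whose probability is highest."""
    if len(direction_vector) != len(HEADINGS):
        raise ValueError("expected one probability per direction")
    best = max(range(len(direction_vector)), key=direction_vector.__getitem__)
    return HEADINGS[best]


D1 = [0.01, 0.77, 0.02, 0.11, 0.09, 0.06, 0.02, 0.03]
D2 = [0.01, 0.81, 0.02, 0.11, 0.09, 0.06, 0.02, 0.03]
D3 = [0.01, 0.01, 0.79, 0.11, 0.03, 0.04, 0.12, 0.03]
D4 = [0.01, 0.02, 0.11, 0.83, 0.09, 0.06, 0.02, 0.03]

directions = [most_likely_direction(d) for d in (D1, D2, D3, D4)]
```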

Next, in step S350, the adder 230 of the deep neural network 200 adds to each of the direction vectors Dt the corresponding area vector Bt and time vector Tt, thereby generating a plurality of navigation vectors Xt arranged in chronological order. For example, the first to fourth navigation vectors X1, X2, X3 and X4 corresponding to the first to fourth object images OV1, OV2, OV3 and OV4 of FIG. 4 are derived as X1=D1+B1+T1, X2=D2+B2+T2, X3=D3+B3+T3 and X4=D4+B4+T4.
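One way to picture the navigation vectors is as a per-frame record that joins the three component vectors. The source writes the combination with "+" and does not say whether this is an element-wise sum or a concatenation; the sketch below assumes concatenation, since the components have different meanings and lengths, and it assumes a one-element time vector. All function names and sample values are hypothetical.

```python
def navigation_vector(direction, area, time):
    """Join D_t (direction probabilities), B_t (x, y, w, h) and T_t (time)."""
    return list(direction) + list(area) + list(time)


def navigation_sequence(directions, areas, times):
    """Build one navigation vector per frame, ordered by time."""
    if not (len(directions) == len(areas) == len(times)):
        raise ValueError("one direction, area and time vector per frame")
    frames = [navigation_vector(d, b, t)
              for d, b, t in zip(directions, areas, times)]
    # The time value is the last component of each navigation vector.
    return sorted(frames, key=lambda x: x[-1])


d = [0.01, 0.77, 0.02, 0.11, 0.09, 0.06, 0.02, 0.03]
xs = navigation_sequence(
    directions=[d, d],
    areas=[[130, 80, 40, 30], [120, 80, 40, 30]],
    times=[[2.0], [1.0]],
)
```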

Next, in step S360, the navigation prediction network 240 of the deep neural network 200 calculates a navigation prediction vector by performing a plurality of weighted operations on the plurality of navigation vectors. In the navigation prediction network 240, each hidden cell of the first hidden cell group HCG1 performs an operation that applies the state weight Wh and the input weight Wx to the state value Ht-1 of the previous stage and to the navigation vector Xt, which is the input value of the current stage, calculates the state value Ht of the current stage, and passes the calculated state value Ht to the next stage. Each hidden cell of the second hidden cell group HCG2 then performs an operation that applies the state weight Wh to the state value Ht-1 of the previous stage, calculates the state value Ht of the current stage, and passes the calculated state value Ht to the next stage. Finally, the hidden cell of the third hidden cell group, which is the last hidden cell, performs an operation that applies the state weight Wh to the state value Ht-1 of the previous stage to calculate the state value Ht of the current stage, and then applies the output weight Wy to that state value Ht to calculate the output value, that is, the navigation prediction vector Yk. In this way, the navigation prediction network 240 can calculate, from the plurality of navigation vectors Xt that express the trend of the navigation state of the object obj, a navigation prediction vector Yk that predicts the navigation state of the object obj after a predetermined time (k-n, k>n). The navigation prediction vector Yk may include the direction vector Dk, the area vector Bk and the time vector Tk after the predetermined time (k-n, k>n) (Yk=Dk+Bk+Tk).
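The stage-by-stage computation described above can be sketched as a plain recurrent network in which the first stages consume the navigation vectors and the remaining stages only propagate the state before the last one emits the prediction. This is a minimal sketch under assumptions: the source does not name an activation function (tanh is assumed here), the initial state is assumed to be zero, and the function names are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def stage(h_prev, w_h, x=None, w_x=None):
    """One stage: H_t = tanh(Wh * H_{t-1} [+ Wx * X_t])."""
    pre = matvec(w_h, h_prev)
    if x is not None:
        pre = [p + q for p, q in zip(pre, matvec(w_x, x))]
    return [math.tanh(p) for p in pre]


def predict(xs, state_only_stages, w_x, w_h, w_y):
    """Return Y_k from navigation vectors X_1..X_n.

    state_only_stages = k - n: stages without input (second group plus
    the final cell), the last of which applies the output weight Wy.
    """
    if state_only_stages < 1:
        raise ValueError("at least the final output stage is required")
    h = [0.0] * len(w_h)                   # assumed zero initial state
    for x in xs:                           # first hidden cell group
        h = stage(h, w_h, x, w_x)
    for _ in range(state_only_stages):     # second group and final cell
        h = stage(h, w_h)
    return matvec(w_y, h)                  # Y_k = Wy * H_k


y = predict([[0.5]], 1, w_x=[[1.0]], w_h=[[1.0]], w_y=[[2.0]])
```

In a trained network the weight matrices would come from the learning unit; the scalar weights here only keep the example small.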

Accordingly, in step S370 the control unit 400 analyzes the risk according to whether, at the time indicated by the time vector Tk of the navigation prediction vector Yk, the area box B derived from the area vector Bk at least partially overlaps a preset warning area in the surveillance image SV and whether the direction vector Dk points in a collision-possible direction. In step S380 the control unit 400 then determines whether the risk is greater than or equal to a threshold.

In doing so, the control unit 400 may determine that the risk is greater than or equal to the threshold if, at the time indicated by the time vector Tk of the navigation prediction vector Yk, the area box B derived from the area vector Bk at least partially overlaps the preset warning area in the surveillance image SV and the direction vector Dk points in a collision-possible direction. For example, suppose that the surveillance image SV of FIG. 4 is divided into 12 cells in 3 rows and 4 columns, that the cells at row 2, column 3 and row 3, column 3 [(2, 3), (3, 3)] form the warning area, and that any direction pointing toward this area is a collision-possible direction. As shown in FIG. 4, according to the derived navigation prediction vector Yk, at the time k indicated by the time vector Tk the area box B of the area vector Bk partially overlaps the cell at row 3, column 3 (3, 3) of the surveillance image SV, and the direction vector Dk points toward that cell. In this case, the control unit 400 determines that the risk, which indicates the danger of collision, is greater than or equal to the preset threshold. Then, in step S390, the control unit 400 transmits an alarm message warning of the collision risk to a user device, an alarm, a control server or the like through the communication unit 11. The alarm message includes the image and position information of the current state, that is, the object image OV4, the area vector B4 and the time vector T4 associated with the navigation vector X4.
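The warning decision in the example can be sketched as two checks: whether the predicted area box overlaps any warning cell of the 3-by-4 grid, and whether the predicted direction is one of the collision-possible directions. The image size, the set of collision-possible directions and the function names below are hypothetical; the source only states that directions pointing toward the warning area count as collision-possible.

```python
def overlapped_cells(box, image_w, image_h, rows=3, cols=4):
    """Return the 1-based (row, col) grid cells that the box overlaps.

    box is (x, y, w, h) with (x, y) the center of the area box.
    """
    cx, cy, w, h = box
    x0, x1 = cx - w / 2, cx + w / 2
    y0, y1 = cy - h / 2, cy + h / 2
    cell_w, cell_h = image_w / cols, image_h / rows
    cells = set()
    for r in range(rows):
        for c in range(cols):
            left, top = c * cell_w, r * cell_h
            # Strict inequalities: touching an edge is not an overlap.
            if (x0 < left + cell_w and x1 > left
                    and y0 < top + cell_h and y1 > top):
                cells.add((r + 1, c + 1))
    return cells


def collision_risk(box, direction, image_w, image_h,
                   warning_cells, collision_directions):
    """True if the box overlaps a warning cell and the direction is risky."""
    hit = overlapped_cells(box, image_w, image_h) & set(warning_cells)
    return bool(hit) and direction in collision_directions


WARNING_CELLS = {(2, 3), (3, 3)}
risky = collision_risk((310, 250, 40, 40), 180, 400, 300,
                       WARNING_CELLS, collision_directions={180})
```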

As described above, according to an embodiment of the present invention, the navigation state of an object after a predetermined time is predicted from a plurality of navigation vectors using a deep neural network, so that the possibility of a collision can be predicted in advance and a warning can be issued when a collision is possible.

Meanwhile, the method according to the embodiments of the present invention described above may be implemented in the form of a program readable by various computer means and recorded on a computer-readable recording medium. The recording medium may include program instructions, data files, data structures and the like, alone or in combination. The program instructions recorded on the recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of the recording medium include magnetic media such as hard disks, floppy disks and magnetic tapes, optical media such as CD-ROMs and DVDs, magneto-optical media such as floptical disks, and hardware devices specially configured to store and execute program instructions, such as ROM, RAM and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. Such a hardware device may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

Although the present invention has been described above using several preferred embodiments, these embodiments are illustrative and not restrictive. Those of ordinary skill in the art to which the present invention pertains will understand that various changes and modifications can be made under the doctrine of equivalents without departing from the spirit of the present invention and the scope of rights set forth in the appended claims.

10: control device 11: communication unit
12: camera unit 13: input unit
14: display unit 15: storage unit
16: control unit 20: service server
30: shopping mall server 100: preprocessing unit
200: deep neural network 210: object identification network
220: direction identification network 230: adder
240: navigation prediction network 300: learning unit
400: control unit

Claims

A device for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network, the device comprising:
a camera unit for capturing a surveillance image;
a preprocessing unit for detecting an object image by specifying, through feature point detection, an area box representing the area occupied by an object included in the captured surveillance image;
a deep neural network for outputting, as a probability, whether the object of the object image is a vessel; and
a control unit for determining whether the object is a vessel according to the probability, calculating, if the object is determined to be a vessel, a risk indicating the probability of a collision with the vessel, and warning of a collision risk if the calculated risk is greater than or equal to a threshold,
wherein the preprocessing unit,
if no feature point is detected in the surveillance image,
detects a horizon line in the surveillance image, enlarges the image above the detected horizon line to a predetermined size, detects a plurality of corner points in the enlarged image above the horizon line using the Harris corner algorithm, specifies an area box representing the area occupied by an object by finding a dense area of the detected corner points, and detects the object image through the specified area box.
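For readers unfamiliar with the Harris corner algorithm named in this claim, the following is a minimal sketch of the corner response it is based on, R = det(M) - k·trace(M)², where M sums products of image gradients over a small window. The 3×3 window, central-difference gradients, k = 0.04, the threshold and the function names are assumptions for illustration; the source does not give these parameters.

```python
def harris_response(img, y, x, k=0.04):
    """Harris response at pixel (y, x) of a 2-D list of intensities.

    Uses a 3x3 window and central differences, so (y, x) must lie at
    least 2 pixels away from every image border.
    """
    sxx = syy = sxy = 0.0
    for j in range(y - 1, y + 2):
        for i in range(x - 1, x + 2):
            ix = (img[j][i + 1] - img[j][i - 1]) / 2.0
            iy = (img[j + 1][i] - img[j - 1][i]) / 2.0
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    return sxx * syy - sxy * sxy - k * (sxx + syy) ** 2


def corner_points(img, threshold, k=0.04):
    """Return (x, y) positions whose response exceeds the threshold."""
    h, w = len(img), len(img[0])
    return [(x, y)
            for y in range(2, h - 2)
            for x in range(2, w - 2)
            if harris_response(img, y, x, k) > threshold]


# Synthetic image: a bright block whose top-left corner is at (4, 4).
img = [[1.0 if (r >= 4 and c >= 4) else 0.0 for c in range(12)]
       for r in range(12)]
corners = corner_points(img, threshold=0.5)
```

The response is positive at the block corner, negative along a straight edge and zero in flat regions, which is why thresholding it yields corner points.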

The device of claim 1,
wherein the preprocessing unit,
when a feature point is detected in the surveillance image,
detects a plurality of corner points in the surveillance image using the Harris corner algorithm, specifies an area box representing the area occupied by an object by finding a dense area of the detected corner points, and detects the object image through the specified area box.

(deleted)

The device of claim 1,
wherein the control unit
calculates the area of the object image from the coordinates of the area box of the object, and
determines that the risk is greater than or equal to the threshold if the ratio of the area of the object image to the area of the surveillance image is greater than or equal to a preset threshold.
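The area-based risk test in this claim amounts to a single ratio comparison. The sketch below reads the ratio as the object-image area divided by the surveillance-image area, so that a larger (closer) object yields a larger value; this reading, the function name and the sample numbers are assumptions.

```python
def area_ratio_exceeds(box, image_w, image_h, threshold):
    """True if the area box covers at least `threshold` of the image.

    box is (x, y, w, h); only the width and height are needed.
    """
    _, _, w, h = box
    ratio = (w * h) / (image_w * image_h)
    return ratio >= threshold


# A 100x50 box in a 400x300 image covers about 4.2 % of the image.
close_enough = area_ratio_exceeds((200, 150, 100, 50), 400, 300, 0.04)
```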

The device of claim 1,
wherein the deep neural network comprises:
an input layer into which an object image is input;
at least one convolutional layer including at least one feature image derived by a convolution operation on the object image or on a feature image;
at least one pooling layer including at least one feature image derived by a pooling operation on the feature image generated by the convolution operation;
at least one fully connected layer including a plurality of operation nodes that receive a feature image or node values of the previous layer and calculate node values through an operation using an activation function; and
an output layer including a plurality of output nodes that receive the node values of the fully connected layer and calculate output values through an operation using an activation function.

A device for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network, the device comprising:
a camera unit for capturing a surveillance image including a plurality of frames;
a preprocessing unit for sequentially detecting a plurality of object images through an area box indicating the area occupied by an object in each of the plurality of frames;
a deep neural network for generating a plurality of navigation vectors arranged in chronological order and corresponding to the respective object images, and calculating a navigation prediction vector that predicts the navigation state of the object after a predetermined time by performing a plurality of weighted operations on the plurality of navigation vectors; and
a control unit for deriving, from the navigation prediction vector, the navigation direction of the object after the predetermined time and an area box indicating the area occupied by the object in the surveillance image, and warning of a collision risk when the derived navigation direction is a collision-possible direction and the derived area box at least partially overlaps a warning area of the surveillance image,
wherein the deep neural network comprises:
a direction identification network for deriving a plurality of direction vectors indicating the navigation direction of the object in each of the plurality of object images;
an adder for generating the navigation vectors by combining each direction vector with an area vector representing the coordinates, on the surveillance image, of the area box of the corresponding object image and with a time vector representing the time at which the corresponding object image was generated; and
a navigation prediction network consisting of a plurality of stages arranged in sequence and comprising:
a first hidden cell group including a plurality of hidden cells, each of which calculates the state value of the current stage by performing an operation that applies a state weight and an input weight to the state value of the previous stage and to the navigation vector that is the input value of the current stage, and passes the calculated state value to the next stage;
a second hidden cell group including a plurality of hidden cells, each of which calculates the state value of the current stage by performing an operation that applies the state weight to the state value of the previous stage, and passes the calculated state value to the next stage; and
a third hidden cell group including a hidden cell that calculates the state value of the current stage by performing an operation that applies the state weight to the state value of the previous stage, and then calculates the output value, namely the navigation prediction vector, by performing an operation that applies an output weight to the calculated state value of the current stage.

(deleted)

The device of claim 6,
wherein the navigation prediction network comprises:
a recurrent input layer receiving the plurality of navigation vectors arranged in chronological order;
a recurrent hidden layer including a plurality of hidden cells that calculate the navigation prediction vector by performing one or more weighted operations in the order of the plurality of stages corresponding to the respective navigation vectors; and
a recurrent output layer outputting the calculated risk level.

The device of claim 6,
wherein the preprocessing unit,
when a feature point is detected in the surveillance image,
detects a plurality of corner points in the surveillance image using the Harris corner algorithm, specifies an area box representing the area occupied by an object by finding a dense area of the detected corner points, and detects the object image through the specified area box, and,
if no feature point is detected in the surveillance image,
detects a horizon line in the surveillance image, enlarges the image above the detected horizon line to a predetermined size, detects a plurality of corner points in the enlarged image above the horizon line using the Harris corner algorithm, specifies an area box representing the area occupied by an object by finding a dense area of the detected corner points, and detects the object image through the specified area box.

A method for recognizing an approaching vessel in consideration of the distance of a marine object based on a deep neural network, the method comprising:
capturing, by a camera unit, a surveillance image;
detecting, by a preprocessing unit, an object image by specifying, through feature point detection, an area box indicating the area occupied by an object included in the captured surveillance image;
outputting, by a deep neural network, as a probability, whether the object of the object image is a vessel;
determining whether the object is a vessel according to the probability;
calculating, by a control unit, if the object is determined to be a vessel, a risk indicating the probability of a collision with the vessel; and
warning of a collision risk when the calculated risk is greater than or equal to a threshold,
wherein the step of detecting the object image comprises:
detecting, by the preprocessing unit, a horizon line in the surveillance image if no feature point is detected in the surveillance image;
enlarging, by the preprocessing unit, the image above the detected horizon line to a predetermined size;
detecting, by the preprocessing unit, a plurality of corner points in the enlarged image above the horizon line using the Harris corner algorithm;
specifying, by the preprocessing unit, an area box indicating the area occupied by an object by finding a dense area of the detected corner points; and
detecting, by the preprocessing unit, the object image through the specified area box.
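The enlarging step for far objects can be illustrated with a small sketch that crops the rows above a given horizon row and upscales them by an integer factor. The claim does not say how the horizon is found or which interpolation is used; here the horizon row is taken as an input, "above" is read as rows with a smaller index, and nearest-neighbor replication is assumed. The function name is hypothetical.

```python
def enlarge_above_horizon(image, horizon_row, scale):
    """Crop rows [0, horizon_row) and upscale them by an integer factor.

    image is a 2-D list of pixel values with row 0 at the top.
    """
    if not 0 < horizon_row <= len(image):
        raise ValueError("horizon_row must lie inside the image")
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    enlarged = []
    for row in image[:horizon_row]:
        # Repeat every pixel horizontally, then repeat the row vertically.
        wide = [value for value in row for _ in range(scale)]
        enlarged.extend(list(wide) for _ in range(scale))
    return enlarged


sky = enlarge_above_horizon([[1, 2], [3, 4], [5, 6]], horizon_row=2, scale=2)
```

Small, distant objects near the horizon occupy more pixels after this step, which gives the corner detector more structure to respond to.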

The method of claim 11,
wherein the step of detecting the object image comprises:
detecting, by the preprocessing unit, a plurality of corner points in the surveillance image using the Harris corner algorithm when a feature point is detected in the surveillance image;
specifying, by the preprocessing unit, an area box indicating the area occupied by an object by finding a dense area of the detected corner points; and
detecting, by the preprocessing unit, the object image through the specified area box.

(deleted)

The method of claim 11,
wherein the step of calculating the risk comprises:
calculating, by the control unit, the area of the object image from the coordinates of the area box of the object; and
determining that the risk is greater than or equal to the threshold when the ratio of the area of the object image to the area of the surveillance image is greater than or equal to a preset threshold.

The method of claim 11,
wherein the step of outputting, by the deep neural network, as a probability, whether the object of the object image is a vessel comprises:
receiving, by an input layer of the deep neural network, the object image;
deriving, by a first convolutional layer of the deep neural network, at least one feature image by performing a convolution operation using a filter on the object image;
deriving, by a first pooling layer of the deep neural network, at least one feature image by performing a pooling operation using a filter on the feature image of the first convolutional layer;
deriving, by a second convolutional layer of the deep neural network, at least one feature image by performing a convolution operation using a filter on the feature image of the first pooling layer;
deriving, by a second pooling layer of the deep neural network, at least one feature image by performing a pooling operation using a filter on the feature image of the second convolutional layer;
calculating, by a plurality of operation nodes of a first fully connected layer of the deep neural network, node values through an operation using an activation function on the feature image of the second pooling layer;
calculating, by a plurality of operation nodes of a second fully connected layer of the deep neural network, node values through an operation using an activation function on the node values of the first fully connected layer; and
calculating, by a plurality of output nodes of an output layer of the deep neural network, an output value that is the probability that the object of the object image is a vessel, through an operation using an activation function on the node values of the second fully connected layer.
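The sequence of layers in this claim (convolution, pooling, convolution, pooling, two fully connected layers, output layer) can be traced with a very small forward pass. This is only a sketch under assumptions: a single feature image per layer, ReLU activations, 2×2 max pooling, a two-node softmax output (vessel, not vessel) and hand-picked weights; none of these choices is specified in the claim, and all names are hypothetical.

```python
import math


def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation form) of image by kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)] for y in range(out_h)]


def relu_map(fmap):
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a size x size filter."""
    return [[max(fmap[y + i][x + j]
                 for i in range(size) for j in range(size))
             for x in range(0, len(fmap[0]) - size + 1, size)]
            for y in range(0, len(fmap) - size + 1, size)]


def dense(values, weights, biases):
    """Fully connected layer before activation."""
    return [sum(v * w for v, w in zip(values, row)) + b
            for row, b in zip(weights, biases)]


def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(image, k1, k2, fc1, fc2, out):
    """conv -> pool -> conv -> pool -> FC -> FC -> output probabilities."""
    f = max_pool(relu_map(conv2d(image, k1)))
    f = max_pool(relu_map(conv2d(f, k2)))
    flat = [v for row in f for v in row]
    h = [max(0.0, v) for v in dense(flat, *fc1)]
    h = [max(0.0, v) for v in dense(h, *fc2)]
    return softmax(dense(h, *out))


mean3 = [[1 / 9] * 3 for _ in range(3)]
image = [[1.0] * 10 for _ in range(10)]
identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
probs = forward(image, mean3, mean3,
                fc1=([[1.0], [-1.0]], [0.0, 0.0]), fc2=identity, out=identity)
```

With a 10×10 input and 3×3 filters, the feature image shrinks to 8×8, 4×4, 2×2 and finally 1×1 before the fully connected layers.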

A computer-readable recording medium in which a program for performing a method for recognizing an approaching vessel according to any one of claims 11, 12, 14 and 15 is recorded.