KR102294687B1 - Method for alerting when surrounding situation of car is dangerous situation by driving guide, and device using the same

Info

Publication number: KR102294687B1
Application number: KR1020190178758A
Authority: KR
Inventors: 정선태; 김병희
Original assignee: 주식회사 써로마인드
Priority date: 2019-12-30
Filing date: 2019-12-30
Publication date: 2021-08-31
Also published as: WO2021137313A1; KR20210086840A

Abstract

According to the present invention, there is provided a method for determining whether the surrounding situation of a vehicle is a dangerous situation and generating a driving guide to give an alert, the method comprising: (a) when a surrounding video image of the vehicle is acquired from at least one camera mounted on the vehicle, a driving situation determination and guide device inputting the surrounding video image into an image analysis module, thereby causing the image analysis module to analyze the surrounding video image and output surrounding environment information including information on at least one object present around the vehicle; (b) the driving situation determination and guide device inputting the surrounding environment information into a dangerous situation determination module, thereby causing the dangerous situation determination module to apply a deep learning operation to the surrounding environment information to output predicted values, each being the probability that the situation falls under a respective one of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category; and (c) when the surrounding situation of the vehicle is determined to correspond to the specific dangerous situation category among the plurality of preset dangerous situation categories, the driving situation determination and guide device causing a driving guide generation module to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle can recognize the specific dangerous situation corresponding to the specific dangerous situation category.

Description

Method for judging whether the surrounding situation of a car is dangerous and generating a driving guide to give an alert, and a device using the same

The present invention relates to a method for determining whether the surrounding situation of a vehicle is a dangerous situation and generating a driving guide to give an alert, and to a device using the same. More specifically, it relates to a method, and a device using the same, that analyzes a surrounding video image acquired from at least one camera mounted on the vehicle to generate surrounding environment information about the vehicle's surroundings, applies a deep learning operation to the generated surrounding environment information to output predicted values as the probabilities of falling under each of a plurality of preset dangerous situation categories, and on that basis determines the dangerous situation that applies to the surrounding situation of the vehicle.

With recent advances in automotive technology, driver assistance technologies that help drivers drive safely have been developed and installed in vehicles. As a result, people with relatively limited perception and slower reaction speed, such as elderly drivers, have also become able to drive.

However, these driver assistance technologies only assist driving and cannot fully replace the driver, so the driver still needs to grasp the driving situation accurately. In particular, drivers with relatively limited perception and reaction speed, such as elderly drivers, are likely to fail to recognize and respond when the vehicle is in a dangerous situation, so the possibility of an accident remains.

Korean Registered Patent Publication No. 10-0997412, a prior art document, discloses an invention that, when a vehicle is judged to have been driven dangerously, analyzes images captured by a camera mounted on the vehicle to determine the distance to the vehicle ahead and the adjacent vehicles in the left and right lanes, and thereby identifies the cause of the dangerous driving. However, that document does not disclose a method for determining on its own, from surrounding video images acquired from a vehicle-mounted camera, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation, nor a method for generating and providing a predetermined driving guide to the driver based on the result.

Accordingly, there remains a need for a technology that uses a predetermined algorithm to determine on its own whether the surroundings of the vehicle constitute a dangerous situation and which dangerous situation it is, and that provides driving guide information to the driver based on the result, so that the driver can accurately recognize that the surrounding situation of the vehicle is dangerous and respond to it.

Accordingly, an object of the present invention is to solve all of the problems described above.

Another object of the present invention is to provide a method capable of analyzing a surrounding situation image acquired from a camera mounted on the vehicle and detecting at least one object included in it, in order to generate the surrounding situation information used to determine whether the surrounding situation of the vehicle is a dangerous situation.

Another object of the present invention is to provide a method capable of receiving surrounding situation information as input and determining the dangerous situation that applies to the surroundings of the vehicle.

Another object of the present invention is to provide a method capable of generating, when the surroundings of the vehicle are determined to be a dangerous situation, at least one of voice driving guide information and visual driving guide information corresponding to that situation, and of alerting the driver using it.

Another object of the present invention is to provide a convolutional neural network capable of performing real-time object detection or real-time instance segmentation on an input surrounding situation image more accurately and quickly.

The characteristic configuration of the present invention for achieving the objects described above and realizing the characteristic effects described later is as follows.

According to one aspect of the present invention, there is provided a method for determining whether the surrounding situation of a vehicle is a dangerous situation and generating a driving guide to give an alert, the method comprising: (a) when a surrounding video image of the vehicle is acquired from at least one camera mounted on the vehicle, a driving situation determination and guide device inputting the surrounding video image into an image analysis module, thereby causing the image analysis module to analyze the surrounding video image and output surrounding environment information including information on at least one object present around the vehicle; (b) the driving situation determination and guide device inputting the surrounding environment information into a dangerous situation determination module, thereby causing the dangerous situation determination module to apply a deep learning operation to the surrounding environment information to output predicted values, each being the probability that the situation falls under a respective one of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category; and (c) when the surrounding situation of the vehicle is determined to correspond to the specific dangerous situation category among the plurality of preset dangerous situation categories, the driving situation determination and guide device causing a driving guide generation module to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle can recognize the specific dangerous situation corresponding to the specific dangerous situation category.
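The decision in step (b) can be sketched as follows. The text only says that the determination is made "with reference to the predicted values"; taking the most probable category and comparing it with a threshold is an assumption made here for illustration, and the function name, the category names, and the 0.5 threshold are all hypothetical.

```python
from typing import Dict, Optional


def pick_dangerous_category(
    predictions: Dict[str, float], threshold: float = 0.5
) -> Optional[str]:
    """Return the most probable category, or None when no category is likely enough.

    `predictions` maps each preset dangerous situation category to its
    predicted probability. The argmax-plus-threshold rule is an assumption;
    the source does not specify how the predicted values are used.
    """
    if not predictions:
        return None
    best = max(predictions, key=predictions.get)
    return best if predictions[best] >= threshold else None


# Hypothetical category names, for illustration only.
example = {"pedestrian_in_path": 0.82, "sudden_cut_in": 0.11, "no_danger": 0.07}
specific_category = pick_dangerous_category(example)
```

Only when a category is returned would step (c) be triggered; a `None` result means no driving guide is generated for that frame.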

As an example, there is provided a method wherein a plurality of voice guide templates are stored in a predetermined database, classified by the plurality of preset dangerous situation categories, each voice guide template having at least one slot, which is a portion whose content can be changed and filled in according to the situation, and each slot being able to receive any one of position information, size information, movement status information, and type information corresponding to each of the at least one object included in the surrounding environment information; wherein each of the plurality of preset dangerous situation categories further includes information on at least one dangerous condition by which the respective dangerous situation can be identified; and wherein, in step (c), the driving situation determination and guide device causes the driving guide generation module to refer to (i) information on the specific dangerous situation category, (ii) information on a specific dangerous condition corresponding to the specific dangerous situation category, and (iii) information on a specific voice guide template corresponding to the specific dangerous situation category, to generate the specific voice driving guide information, completed by filling in each slot included in the specific voice guide template, and to provide the generated specific voice driving guide information directly to the driver or support its provision by transmitting it to a predetermined driver terminal.
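Slot filling of a voice guide template can be illustrated with a minimal sketch. The template wording, the category key, and the slot names (`type`, `position`) are hypothetical; the source states only that each slot receives one of the position, size, movement status, or type information of an object.

```python
from typing import Dict

# Hypothetical templates, classified by dangerous situation category.
# Each {...} placeholder plays the role of a slot.
VOICE_GUIDE_TEMPLATES: Dict[str, str] = {
    "pedestrian_in_path": "Caution: a {type} is {position}.",
}


def fill_voice_guide(category: str, object_info: Dict[str, object]) -> str:
    """Complete the template stored for `category` using one object's information."""
    template = VOICE_GUIDE_TEMPLATES[category]
    # str.format ignores keyword arguments the template does not use,
    # so unused fields such as size or movement status are simply skipped.
    return template.format(**object_info)


guide_text = fill_voice_guide(
    "pedestrian_in_path",
    {"type": "pedestrian", "position": "ahead on the right", "moving": True},
)
```

The resulting text is what would then be handed to the driver directly or sent to a driver terminal.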

As an example, there is provided a method wherein the driving guide generation module is linked with a predetermined TTS (Text-To-Speech) engine, and wherein, in step (c), the driving situation determination and guide device further performs a process of causing the driving guide generation module to apply the TTS engine to the generated specific voice driving guide information to generate specific TTS data, and provides specific voice information obtained by playing back the specific TTS data directly to the driver or supports its provision by transmitting it to a predetermined driver terminal.

As an example, there is provided a method wherein each of the plurality of preset dangerous situation categories further includes information on at least one dangerous condition by which the respective dangerous situation can be identified, and wherein, in step (b), the driving situation determination and guide device further performs a process of causing the dangerous situation determination module to refer to (i) the specific dangerous situation category information, (ii) information on a specific dangerous condition corresponding to the specific dangerous situation category, and (iii) the surrounding environment information, to identify, among the at least one object included in the surrounding video image, a dangerous object that satisfies the dangerous condition, and to generate risk factor information including information on the dangerous object.
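A minimal sketch of this selection step follows. Representing a dangerous condition as a predicate over a detected object is an assumption, as are the class name `DetectedObject`, its fields, and the example category and condition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class DetectedObject:
    """One entry of the surrounding environment information (hypothetical layout)."""
    kind: str                        # type information, e.g. "pedestrian"
    box: Tuple[int, int, int, int]   # position and size as (x, y, width, height)
    moving: bool                     # movement status information


# Hypothetical dangerous condition per category, expressed as a predicate.
DANGEROUS_CONDITIONS: Dict[str, Callable[[DetectedObject], bool]] = {
    "pedestrian_in_path": lambda obj: obj.kind == "pedestrian" and obj.moving,
}


def build_risk_factor_info(category: str, objects: List[DetectedObject]) -> dict:
    """Keep only the objects that satisfy the dangerous condition of `category`."""
    condition = DANGEROUS_CONDITIONS[category]
    dangerous = [obj for obj in objects if condition(obj)]
    return {"category": category, "dangerous_objects": dangerous}


objects = [
    DetectedObject("pedestrian", (10, 20, 5, 12), True),
    DetectedObject("car", (40, 20, 30, 20), True),
    DetectedObject("pedestrian", (70, 20, 5, 12), False),
]
risk_info = build_risk_factor_info("pedestrian_in_path", objects)
```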

As an example, there is provided a method wherein, in step (c), the driving situation determination and guide device causes the driving guide module to receive the risk factor information and to generate, as the specific visual driving guide information, an image in which a preset danger guidance signal is additionally displayed on the surrounding video image at the coordinates corresponding to each dangerous object, and to provide the generated specific visual driving guide information directly to the driver or support its provision by transmitting it to a predetermined driver terminal.
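The overlay can be sketched on a plain grayscale image stored as a list of rows. Using a box outline as the "danger guidance signal", the `(x, y, width, height)` box layout, and the marker value 255 are assumptions for illustration; a real system would likely draw on camera frames with an imaging library.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical layout


def overlay_warning_boxes(
    image: List[List[int]], boxes: Sequence[Box], marker: int = 255
) -> List[List[int]]:
    """Return a copy of `image` with the outline of each box set to `marker`."""
    out = [row[:] for row in image]          # leave the input frame untouched
    height = len(out)
    width = len(out[0]) if out else 0
    for x, y, w, h in boxes:
        # Top and bottom edges, clipped to the image bounds.
        for col in range(max(x, 0), min(x + w, width)):
            for row in (y, y + h - 1):
                if 0 <= row < height:
                    out[row][col] = marker
        # Left and right edges, clipped to the image bounds.
        for row in range(max(y, 0), min(y + h, height)):
            for col in (x, x + w - 1):
                if 0 <= col < width:
                    out[row][col] = marker
    return out


frame = [[0] * 5 for _ in range(5)]
guide_image = overlay_warning_boxes(frame, [(1, 1, 3, 3)])
```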

As an example, there is provided a method wherein, before step (a), a predetermined dangerous situation determination module learning device performs learning of the dangerous situation determination module by either (i) using, as training data, each of at least one first training surrounding video image that is acquired from the camera mounted on the vehicle or prepared separately, each first training surrounding video image including, as a first GT (Ground Truth), information on the specific correct dangerous situation category, among the preset dangerous situation categories, that applies to it, causing the image analysis module to receive and analyze each first training surrounding video image and output first training surrounding environment information as the result, and using the output first training surrounding environment information for the learning, or (ii) using, as training data, at least one piece of separately prepared second training surrounding environment information, each piece including, as a second GT (Ground Truth), information on the specific correct dangerous situation category that applies to it; and wherein the learning of the dangerous situation determination module is carried out by the dangerous situation determination module learning device either (i) causing the dangerous situation determination module to receive the first training surrounding environment information and perform the predetermined deep learning operation, comparing the resulting first training predicted values, which are the probabilities of falling under each of the plurality of preset dangerous situation categories, with the first GT, and optimizing a plurality of parameters included in the dangerous situation determination module so that the difference is minimized, or (ii) causing the dangerous situation determination module to receive the second training surrounding environment information and perform the predetermined deep learning operation, comparing the resulting second training predicted values, which are the probabilities of falling under each of the plurality of preset dangerous situation categories, with the second GT, and optimizing the plurality of parameters included in the dangerous situation determination module so that the difference is minimized.
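The idea of comparing predicted probabilities with the GT and optimizing parameters to reduce the difference can be sketched as below. The source names neither the network nor the loss; here a single linear layer with softmax stands in for the deep network, cross-entropy stands in for "the difference", and plain gradient descent stands in for the optimizer. All function names, the feature vector, and the learning rate are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn raw scores into probabilities over the dangerous situation categories."""
    peak = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(
    weights: List[List[float]], features: List[float], gt_index: int, lr: float = 0.1
) -> float:
    """Run one gradient-descent step in place and return the loss before the update.

    weights[c][f] is the weight from feature f to category c.
    """
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    probs = softmax(logits)
    loss = -math.log(probs[gt_index])       # cross-entropy against the GT category
    for c, row in enumerate(weights):
        # Gradient of cross-entropy w.r.t. logit c is (p_c - y_c).
        grad_logit = probs[c] - (1.0 if c == gt_index else 0.0)
        for f in range(len(row)):
            row[f] -= lr * grad_logit * features[f]
    return loss


weights = [[0.0, 0.0], [0.0, 0.0]]          # two categories, two input features
losses = [train_step(weights, [1.0, 2.0], gt_index=1) for _ in range(20)]
```

Repeating the step on the same sample drives the loss down, which is the "difference is minimized" behavior in miniature.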

As an example, there is provided a method wherein, in step (a), the driving situation determination and guide device causes the image analysis module to detect each of the at least one object included in the surrounding video image using a predetermined algorithm, and to output, as the surrounding environment information, information including at least some of position information, size information, movement status information, and type information of each detected object.

As an example, there is provided a method wherein, before step (a), a predetermined image analysis module learning device performs learning of the image analysis module based on the predetermined real-time object detection algorithm or the predetermined real-time instance segmentation algorithm; and wherein the learning of the image analysis module is carried out by the image analysis module learning device using, as training data, at least one second training surrounding video image that is acquired from the camera mounted on the vehicle or prepared separately, each second training surrounding video image including, as a third GT (Ground Truth), correct object information for each of the at least one training object included in it, causing the image analysis module to receive the second training surrounding video image and analyze it using the predetermined real-time object detection algorithm or the predetermined real-time instance segmentation algorithm, comparing the resulting information on each training object with the third GT, and optimizing a plurality of parameters included in the image analysis module so that the difference is minimized.

As an example, there is provided a method wherein the driving situation determination and guide device causes the image analysis module to detect each of the at least one object included in the surrounding video image, or to perform instance segmentation on the surrounding video image, using a predetermined real-time object detection algorithm or a predetermined real-time instance segmentation algorithm, and wherein step (a) comprises the following steps.

(a1) When the surrounding video image is acquired, the driving situation determination and guide device causes the image analysis module to input the surrounding video image into a backbone block of ResNET, so that the backbone block sequentially applies convolution operations to the surrounding video image and outputs a first down-sampling feature map to an m-th down-sampling feature map, where m is an integer of 2 or more.

(a2) The driving situation determination and guide device causes the image analysis module to apply a first process and a second process to each of the m-th down-sampling feature map to the (m-j)-th down-sampling feature map, where j is an integer of 1 or more and less than m, thereby generating an m-th transformed down-sampling feature map to an (m-j)-th transformed down-sampling feature map. In the first process, a specific down-sampling feature map is input into a first (1*1) convolution layer, which applies a (1*1) convolution operation to it to generate a first feature map with an adjusted number of channels; the first feature map is then input into each of a first (k*k) convolution layer, which includes a (k*k) kernel with a dilation rate (expansion ratio) of (1*r), to an n-th (k*k) convolution layer, which includes a (k*k) kernel with a dilation rate of (n*r), where r is an integer of 1 or more, k is an integer of 2 or more, and n is an integer of 2 or more; each of the first to n-th (k*k) convolution layers divides the channels of the first feature map into at least two groups and applies the (k*k) convolution operation with its dilation rate, from (1*r) to (n*r), to the first feature maps corresponding to the groups, generating a 2_1-th feature map to a 2_n-th feature map. In the second process, the 2_1-th to 2_n-th feature maps are input into a 2_1-th (1*1) convolution layer to a 2_n-th (1*1) convolution layer, respectively, each of which applies a (1*1) convolution operation to generate a 3_1-th feature map to a 3_n-th feature map with an adjusted number of channels; the 3_1-th to 3_n-th feature maps are concatenated and input into a third (1*1) convolution layer, which applies a (1*1) convolution operation to the concatenated feature maps to generate a fourth feature map with an adjusted number of channels; and the specific down-sampling feature map and the fourth feature map are concatenated to generate a transformed down-sampling feature map.

(a3) The driving situation determination and guide device causes the image analysis module to input the m-th transformed down-sampling feature map to the (m-j)-th transformed down-sampling feature map into a feature pyramid network, so that the feature pyramid network generates an m-th up-sampling feature map to an (m-j)-th up-sampling feature map through deconvolution operations that refer to the m-th to (m-j)-th transformed down-sampling feature maps, and to input the (m-j)-th up-sampling feature map into an object detection network or an instance segmentation network, so that the object detection network detects objects in the surrounding video image or the instance segmentation network performs instance segmentation on the surrounding video image.
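The effect of the per-branch dilation rates in step (a2) can be checked with a little arithmetic. The sketch below lists the dilation rates (1*r) to (n*r) of the n parallel (k*k) layers and the spatial extent each dilated kernel covers, using the standard relation that a k-tap kernel with dilation d spans d*(k-1)+1 positions. The function names and the example values k=3, n=3, r=2 are illustrative assumptions, not values from the source.

```python
from typing import List


def branch_dilations(n: int, r: int) -> List[int]:
    """Dilation rates of the first to n-th (k*k) layers: 1*r, 2*r, ..., n*r."""
    return [i * r for i in range(1, n + 1)]


def effective_kernel_size(k: int, dilation: int) -> int:
    """Spatial extent covered by a k-tap kernel whose taps are `dilation` apart."""
    return dilation * (k - 1) + 1


def same_padding(k: int, dilation: int) -> int:
    """Padding that keeps the feature map size at stride 1 (exact for odd k)."""
    return dilation * (k - 1) // 2


# Example: three parallel 3*3 branches with r = 2.
rates = branch_dilations(n=3, r=2)
extents = [effective_kernel_size(3, d) for d in rates]
paddings = [same_padding(3, d) for d in rates]
```

With these example values the branches see 5*5, 9*9, and 13*13 neighborhoods while each still uses only nine weights per channel, which is why parallel dilated branches are a common way to gather multi-scale context cheaply.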

As an example, there is provided a method wherein, in step (a3), the image analysis module causes the feature pyramid network to down-sample the m-th up-sampling feature map to generate an (m+j)-th up-sampling feature map, and additionally inputs the (m-j)-th up-sampling feature map to the (m+j)-th up-sampling feature map into the instance segmentation network, so that the instance segmentation network performs instance segmentation on the surrounding video image with further reference to the (m-j)-th to (m+j)-th up-sampling feature maps.

As an example, there is provided a method wherein, in step (a3), the image analysis module causes the feature pyramid network to up-sample an i-th up-sampling feature map, where i is an integer of (m-(j+1)) or more and m or less, to generate a specific up-sampling feature map, and to concatenate the specific up-sampling feature map with the corresponding (i-1)-th down-sampling feature map to generate an (i-1)-th up-sampling feature map.
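The shape bookkeeping of this up-sample-and-concatenate step can be sketched without any tensor library. Feature maps are represented only by their `(channels, height, width)` shapes; the 2x up-sampling factor and the example channel counts are assumptions, since the source does not state them.

```python
from typing import Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def upsample_shape(shape: Shape, factor: int = 2) -> Shape:
    """Shape after spatial up-sampling; the channel count is unchanged."""
    channels, height, width = shape
    return (channels, height * factor, width * factor)


def concat_shape(a: Shape, b: Shape) -> Shape:
    """Shape after channel-wise concatenation of two maps of equal spatial size."""
    if a[1:] != b[1:]:
        raise ValueError(f"spatial sizes differ: {a[1:]} vs {b[1:]}")
    return (a[0] + b[0], a[1], a[2])


# i-th up-sampling feature map and the (i-1)-th down-sampling feature map
# (example shapes only).
upsampling_i = (256, 8, 8)
downsampling_i_minus_1 = (128, 16, 16)

specific_upsampled = upsample_shape(upsampling_i)
upsampling_i_minus_1 = concat_shape(specific_upsampled, downsampling_i_minus_1)
```

Because the maps are concatenated rather than added, the channel count grows at each level, so in practice a channel-adjusting convolution would usually follow; that step is not part of the sketch.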

In addition, according to another aspect of the present invention, there is provided a device for determining whether the surrounding situation of a vehicle is a dangerous situation and generating a driving guide to give an alert, the device comprising: at least one memory that stores instructions; and at least one processor configured to execute the instructions, wherein the processor performs: (I) a process of, when a surrounding video image of the vehicle is acquired from at least one camera mounted on the vehicle, inputting the surrounding video image into an image analysis module, thereby causing the image analysis module to analyze the surrounding video image and output surrounding environment information including information on at least one object present around the vehicle; (II) a process of inputting the surrounding environment information into a dangerous situation determination module, thereby causing the dangerous situation determination module to apply a deep learning operation to the surrounding environment information to output predicted values, each being the probability that the situation falls under a respective one of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category; and (III) a process of, when the surrounding situation of the vehicle is determined to correspond to the specific dangerous situation category among the plurality of preset dangerous situation categories, causing a driving guide generation module to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle can recognize the specific dangerous situation corresponding to the specific dangerous situation category.

As an example, there is provided an apparatus characterized in that a plurality of voice guide templates are stored in a predetermined database, classified by the plurality of preset dangerous situation categories, wherein each of the voice guide templates has at least one slot, which is a part whose content can be changed and filled in according to the situation, and each slot can receive any one of location information, size information, movement status information, and type information corresponding to each of the at least one object included in the surrounding environment information; each of the plurality of preset dangerous situation categories further includes information on at least one dangerous condition by which the corresponding dangerous situation can be identified; and, in the process (III), the processor causes the driving guide generation module to refer to (i) information on the specific dangerous situation category, (ii) information on a specific dangerous condition corresponding to the specific dangerous situation category, and (iii) information on a specific voice guide template corresponding to the specific dangerous situation category, to generate the specific voice driving guide information completed by filling in each of the slots included in the specific voice guide template, and to provide the generated specific voice driving guide information directly to the driver or to support its provision by transmitting it to a predetermined driver terminal.
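The slot-filling step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `DetectedObject` fields, the `VOICE_TEMPLATES` dictionary, the category name `stopped_bus`, and the template wording are all hypothetical, and the real database layout is not specified in the source.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    kind: str       # type information, e.g. "bus" (hypothetical field name)
    location: str   # location information, e.g. "ahead on the right"
    moving: bool    # movement status information


# Hypothetical template store: one voice guide template per dangerous
# situation category. Each {name} is a slot to be filled at run time.
VOICE_TEMPLATES = {
    "stopped_bus": "Caution: a {kind} is {motion} {location}.",
}


def fill_voice_template(category: str, obj: DetectedObject) -> str:
    """Fill every slot of the category's template from one object's attributes."""
    template = VOICE_TEMPLATES[category]
    return template.format(
        kind=obj.kind,
        location=obj.location,
        motion="moving" if obj.moving else "stopped",
    )


guide_text = fill_voice_template(
    "stopped_bus",
    DetectedObject(kind="bus", location="ahead on the right", moving=False),
)
print(guide_text)
```

Keeping the templates keyed by category mirrors the text's statement that templates are classified and stored per dangerous situation category; the completed string is what would then be handed to the driver or to a speech engine.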

As an example, there is provided an apparatus characterized in that the driving guide generation module is linked with a predetermined TTS (Text-To-Speech) engine, and, in the process (III), the processor further performs a process of causing the driving guide generation module to apply the TTS engine to the generated specific voice driving guide information to generate specific TTS data, so that specific voice information obtained by playing back the specific TTS data is provided directly to the driver or is transmitted to a predetermined driver terminal to be provided.

As an example, there is provided an apparatus characterized in that each of the plurality of preset dangerous situation categories further includes information on at least one dangerous condition by which the corresponding dangerous situation can be identified, and, in the process (II), the processor further performs a process of causing the dangerous situation determination module to refer to (i) the specific dangerous situation category information, (ii) information on a specific dangerous condition corresponding to the specific dangerous situation category, and (iii) the surrounding environment information, to identify, among the at least one object included in the surrounding video image, a danger object that satisfies the dangerous condition, and to generate risk factor information including information on the danger object.
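One way to picture the danger-object step is as a filter over the detected objects. The sketch below assumes, purely for illustration, that each dangerous condition can be expressed as a predicate on a single object; the category names, the dictionary keys (`kind`, `moving`, `box`), and the shape of the returned risk factor information are hypothetical.

```python
from typing import Callable

# Hypothetical object record: a dict with "kind", "moving" and "box" (x, y, w, h).
Obj = dict

# Hypothetical dangerous conditions, one predicate per dangerous situation category.
DANGER_CONDITIONS: dict[str, Callable[[Obj], bool]] = {
    "stopped_bus_ahead": lambda o: o["kind"] == "bus" and not o["moving"],
    "two_wheeler_roadside": lambda o: o["kind"] in {"bicycle", "motorcycle"},
}


def build_risk_factor_info(category: str, objects: list[Obj]) -> dict:
    """Keep only the objects that satisfy the category's dangerous condition."""
    condition = DANGER_CONDITIONS[category]
    danger_objects = [o for o in objects if condition(o)]
    return {"category": category, "danger_objects": danger_objects}


objects = [
    {"kind": "bus", "moving": False, "box": (820, 310, 260, 200)},
    {"kind": "car", "moving": True, "box": (400, 330, 120, 90)},
]
risk = build_risk_factor_info("stopped_bus_ahead", objects)
print(risk["danger_objects"])
```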

As an example, there is provided an apparatus characterized in that, in the process (III), the processor causes the driving guide generation module to receive the risk factor information and to generate, as the specific visual driving guide information, an image in which a preset danger notification signal is additionally displayed on the surrounding video image at the coordinates corresponding to each danger object, and to provide the generated specific visual driving guide information directly to the driver or to support its provision by transmitting it to a predetermined driver terminal.
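The overlay step can be illustrated with a toy image. The source does not say what the preset danger notification signal looks like, so the sketch below assumes a rectangular outline drawn around each danger object; the image is a plain 2D list of integers, and the function name and box format `(x, y, w, h)` are hypothetical. Boxes are assumed to start inside the image.

```python
def draw_danger_boxes(image, boxes, marker=1):
    """Return a copy of a 2D grid image with each box outline set to `marker`."""
    out = [row[:] for row in image]          # leave the input frame untouched
    height, width = len(out), len(out[0])
    for x, y, w, h in boxes:
        # Clamp the far corner so a box that runs off the frame is still drawn.
        x1 = min(x + w - 1, width - 1)
        y1 = min(y + h - 1, height - 1)
        for cx in range(x, x1 + 1):          # top and bottom edges
            out[y][cx] = marker
            out[y1][cx] = marker
        for cy in range(y, y1 + 1):          # left and right edges
            out[cy][x] = marker
            out[cy][x1] = marker
    return out


frame = [[0] * 6 for _ in range(4)]
guide_image = draw_danger_boxes(frame, [(1, 1, 3, 2)])
for row in guide_image:
    print(row)
```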

As an example, there is provided an apparatus characterized in that, before the process (I), a predetermined dangerous situation determination module learning device performs learning of the dangerous situation determination module by either (i) using, as learning data, at least one first learning surrounding video image that is acquired from the camera mounted on the vehicle or prepared separately, wherein each first learning surrounding video image includes, as a first GT (Ground Truth), information on the specific correct dangerous situation category, among the preset dangerous situation categories, to which that image corresponds, causing the image analysis module to receive and analyze each first learning surrounding video image and output first learning surrounding environment information as the result, and using the output first learning surrounding environment information; or (ii) using, as learning data, at least one piece of separately prepared second learning surrounding environment information, wherein each piece of second learning surrounding environment information includes, as a second GT (Ground Truth), information on the specific correct dangerous situation category to which it corresponds; and the learning of the dangerous situation determination module is achieved by the dangerous situation determination module learning device performing either (i) a process of causing the dangerous situation determination module to receive the first learning surrounding environment information and perform the predetermined deep learning operation, comparing the resulting first learning predicted values, which are the probabilities of falling under each of the plurality of preset dangerous situation categories, with the first GT, and optimizing a plurality of parameters included in the dangerous situation determination module so that the difference is minimized, or (ii) a process of causing the dangerous situation determination module to receive the second learning surrounding environment information and perform the predetermined deep learning operation, comparing the resulting second learning predicted values, which are the probabilities of falling under each of the plurality of preset dangerous situation categories, with the second GT, and optimizing the plurality of parameters included in the dangerous situation determination module so that the difference is minimized.
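The learning procedure above (predict per-category probabilities, compare with the GT, adjust parameters to reduce the difference) can be sketched with a deliberately small stand-in model. The source does not specify the network or the loss, so this example swaps in a single-layer softmax classifier trained with cross-entropy and plain gradient descent; the feature encoding, learning rate, and epoch count are hypothetical choices.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, features):
    """Per-category probabilities for one piece of surrounding environment information."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(logits)


def train(samples, num_categories, num_features, lr=0.5, epochs=200):
    """Minimize cross-entropy between predictions and GT category indices."""
    weights = [[0.0] * num_features for _ in range(num_categories)]
    for _ in range(epochs):
        for features, gt in samples:
            probs = predict(weights, features)
            for c in range(num_categories):
                # Gradient of cross-entropy w.r.t. logit c is (p_c - y_c).
                error = probs[c] - (1.0 if c == gt else 0.0)
                for f in range(num_features):
                    weights[c][f] -= lr * error * features[f]
    return weights


# Toy features: [bias, bus_stopped_ahead, truck_in_right_lane]; GT is a category index.
samples = [([1.0, 1.0, 0.0], 0), ([1.0, 0.0, 1.0], 1), ([1.0, 0.0, 0.0], 2)]
weights = train(samples, num_categories=3, num_features=3)
probs = predict(weights, [1.0, 1.0, 0.0])
print([round(p, 3) for p in probs])
```

Both training variants in the text fit this loop: in variant (i) the feature vectors would come from running the image analysis module on labeled images, and in variant (ii) they would be supplied directly.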

As an example, there is provided an apparatus characterized in that, in the process (I), the processor causes the image analysis module to detect each of the at least one object included in the surrounding video image using a predetermined algorithm, and to output, as the surrounding environment information, information including at least some of location information, size information, movement status information, and type information of each detected object.

As an example, there is provided an apparatus characterized in that, before the process (I), a predetermined image analysis module learning device performs learning of the image analysis module based on the predetermined real-time object detection algorithm or the predetermined real-time instance segmentation algorithm, wherein the learning of the image analysis module is achieved by the image analysis module learning device using, as learning data, at least one second learning surrounding video image that is acquired from the camera mounted on the vehicle or prepared separately, each second learning surrounding video image including, as a third GT (Ground Truth), correct object information for each of at least one learning object included in it, causing the image analysis module to receive the second learning surrounding video image and analyze it using the predetermined real-time object detection algorithm or the predetermined real-time instance segmentation algorithm, comparing the resulting information on each learning object with the third GT, and optimizing a plurality of parameters included in the image analysis module so that the difference is minimized.

As an example, there is provided an apparatus characterized in that the processor causes the image analysis module to detect each of the at least one object included in the surrounding video image, or to instance-segment the surrounding video image, using a predetermined real-time object detection algorithm or a predetermined real-time instance segmentation algorithm, wherein the process (I) comprises: (I-1) a sub-process in which, when the surrounding video image is acquired, the processor causes the image analysis module to input the surrounding video image to a ResNet backbone block, so that the backbone block sequentially applies convolution operations to the surrounding video image and outputs a first down-sampling feature map to an m-th down-sampling feature map, where m is an integer of 2 or more; (I-2) a sub-process in which the processor causes the image analysis module to generate an m-th transformed down-sampling feature map to an (m-j)-th transformed down-sampling feature map, where j is an integer of 1 or more and less than m, by applying a first process and a second process to each of the m-th down-sampling feature map to the (m-j)-th down-sampling feature map, the first process being one of inputting a specific down-sampling feature map to a first (1*1) convolution layer so that the first (1*1) convolution layer applies a (1*1) convolution operation to the specific down-sampling feature map to generate a first feature map with an adjusted number of channels, and inputting the first feature map to each of a first (k*k) convolution layer, which includes a (k*k) kernel with an expansion ratio of (1*r), to an n-th (k*k) convolution layer, which includes a (k*k) kernel with an expansion ratio of (n*r), where r is an integer of 1 or more, k is an integer of 2 or more, and n is an integer of 2 or more, so that the first to n-th (k*k) convolution layers each divide the channels of the first feature map into at least two groups and apply, to the first feature maps corresponding to the at least two groups, (k*k) convolution operations with expansion ratios of (1*r) to (n*r), respectively, to generate a 2_1-th feature map to a 2_n-th feature map, and the second process being one of inputting the 2_1-th to 2_n-th feature maps to a 2_1-th (1*1) convolution layer to a 2_n-th (1*1) convolution layer, respectively, so that each of these layers applies a (1*1) convolution operation to the corresponding one of the 2_1-th to 2_n-th feature maps to generate a 3_1-th feature map to a 3_n-th feature map with adjusted numbers of channels, concatenating the 3_1-th to 3_n-th feature maps and inputting the result to a third (1*1) convolution layer so that the third (1*1) convolution layer applies a (1*1) convolution operation to the concatenated 3_1-th to 3_n-th feature maps to generate a fourth feature map with an adjusted number of channels, and concatenating the specific down-sampling feature map with the fourth feature map to generate a transformed down-sampling feature map; and (I-3) a sub-process in which the processor causes the image analysis module to input the m-th to (m-j)-th transformed down-sampling feature maps to a feature pyramid network, so that the feature pyramid network generates an m-th up-sampling feature map to an (m-j)-th up-sampling feature map through deconvolution operations that refer to the m-th to (m-j)-th transformed down-sampling feature maps, and to input the (m-j)-th up-sampling feature map to an object detection network or an instance segmentation network, so that the object detection network detects objects in the surrounding video image or the instance segmentation network instance-segments the surrounding video image.
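To see why parallel (k*k) convolutions with expansion ratios (1*r) through (n*r) are useful, it helps to compute how much of the input each branch covers. The sketch below assumes that "expansion ratio" corresponds to what is commonly called the dilation rate of a convolution; under that assumption, a k-by-k kernel with dilation d spans k + (k - 1)(d - 1) input positions per side, and for odd k a padding of d(k - 1)/2 keeps the spatial size unchanged at stride 1. The function names are hypothetical.

```python
def effective_kernel_size(k: int, dilation: int) -> int:
    """Side length of the input window covered by a k-by-k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def same_padding(k: int, dilation: int) -> int:
    """Padding that preserves spatial size at stride 1 (valid for odd k)."""
    return dilation * (k - 1) // 2


def branch_window_sizes(k: int, r: int, n: int) -> list[int]:
    """Window sizes for the n branches, whose dilations are 1*r, 2*r, ..., n*r."""
    return [effective_kernel_size(k, i * r) for i in range(1, n + 1)]


sizes = branch_window_sizes(k=3, r=1, n=3)
paddings = [same_padding(3, i) for i in range(1, 4)]
print(sizes, paddings)
```

With k = 3, r = 1 and n = 3 the three branches cover 3, 5 and 7 input positions per side while each still uses only nine weights per channel, which is consistent with the stated aim of gathering multi-scale context at low cost.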

As an example, there is provided an apparatus characterized in that, in the process (I-3), the image analysis module causes the feature pyramid network to down-sample the m-th up-sampling feature map to generate an (m+j)-th up-sampling feature map, and additionally inputs the (m-j)-th up-sampling feature map to the (m+j)-th up-sampling feature map to the instance segmentation network, so that the instance segmentation network further refers to the (m-j)-th to (m+j)-th up-sampling feature maps when instance-segmenting the surrounding video image.

As an example, there is provided an apparatus characterized in that, in the process (I-3), the image analysis module causes the feature pyramid network to up-sample an i-th up-sampling feature map, where i is an integer of (m-(j+1)) or more and m or less, to generate a specific up-sampling feature map, and to concatenate the specific up-sampling feature map with the corresponding (i-1)-th down-sampling feature map to generate an (i-1)-th up-sampling feature map.
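The top-down step just described can be written compactly. The notation is introduced here only for illustration and does not appear in the source: D_i denotes the i-th down-sampling feature map, U_i the i-th up-sampling feature map, Up(.) the up-sampling operation, and [. ; .] concatenation (assumed here to be along the channel axis, which the source does not state).

```latex
U_{i-1} = \left[\, \mathrm{Up}(U_i) \;;\; D_{i-1} \,\right],
\qquad m-(j+1) \le i \le m
```

Read this way, each coarser map is enlarged to the resolution of the next finer level and joined with that level's features, so the finest output carries information from every level above it.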

In addition, a computer-readable recording medium on which a computer program for executing the method of the present invention is recorded is further provided.

The present invention has the following effects.

The present invention has the effect of providing a method of analyzing a surrounding situation image acquired from a camera mounted on a vehicle and detecting at least one object included in it, in order to generate the surrounding situation information used to determine whether the surrounding situation of the vehicle is a dangerous situation.

In addition, the present invention has the effect of providing a method of receiving the surrounding situation information and determining the dangerous situation that applies to the surrounding environment of the vehicle.

In addition, the present invention has the effect of providing a method of, when the surrounding environment of the vehicle is determined to be a dangerous situation, generating at least one of the corresponding voice driving guide information and visual driving guide information and using it to alert the driver.

In addition, the present invention has the effect of providing a convolutional neural network that can perform real-time object detection or real-time instance segmentation on an input surrounding situation image more accurately and quickly.

FIG. 1 is a diagram schematically illustrating a driving situation determination and guide device for determining whether the surrounding situation of a vehicle is a dangerous situation and for generating a driving guide and issuing an alert, according to an embodiment of the present invention.
FIG. 2 is a diagram schematically illustrating the overall system for determining whether the surrounding situation of a vehicle is a dangerous situation and for generating a driving guide and issuing an alert, according to an embodiment of the present invention.
FIG. 3 is a diagram schematically illustrating a method of performing real-time object detection or real-time instance segmentation on a surrounding video image acquired from a camera mounted on a vehicle, according to an embodiment of the present invention.
FIG. 4 is a diagram schematically illustrating a method of generating feature maps of a convolutional neural network for performing real-time object detection or real-time instance segmentation on a surrounding video image acquired from a camera mounted on a vehicle, according to an embodiment of the present invention.
FIG. 5 is a diagram schematically illustrating a state in which convolution operations are performed with different expansion ratios in the method of generating feature maps of a convolutional neural network for performing real-time object detection or real-time instance segmentation on a surrounding video image acquired from a camera mounted on a vehicle, according to an embodiment of the present invention.
FIG. 6A is a flowchart schematically illustrating a process in which learning of the dangerous situation determination module is performed using learning surrounding video images as learning data, according to an embodiment of the present invention.
FIG. 6B is a flowchart schematically illustrating a process in which learning of the dangerous situation determination module is performed using separately prepared learning surrounding environment information as learning data, according to an embodiment of the present invention.
FIG. 7 is a flowchart schematically illustrating a process in which learning of the image analysis module is performed using learning surrounding video images as learning data, according to an embodiment of the present invention.
FIG. 8A is a diagram illustrating an example of a surrounding video image for a situation in which a bus stop and a stopped bus are present ahead on the right of the driving lane, according to an embodiment of the present invention.
FIG. 8B is a diagram illustrating an example of a surrounding video image for a situation in which a large truck is traveling in the lane to the right of the driving lane, according to an embodiment of the present invention.
FIG. 8C is a diagram illustrating an example of a surrounding video image for a situation in which a taxi is traveling ahead in the driving lane, according to an embodiment of the present invention.
FIG. 8D is a diagram illustrating an example of a surrounding video image for a situation in which a bicycle and a motorcycle are present at the roadside to the right of the driving lane, according to an embodiment of the present invention.
FIG. 8E is a diagram illustrating an example of a surrounding video image for a situation in which a plurality of cars are parked at the roadside to the right of the driving lane, according to an embodiment of the present invention.
FIG. 8F is a diagram illustrating an example of a surrounding video image for a situation in which other vehicles are present ahead in the driving lane and in the right lane, so that a lane change is not possible, while many vehicles are traveling in the left lane, according to an embodiment of the present invention.
FIG. 8G is a diagram illustrating an example of a surrounding video image for a situation in which a four-way intersection and a crosswalk are present ahead, according to an embodiment of the present invention.

The following detailed description of the present invention refers to the accompanying drawings, which show, by way of illustration, specific embodiments in which the present invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the present invention. It should be understood that the various embodiments of the present invention, although different from one another, need not be mutually exclusive. For example, specific shapes, structures, and characteristics described herein in connection with one embodiment may be implemented in other embodiments without departing from the spirit and scope of the present invention.

In addition, it should be understood that the location or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the present invention. Accordingly, the following detailed description is not to be taken in a limiting sense, and the scope of the present invention, if properly described, is limited only by the appended claims, along with the full range of equivalents to which those claims are entitled. In the drawings, like reference numerals refer to the same or similar functions throughout the several aspects.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can easily practice the present invention.

FIG. 1 is a diagram schematically illustrating a driving situation determination and guide device for determining whether the surrounding situation of a vehicle is a dangerous situation and for generating a driving guide and issuing an alert, according to an embodiment of the present invention.

Referring to FIG. 1, the driving situation determination and guide device (10), which determines whether the surrounding situation of a vehicle is a dangerous situation and generates a driving guide and issues an alert, may include a memory (11) and a processor (12). Here, the memory (11) may store instructions for the processor (12). Specifically, the instructions are code generated to cause the driving situation determination and guide device (10) to function in a specific manner, and may be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment. The instructions may carry out processes for performing the functions described in the specification of the present invention.

The processor (12) may include hardware components such as an MPU (Micro Processing Unit) or CPU (Central Processing Unit), a cache memory, and a data bus. It may also include software components such as an operating system and applications that perform specific purposes.

Next, the driving situation determination and guide device (10) may be linked with a database (not shown) containing information used to determine whether the surrounding situation of the vehicle is a dangerous situation and to generate a driving guide and issue an alert. The database may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), magnetic memory, a magnetic disk, and an optical disk, but is not limited to these and may include any medium capable of storing data. In addition, the database may be installed inside the driving situation determination and guide device (10) to transmit data or record received data, and this may vary depending on the conditions under which the invention is implemented.

An overall system, given as one example, that uses such a driving situation determination and guide device (10) to determine whether the surrounding situation of the vehicle (1) is a dangerous situation and to generate a driving guide and issue an alert is described below with reference to a separate drawing (FIG. 2).

FIG. 2 is a diagram schematically illustrating an entire system for determining whether the surrounding situation of a vehicle is a dangerous situation, generating a driving guide, and issuing an alert, according to an embodiment of the present invention.

Referring to FIG. 2, the process by which the driving situation determination and guide device 10 determines whether the surrounding situation of the vehicle 1 is a dangerous situation, generates a driving guide, and issues an alert begins when a surrounding video image of the vehicle 1 is acquired from at least one camera 20 mounted on the vehicle. The driving situation determination and guide device 10 may include an image analysis module 30, a dangerous situation determination module 40, and a driving guide generation module 50, and may cause each module to receive the acquired surrounding video image, or data output from another module, and perform a predetermined process. Although FIG. 2 shows the driving situation determination and guide device 10 as containing each module, each module may instead be a separate device linked with the driving situation determination and guide device 10, and this may be modified according to the implementation conditions of the invention. In addition, when a voice driving guide that may be generated in practicing the present invention is converted to speech by a text-to-speech (TTS) method and provided to the driver, a TTS engine 51 for this purpose may additionally be included and linked with the driving guide generation module 50.
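The data flow among the modules of FIG. 2 can be sketched as a simple chain of callables. This is only an illustration under stated assumptions: the class name `GuideDevice`, the callable signatures, and the choice to return `None` when no category is found are all hypothetical and do not come from the patent.

```python
from typing import Callable, Optional, Union


class GuideDevice:
    """Chains the three modules, plus an optional TTS engine, in order.

    Every name and signature here is a hypothetical placeholder.
    """

    def __init__(
        self,
        analyze: Callable[[object], list],             # stands in for image analysis module 30
        judge: Callable[[list], Optional[str]],        # stands in for determination module 40
        make_guide: Callable[[str], str],              # stands in for guide generation module 50
        tts: Optional[Callable[[str], bytes]] = None,  # stands in for optional TTS engine 51
    ) -> None:
        self.analyze = analyze
        self.judge = judge
        self.make_guide = make_guide
        self.tts = tts

    def process(self, image: object) -> Union[str, bytes, None]:
        # Step 1: surrounding video image -> surrounding environment information.
        env_info = self.analyze(image)
        # Step 2: environment information -> dangerous situation category (or None).
        category = self.judge(env_info)
        if category is None:
            return None  # assumption: no guide is produced when nothing is flagged
        # Step 3: category -> guide text, optionally converted to speech.
        guide_text = self.make_guide(category)
        return self.tts(guide_text) if self.tts else guide_text
```

Because each module is passed in as a callable, the same sketch covers both arrangements the text mentions: modules built into the device or separate devices linked to it.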

Next, the driving situation determination and guide device 10 may input the acquired surrounding video image to the image analysis module 30 and cause the image analysis module 30 to analyze it and output surrounding environment information, which includes information on at least one object present around the vehicle 1. The objects may include other vehicles and moving bodies that are driving, stopped, or parked around the vehicle, pedestrians, and obstacles that may affect the vehicle. They may also include traffic lights and crosswalks identified in the vehicle's direction of travel, lane markings on the roadway, and curbs that form the boundary between the roadway and the sidewalk. Depending on the implementation conditions of the invention, various other entities around the vehicle may be targets as well. The surrounding environment information may be obtained by having the image analysis module 30 detect, using a predetermined algorithm, each of the at least one object included in the surrounding video image and output information that includes at least some of the position information, size information, movement status information, and type information of each detected object.
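One possible in-memory shape for the surrounding environment information is sketched below. The field names, the use of `None` for fields that are not available (the text only requires "at least some" of the four items), and the bounding-box helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DetectedObject:
    """One detected object; every field name here is hypothetical."""

    kind: str                                       # type information, e.g. "bus"
    position: Optional[Tuple[float, float]] = None  # position information (image coordinates)
    size: Optional[Tuple[float, float]] = None      # size information (width, height)
    moving: Optional[bool] = None                   # movement status information


def from_bounding_box(kind: str, x_min: float, y_min: float,
                      x_max: float, y_max: float,
                      moving: Optional[bool] = None) -> DetectedObject:
    """Derive position (box center) and size from a bounding box.

    Using the box center as the position is an assumption of this sketch.
    """
    center = ((x_min + x_max) / 2, (y_min + y_max) / 2)
    extent = (x_max - x_min, y_max - y_min)
    return DetectedObject(kind=kind, position=center, size=extent, moving=moving)


# The surrounding environment information is then simply a list of objects.
environment: List[DetectedObject] = [
    from_bounding_box("bus", 10, 20, 50, 60, moving=False),
    DetectedObject(kind="curb"),  # only the type is known for this object
]
```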

To analyze the surrounding video image acquired from the camera 20, the image analysis module 30 may use a real-time object detection algorithm or a real-time instance segmentation algorithm based on a convolutional neural network (CNN), as described in more detail below.

Referring to FIG. 3, when the surrounding video image is acquired, the image analysis module 30 inputs it to a ResNet backbone block 150 and causes the backbone block 150 to apply convolution operations to the image sequentially, outputting a first downsampled feature map through an m-th downsampled feature map. Here, m may be an integer of 2 or more. For convenience of explanation, FIG. 2 shows only up to the fifth downsampled feature map.

Next, the image analysis module 30 inputs each of the m-th through (m-j)-th downsampled feature maps to a respective bottleneck block 200 and causes each bottleneck block 200 to generate the m-th through (m-j)-th transformed downsampled feature maps.

The operation of the bottleneck block 200 is described below with reference to FIG. 4.

The image analysis module 30 inputs a specific downsampled feature map, corresponding to each of the m-th through (m-j)-th downsampled feature maps, to a first (1*1) convolution layer 210.

The first (1*1) convolution layer 210 may then apply a (1*1) convolution operation to the specific downsampled feature map to generate a first feature map with an adjusted number of channels.

To reduce the amount of computation in subsequent processes, the number of channels of the first feature map may be made smaller than that of the specific downsampled feature map. For example, if the specific downsampled feature map has C channels, the first feature map may have C/S channels. Here, C may be a constant of 1 or more, S may be an integer of 2 or more, and C may be a multiple of S.

Next, the image analysis module 30 inputs the first feature map to each of a first (k*k) convolution layer 220-1, which includes a (k*k) kernel with a dilation rate (expansion ratio) of (1*r), through an n-th (k*k) convolution layer 220-n, which includes a (k*k) kernel with a dilation rate of (n*r). Here, r may be an integer of 1 or more, k may be an integer of 1 or more, and n may be an integer of 2 or more. For reference, FIG. 2 shows the case where r is 1 and k is 3.

Each of the first (k*k) convolution layer 220-1 through the n-th (k*k) convolution layer 220-n may then divide the channels of the first feature map into at least two groups and apply, to the portion of the first feature map corresponding to each group, a (k*k) convolution operation with a dilation rate of (1*r) through (n*r), respectively, to generate a 2_1-th feature map through a 2_n-th feature map. The first (k*k) convolution layer 220-1 through the n-th (k*k) convolution layer 220-n may share convolution parameters.

In other words, rather than convolving the whole first feature map at once, each of the first (k*k) convolution layer 220-1 through the n-th (k*k) convolution layer 220-n may, to minimize the amount of computation, divide the channels of the first feature map into at least two groups and perform the convolution operation group by group.
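The saving from group-wise convolution can be checked with simple arithmetic. The sketch below counts the kernel weights of a (k*k) convolution with and without channel groups, using the standard grouped-convolution formula. The function name and the example sizes (64 channels, 2 groups) are illustrative assumptions; bias terms are ignored.

```python
def conv_weight_count(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Number of kernel weights in a k*k convolution split into `groups` groups.

    Each group maps c_in/groups input channels to c_out/groups output channels,
    so the total is groups * (c_in/groups) * (c_out/groups) * k * k.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by the number of groups")
    return groups * (c_in // groups) * (c_out // groups) * k * k


# Hypothetical sizes: a first feature map with 64 channels and a 3*3 kernel.
full = conv_weight_count(64, 64, 3)               # one convolution over all channels
grouped = conv_weight_count(64, 64, 3, groups=2)  # two groups, as in "at least two groups"
```

With g groups the weight count, and with it the multiply-accumulate count per output position, drops by a factor of g, while the number of output channels stays the same.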

The number of channels of each of the 2_1-th through 2_n-th feature maps may be kept equal to the number of channels of the first feature map. This prevents the loss of feature information that could otherwise result from splitting the channels for parallel processing and convolving only a small number of channels at a time. For example, if the specific downsampled feature map has C channels and the first feature map has C/S channels, each of the 2_1-th through 2_n-th feature maps may have C/S channels.

Meanwhile, referring to FIG. 5, each of the (k*k) convolution operations with dilation rates (expansion ratios) of (1*r) through (n*r) may select, according to its dilation rate, the pixels on the feature map to which a kernel of size (k*k) is applied, and then perform the convolution on those pixels. For reference, FIG. 5 schematically shows a convolution operation being performed with k equal to 3 and r equal to 1.

As FIG. 5 shows, the dilation rate may be the distance between neighboring pixels of the feature map that are sampled for the convolution operation. When the dilation rate is 1, as in (a), the distance is 1; that is, mutually adjacent pixels are selected to fit the kernel size and then convolved. When the dilation rate is 2, as in (b), pixels at a distance of 2 from one another are selected to fit the kernel size and then convolved.
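The pixel selection of FIG. 5 can be illustrated with a few lines of plain Python. The sketch lists which feature-map positions a (k*k) kernel touches for a given expansion ratio (commonly called the dilation rate) and evaluates the convolution at a single position. The function names, the restriction to odd k, and the treatment of out-of-range positions as zero are assumptions of this example.

```python
from typing import List, Tuple


def dilated_taps(row: int, col: int, k: int = 3, dilation: int = 1) -> List[Tuple[int, int]]:
    """Positions sampled by a k*k kernel centered at (row, col); k is assumed odd."""
    half = k // 2
    return [(row + i * dilation, col + j * dilation)
            for i in range(-half, half + 1)
            for j in range(-half, half + 1)]


def dilated_conv_at(fmap: List[List[float]], kernel: List[List[float]],
                    row: int, col: int, dilation: int = 1) -> float:
    """Dilated convolution output at one position (zero outside the map)."""
    k = len(kernel)
    half = k // 2
    total = 0.0
    for i in range(k):
        for j in range(k):
            r = row + (i - half) * dilation
            c = col + (j - half) * dilation
            if 0 <= r < len(fmap) and 0 <= c < len(fmap[0]):
                total += kernel[i][j] * fmap[r][c]
    return total


# A 5*5 map whose only non-zero value sits in the top-left corner.
fmap = [[0.0] * 5 for _ in range(5)]
fmap[0][0] = 1.0
ones = [[1.0] * 3 for _ in range(3)]

near = dilated_conv_at(fmap, ones, 2, 2, dilation=1)  # corner is out of reach
far = dilated_conv_at(fmap, ones, 2, 2, dilation=2)   # corner is one of the taps
```

The same 3*3 kernel covers a 3*3 window at dilation 1 and a 5*5 window at dilation 2 without any extra weights, which is why layers with different dilation rates can see objects at different scales.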

Referring again to FIG. 4, the image analysis module 30 inputs the 2_1-th through 2_n-th feature maps to a 2_1-th (1*1) convolution layer 230-1 through a 2_n-th (1*1) convolution layer 230-n, respectively.

Each of the 2_1-th (1*1) convolution layer 230-1 through the 2_n-th (1*1) convolution layer 230-n may then apply a (1*1) convolution operation to its respective 2_1-th through 2_n-th feature map to generate a 3_1-th feature map through a 3_n-th feature map with an adjusted number of channels. The 2_1-th (1*1) convolution layer 230-1 through the 2_n-th (1*1) convolution layer 230-n may share convolution parameters.

The number of channels of each of the 3_1-th through 3_n-th feature maps may be adjusted to equal the number of channels of the specific downsampled feature map.

Next, the image analysis module 30 concatenates the 3_1-th through 3_n-th feature maps and inputs the result to a third (1*1) convolution layer 240.

In doing so, the image analysis module 30 may concatenate the 3_1-th through 3_n-th feature maps, rearrange the concatenated feature maps, and input the rearranged feature maps to the third (1*1) convolution layer 240.

For example, if each of the 3_1-th through 3_n-th feature maps is a tensor of volume CHW with C channels, height H, and width W, the image analysis module 30 concatenates them into a tensor of volume 3CHW (this example corresponds to n = 3). It may then rearrange the 3CHW tensor into C3HW order and input the resulting tensor of volume (3C)HW to the third (1*1) convolution layer 240.
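The concatenate-then-rearrange step amounts to interleaving the channels of the branches. In the sketch below each channel is stood in for by a label (in practice it would be an H*W plane), so the reordering from 3CHW to C3HW to (3C)HW is visible directly. The function name and the label scheme are illustrative assumptions.

```python
from typing import List, Sequence, TypeVar

Plane = TypeVar("Plane")  # stands in for one H*W channel plane


def concat_and_interleave(branches: Sequence[Sequence[Plane]]) -> List[Plane]:
    """Stack n branches of C channels (n, C), swap to (C, n), flatten to n*C channels."""
    channels = len(branches[0])
    if any(len(branch) != channels for branch in branches):
        raise ValueError("all branches must have the same number of channels")
    # Channel-major order: channel 0 of every branch, then channel 1 of every branch, ...
    return [branch[ch] for ch in range(channels) for branch in branches]


# Three branches (n = 3, as in the 3CHW example) with C = 2 channels each.
branches = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]
mixed = concat_and_interleave(branches)
# mixed == ["a0", "b0", "c0", "a1", "b1", "c1"]
```

After the rearrangement, channels that came from the same position in different branches sit next to each other in the input of the following (1*1) convolution.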

The third (1*1) convolution layer 240 may then apply a (1*1) convolution operation to the concatenated 3_1-th through 3_n-th feature maps to generate a fourth feature map with an adjusted number of channels.

The number of channels of the fourth feature map may be adjusted to equal the number of channels of the specific downsampled feature map. That is, concatenating the 3_1-th through 3_n-th feature maps, each of which has the same number of channels as the specific downsampled feature map, may yield 3C channels, and the (1*1) convolution operation in the third (1*1) convolution layer 240 may bring the number of channels back to that of the specific downsampled feature map.
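The channel counts stated for each stage of the bottleneck block can be traced with a short helper. The function and key names are hypothetical, and the example values (C = 256, S = 4, n = 3) are chosen only for illustration.

```python
from typing import Dict


def bottleneck_channel_trace(c: int, s: int, n: int) -> Dict[str, int]:
    """Channel count after each stage, following the counts given in the text."""
    if s < 2 or n < 2 or c % s:
        raise ValueError("expected S >= 2, n >= 2, and C a multiple of S")
    reduced = c // s
    return {
        "input (specific downsampled feature map)": c,
        "first 1*1 conv (first feature map)": reduced,
        "each dilated k*k conv (2_1 .. 2_n)": reduced,
        "each second 1*1 conv (3_1 .. 3_n)": c,
        "concatenation of the n branches": n * c,
        "third 1*1 conv (fourth feature map)": c,
    }


trace = bottleneck_channel_trace(c=256, s=4, n=3)
for stage, channels in trace.items():
    print(f"{stage}: {channels}")
```

The expensive (k*k) convolutions run on C/S channels only, and the (1*1) convolutions on either side restore the original width.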

Next, the image analysis module 30 may concatenate the specific downsampled feature map and the fourth feature map to generate a transformed downsampled feature map.

In this way, the bottleneck blocks 200 can generate the m-th through (m-j)-th transformed downsampled feature maps, which correspond to the m-th through (m-j)-th downsampled feature maps, respectively, and are robust to scale variation.

Referring again to FIG. 3, the image analysis module 30 inputs the m-th through (m-j)-th transformed downsampled feature maps to a feature pyramid network 300 and causes the feature pyramid network 300 to generate an m-th through an (m-j)-th upsampled feature map by deconvolution operations that refer to the m-th through (m-j)-th transformed downsampled feature maps. It then inputs the (m-j)-th upsampled feature map to an object detection network or an instance segmentation network 400, causing the object detection network to detect objects in the image or the instance segmentation network 400 to perform instance segmentation on the image.

Here, the image analysis module 30 may cause the feature pyramid network 300 to downsample the m-th upsampled feature map to generate an (m+j)-th upsampled feature map, and may additionally input the (m-j)-th through (m+j)-th upsampled feature maps to the instance segmentation network 400 so that the instance segmentation network 400 further refers to them when performing instance segmentation on the image.

In addition, the image analysis module 30 may cause the feature pyramid network 300 to upsample an i-th upsampled feature map to generate a specific upsampled feature map, and to concatenate the specific upsampled feature map with the corresponding (i-1)-th downsampled feature map to generate an (i-1)-th upsampled feature map. Here, i may be an integer that is (m-(j+1)) or more and m or less.
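One top-down step of this kind (upsample the coarser map, then concatenate it with the finer map along the channel axis) can be sketched with nested lists. The text does not fix the upsampling method or factor here, so nearest-neighbor upsampling by a factor of 2 is an assumption of this example, as are the function names.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # layout: [channel][row][column]


def upsample_nearest_2x(fmap: FeatureMap) -> FeatureMap:
    """Double height and width by repeating each value (an assumed upsampling method)."""
    result = []
    for plane in fmap:
        new_plane = []
        for row in plane:
            wide = [value for value in row for _ in range(2)]
            new_plane.append(wide)
            new_plane.append(list(wide))
        result.append(new_plane)
    return result


def concat_channels(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    """Concatenate two maps along the channel axis; spatial sizes must match."""
    if len(a[0]) != len(b[0]) or len(a[0][0]) != len(b[0][0]):
        raise ValueError("spatial sizes differ")
    return a + b


coarse = [[[1.0]]]                    # i-th upsampled map: 1 channel, 1*1
finer = [[[5.0, 6.0], [7.0, 8.0]]]    # corresponding (i-1)-th downsampled map: 1 channel, 2*2
merged = concat_channels(upsample_nearest_2x(coarse), finer)
```

The merged map has the spatial size of the finer level and carries both the coarse, semantically stronger channel and the finer, spatially more precise channel.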

The instance segmentation network 400 generates prototype masks with reference to the (m-j)-th upsampled feature map and, at the same time, predicts for each region of interest the coefficients that weight each prototype, with reference to the (m-j)-th through (m+j)-th upsampled feature maps. The instance segmentation network 400 may then linearly combine the prototype masks using the predicted coefficients to generate a final mask, and may perform instance segmentation of the objects in the surrounding video image with reference to the final mask.
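The linear combination of prototype masks can be written out directly. The sketch below combines P prototype masks with P coefficients for one region of interest. The final step that turns the combined values into a binary mask (keeping values above zero) is an assumption added for illustration; the text only specifies the linear combination.

```python
from typing import List, Sequence

Mask = List[List[float]]  # layout: [row][column]


def combine_prototypes(prototypes: Sequence[Mask], coefficients: Sequence[float]) -> Mask:
    """Weighted sum of the prototype masks, pixel by pixel."""
    if len(prototypes) != len(coefficients):
        raise ValueError("one coefficient is needed per prototype")
    height, width = len(prototypes[0]), len(prototypes[0][0])
    return [[sum(c * p[r][col] for c, p in zip(coefficients, prototypes))
             for col in range(width)]
            for r in range(height)]


def binarize(mask: Mask, threshold: float = 0.0) -> List[List[int]]:
    """Assumed post-processing: keep pixels whose combined value exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in mask]


# Two hypothetical 2*2 prototypes and the coefficients predicted for one region of interest.
prototypes = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
]
coefficients = [2.0, -1.0]

combined = combine_prototypes(prototypes, coefficients)
final_mask = binarize(combined)
```

A negative coefficient lets one prototype subtract from the instance's mask, so a small shared set of prototypes can describe many different instances.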

As a result, the bottleneck blocks generate feature maps that are robust to scale variation while keeping processing time to a minimum, and these feature maps are input to the instance segmentation network 400. The instance segmentation network 400 can therefore recognize objects accurately, which improves performance and makes real-time instance segmentation possible.

Referring again to FIG. 2, the driving situation determination and guide device 10 may input the surrounding environment information output from the image analysis module 30 to the dangerous situation determination module 40. It may cause the dangerous situation determination module 40 to apply a deep learning operation to that information and output predicted values, each being the probability that the situation falls under one of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle 1 is a dangerous situation. Specifically, the driving situation determination and guide device 10 may cause the dangerous situation determination module 40 to perform a predetermined deep learning operation on the input surrounding environment information. The operation depends on the deep learning algorithm used; for example, when an algorithm based on a convolutional neural network (CNN) is used, the deep learning operation may be a convolution operation. As a result, the dangerous situation determination module 40 may output a predicted value for each of the preset dangerous situation categories, a plurality of which may be defined and set in advance. For example, suppose the preset categories are 'a bus/taxi stop is present (C1)', 'a large vehicle is present ahead in the driving lane or in the right/left lane (C2)', and 'a taxi is present ahead in the driving lane (C3)'. If the dangerous situation determination module 40 outputs predicted values of 0.9 for C1, 0.2 for C2, and 0.1 for C3, then C1, which has the highest predicted value, may be determined to be the dangerous situation category corresponding to the surrounding environment information. If the predicted values for all of the preset dangerous situation categories are at or below (or strictly below) a predetermined reference value, the surrounding situation of the vehicle 1 may be determined not to be a dangerous situation, or to be a new situation that is not among the preset dangerous situations. The predicted values that are output, and the specific method of determining a specific dangerous situation category from them, may differ according to the implementation conditions of the invention, for example by using a separate formula.
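The decision rule in this example (take the category with the highest predicted value, unless every value is at or below a reference value) can be sketched as follows. The function name and the reference value of 0.5 are assumptions; the text leaves the reference value and the exact rule open.

```python
from typing import Dict, Optional


def decide_category(predictions: Dict[str, float], reference: float = 0.5) -> Optional[str]:
    """Return the highest-scoring category, or None if no value exceeds the reference.

    The default reference value of 0.5 is an assumed placeholder.
    """
    if not predictions:
        return None
    best = max(predictions, key=predictions.get)
    if predictions[best] <= reference:
        return None  # treated as "not a dangerous situation" (or an unknown situation)
    return best


# The values from the example in the text.
print(decide_category({"C1": 0.9, "C2": 0.2, "C3": 0.1}))  # C1
print(decide_category({"C1": 0.3, "C2": 0.2, "C3": 0.1}))  # None
```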

[Table 1] below gives examples of how the plurality of preset dangerous situation categories may be defined.

[Table 1] Examples of preset dangerous situation categories
'a bus/taxi stop is present'
'a large vehicle is present ahead in the driving lane or in the right/left lane'
'a taxi is present ahead in the driving lane'
'a bicycle/motorcycle is present to the front/rear on the right/left side of the driving lane'
'parked cars are present along the right/left roadside of the driving lane'
'congested cars are present in the lane to the right/left of the driving lane'
'a four-way intersection is present ahead in the driving lane'
...

As in [Table 1] above, the dangerous situations that a vehicle may encounter while driving are defined in advance, and the specific dangerous situation category corresponding to the surrounding situation image acquired from the camera 20 mounted on the vehicle 1 is determined. This makes it possible to determine whether the surrounding situation of the vehicle is a dangerous situation and, if so, which one. An example of how a specific dangerous situation category is determined from a surrounding situation image for each situation in [Table 1] is described below with reference to separate drawings (FIGS. 8A to 8G).

FIGS. 8A to 8G are diagrams illustrating examples in which a surrounding situation image acquired from the camera 20 mounted on the vehicle 1 corresponds to a preset specific dangerous situation category, according to an embodiment of the present invention.

FIG. 8A shows a surrounding video image for a situation in which a bus stop 812 and a stopped bus 811 are present ahead on the right of the lane in which the vehicle 1 is driving. When such an image is input, the image analysis module 30 may detect the objects corresponding to the lane marking 814 on the left and the curb 813 on the right to determine the lane being driven, and may detect the objects corresponding to the bus stop 812 and the bus 811. As shown in FIG. 8A, depending on the algorithm used to detect the objects, a bounding box may be generated for each object, yielding information such as the coordinates and size of each object in the surrounding video image. The type (class) of each object may be analyzed as well, so that the position information, size information, and type information of each object can be obtained. Furthermore, each object may be tracked by analyzing the driving state information of the vehicle 1 (speed, acceleration, and so on) together with the information on that object in each of the corresponding time-sequenced surrounding video images, so that movement status information for the object can also be obtained. When the plurality of objects included in the surrounding video image have been detected in this way and the image analysis module 30 outputs surrounding environment information containing this information, the driving situation determination and guide device 10 may input it to the dangerous situation determination module 40 and cause that module to determine the corresponding specific dangerous situation category. The surrounding video image shown in FIG. 8A would ultimately be determined to fall under the category 'a bus/taxi stop is present'.

The method of performing object detection on each of the surrounding video images shown in FIGS. 8B to 8G is similar to that for FIG. 8A, so a detailed description is omitted below.

FIG. 8B shows a surrounding situation image for a situation in which a large vehicle 821 loaded with cargo is driving ahead in the lane to the right of the lane in which the vehicle 1 is driving. Here, the image analysis module 30 may detect the left lane marking 822 and right lane marking 823 objects and obtain information on the driving lane from them, and may detect the object corresponding to the large vehicle 821. The surrounding situation of the vehicle 1 may then be determined to fall under the dangerous situation category 'a large vehicle is present ahead in the driving lane or in the right/left lane' or, more specifically, 'a large vehicle is present ahead in the lane to the right of the driving lane'.

FIG. 8C shows a surrounding situation image for a situation in which a taxi 831 is present ahead in the lane in which the vehicle 1 is driving. Here, the image analysis module 30 detects the objects corresponding to the lane markings 832 and 833 and the taxi 831 in the surrounding situation image and generates and outputs surrounding situation information based on them, so that the dangerous situation determination module 40 can ultimately determine that the surrounding situation of the vehicle 1 falls under the dangerous situation category 'a taxi is present ahead in the driving lane'.

FIG. 8D shows a surrounding situation image for a situation in which a bicycle and a motorcycle 841 and 842 are present along the right side of the lane in which the vehicle 1 is driving. Here, the image analysis module 30 detects the objects corresponding to the lane marking 844, the curb 843, and the bicycle and motorcycle 841 and 842 in the surrounding situation image and generates and outputs surrounding situation information based on them. The dangerous situation determination module 40 can then ultimately determine that the surrounding situation of the vehicle 1 falls under the dangerous situation category 'a bicycle/motorcycle is present to the front/rear on the right/left side of the driving lane' or, more specifically, 'a bicycle and a motorcycle are present ahead along the right roadside of the driving lane'.

FIG. 8E shows a surrounding situation image for a situation in which a plurality of parked cars 851 and 852 are present along the right side of the lane in which the vehicle 1 is driving. Here, the image analysis module 30 detects the objects corresponding to the lane marking 855, the curb 854, and the parked cars 851, 852, and 853 in the surrounding situation image and generates and outputs surrounding situation information based on them. The dangerous situation determination module 40 can then ultimately determine that the surrounding situation of the vehicle 1 falls under the dangerous situation category 'parked cars are present along the right/left roadside of the driving lane' or, more specifically, 'parked cars are present along the right roadside of the driving lane'.

FIG. 8F shows a surrounding situation image corresponding to a situation in which a plurality of cars are congested in the left lane while vehicles in the right lane and ahead in the driving lane of the vehicle 1 make acceleration or overtaking on the right impossible. In this case, the image analysis module 30 detects, from the surrounding situation image, each of the objects corresponding to the lane lines 866 and 867, the car ahead 864, the car in the right lane 865, and the congested cars in the left lane 861, 862, and 863, and generates and outputs surrounding situation information based on them. The dangerous situation determination module 40 can then finally determine that the surrounding situation of the vehicle 1 corresponds to the dangerous situation category 'congested cars are present in the lane to the right/left of the driving lane', and can further make more specific determinations such as 'congested cars are present in the lane to the left of the driving lane' and 'a car is present ahead of the driving lane/in the right lane'.

FIG. 8G shows a surrounding situation image corresponding to a situation in which a stop line 872 and a traffic light 871 corresponding to the driving lane are present ahead of the driving lane in which the vehicle 1 is traveling, together with crosswalks 873, 874, 875, and 876 that cross a four-way intersection. In this case, the image analysis module 30 detects each of the objects corresponding to the stop line 872, the traffic light 871, and the crosswalks 873, 874, 875, and 876, and generates and outputs surrounding situation information based on them. The dangerous situation determination module 40 can then finally determine that the surrounding situation of the vehicle 1 corresponds to the dangerous situation category 'a four-way intersection is present ahead of the driving lane'.
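The determinations in the FIG. 8 examples rest on predicted values that express, for each preset dangerous situation category, the probability that the surrounding situation belongs to it. The following is a minimal sketch of how a category might be selected from such values. The category labels, the function name `pick_category`, and the 0.5 threshold are illustrative assumptions and are not specified in the description.

```python
from typing import Optional, Sequence

# Hypothetical category labels, paraphrased from the FIG. 8 examples.
CATEGORIES = [
    "bus/taxi stop present",
    "bicycle/motorcycle beside the driving lane",
    "parked cars along the driving lane",
    "congested cars in an adjacent lane",
    "four-way intersection ahead",
]


def pick_category(predictions: Sequence[float], threshold: float = 0.5) -> Optional[str]:
    """Return the most probable category, or None if no value reaches the threshold.

    The threshold is an assumption; the description only says the decision
    is made "with reference to" the predicted values.
    """
    if len(predictions) != len(CATEGORIES):
        raise ValueError("one predicted value is expected per category")
    best = max(range(len(predictions)), key=lambda i: predictions[i])
    return CATEGORIES[best] if predictions[best] >= threshold else None
```

Returning `None` when no value reaches the threshold models the case in which the surrounding situation is not judged to be any of the preset dangerous situations.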

Next, when it is determined that the surrounding situation of the vehicle 1 corresponds to a specific dangerous situation category among the plurality of preset dangerous situation categories, the driving situation determination and guide device 10 may cause the driving guide generation module 50 to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle 1 can recognize the specific dangerous situation corresponding to that category. In this case, the driving situation determination and guide device 10 may cause the driving guide generation module 50 to transmit at least one of the generated specific visual driving guide information and specific voice driving guide information to a predetermined driver terminal 60 so that it is provided to the driver. The driver terminal 60 may be a device that is installed in the vehicle and can play video and audio, such as a navigation unit or a dashboard camera (black box). A device that is not installed in the vehicle but can exchange data with the driving situation determination and guide device 10 over a wired or wireless connection, such as a smartphone that can obtain at least one of the generated specific visual driving guide information and specific voice driving guide information and provide it to the driver, may also be adopted and used as the driver terminal 60.

The voice driving guide information that the driving guide generation module 50 can generate through the process described above may be generated using voice guide templates that are stored in a predetermined database, classified by each of the plurality of preset dangerous situation categories. Each voice guide template has at least one slot, which is a part whose content can be changed and filled in according to the situation, and each slot may receive any one of location information, size information, movement status information, and type information corresponding to each of the at least one object included in the surrounding environment information. In addition, each of the plurality of preset dangerous situation categories may further include information on at least one dangerous condition under which it can be determined to be a dangerous situation. For example, when the dangerous situation category is 'a bus/taxi stop is present', the corresponding dangerous conditions may be 'a bus stop is present at the right front of the driving lane' and 'a stopped bus is present at the right front of the driving lane'. In this case, the driving situation determination and guide device 10 may cause the driving guide generation module 50 to refer to (i) information on the specific dangerous situation category, (ii) information on the specific dangerous condition corresponding to the specific dangerous situation category, and (iii) information on the specific voice guide template corresponding to the specific dangerous situation category, so as to generate specific voice driving guide information completed by filling in each slot of that specific voice guide template, and to provide the generated specific voice driving guide information directly to the driver or transmit it to a predetermined driver terminal so that it is provided. For example, when the dangerous situation category is 'a bus/taxi stop is present', the corresponding specific voice guide template may be stored as "[slot-2] may suddenly come out from [slot-1]. Slow down and watch [slot-2] carefully". The driving situation determination and guide device 10 may then cause the driving guide generation module 50 to fill the [slot-1] slot with 'right front' and the [slot-2] slot with 'bus', generating completed specific voice driving guide information such as "A bus may suddenly come out from the right front. Slow down and watch the bus carefully".
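The slot-filling step described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary `VOICE_GUIDE_TEMPLATES` stands in for the predetermined database, the function name `fill_template` is hypothetical, and the English template wording is an adaptation of the bus-stop example with articles added so that the filled sentence reads naturally.

```python
import re

# Hypothetical stand-in for the template database, keyed by category.
# The wording is an English adaptation of the bus-stop example.
VOICE_GUIDE_TEMPLATES = {
    "bus/taxi stop present": (
        "A [slot-2] may suddenly pull out from [slot-1]. "
        "Slow down and watch the [slot-2] carefully."
    ),
}

# Matches slot markers such as [slot-1] and captures the slot name.
SLOT_PATTERN = re.compile(r"\[(slot-\d+)\]")


def fill_template(category: str, slot_values: dict) -> str:
    """Fill every slot of the category's template with the supplied value."""
    template = VOICE_GUIDE_TEMPLATES[category]

    def substitute(match):
        name = match.group(1)
        if name not in slot_values:
            raise KeyError(f"no value supplied for {name}")
        return slot_values[name]

    return SLOT_PATTERN.sub(substitute, template)


message = fill_template(
    "bus/taxi stop present",
    {"slot-1": "the right front", "slot-2": "bus"},
)
```

Here `slot-1` carries location information and `slot-2` carries type information taken from the surrounding environment information. Raising an error for a missing slot value keeps a half-filled sentence from reaching the driver.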

[Table 2] below shows examples of voice driving guide information that can be generated for the plurality of preset dangerous situation categories.

[Table 2]
Example of preset dangerous situation category | Example of voice driving guide information
'A bus/taxi stop is present' | "[A bus] may suddenly come out from [the right front]. Slow down and watch [the bus] carefully", "[A person] may pass in front of [the bus]. Slow down and keep watching [the right front]"
'A large vehicle is present ahead of the driving lane or in the right/left lane' | "Driving alongside [a large vehicle] is not advisable. Slow down and move to [the left lane]"
'A taxi is present ahead in the driving lane' | "[The taxi] [ahead] may stop suddenly, so please keep a safe distance"
'A bicycle/motorcycle is present ahead of or behind the vehicle on the right/left side of the driving lane' | "There is [a bicycle/motorcycle] at [the right front], so please be careful."
'Parked cars are present at the right/left roadside of the driving lane' | "A person may suddenly come out from between [the parked cars] at [the right front], so please drive slowly."
'Congested cars are present in the lane to the right/left of the driving lane' | "[A moving car] may cut in from [the left lane], so drive carefully."
'A four-way intersection is present ahead of the driving lane' | "There is [a four-way intersection] [ahead]. Please be careful."
... | ...

In each example of voice driving guide information in [Table 2], the parts marked with square brackets ([]) correspond to slots in the voice guide template that have been filled in. The voice guide templates containing such slots and the voice driving guide information are not limited to those shown in [Table 2] and may be defined differently depending on the implementation conditions of the invention.

In addition, as one example of the invention, the driving guide generation module 50 may be linked with a predetermined TTS (Text-To-Speech) engine 51. The driving situation determination and guide device 10 may additionally perform a process of causing the driving guide generation module 50 to apply the TTS engine 51 to the generated specific voice driving guide information to generate specific TTS data, so that specific voice information obtained by playing the specific TTS data is provided directly to the driver or transmitted to a predetermined driver terminal 60 so that it is provided.

Next, the driving situation determination and guide device 10 may also cause the driving guide generation module 50 to generate visual driving guide information in which a predetermined danger guide signal is additionally displayed on the surrounding video image. Each of the plurality of preset dangerous situation categories further includes information on at least one dangerous condition under which it can be determined to be the corresponding dangerous situation. The driving situation determination and guide device 10 may additionally perform a process of causing the dangerous situation determination module to refer to (i) the specific dangerous situation category information, (ii) information on the specific dangerous condition corresponding to the specific dangerous situation category, and (iii) the surrounding environment information, so as to specify, among the at least one object included in the surrounding video image, a risk object that meets the specific dangerous condition, and to generate risk factor information including information on that risk object. The driving situation determination and guide device 10 may then cause the driving guide generation module 50 to receive the generated risk factor information and to generate, as the specific visual driving guide information, an image in which a preset danger guide signal is additionally displayed on the surrounding video image at the coordinates corresponding to each risk object, and may cause the generated specific visual driving guide information to be provided directly to the driver or transmitted to a predetermined driver terminal so that it is provided. For example, when the specific dangerous situation category is 'a bus/taxi stop is present', the surrounding video image may include a bus and a bus stop as objects at the right front of the driving lane. An image in which a danger guide signal, such as a predetermined geometric shape, is displayed on the surrounding video image at the coordinates of the bus and the bus stop is generated as the specific visual driving guide information, and is provided directly to the driver or transmitted to a predetermined driver terminal and displayed there.
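The two steps in this passage, selecting the risk objects that meet the dangerous condition and marking them on the image, can be sketched as follows. Everything here is an assumption made for illustration: the image is a plain 2D grid of pixel values, the danger guide signal is a rectangular outline, the dangerous condition is reduced to a set of object types, and the names `DetectedObject`, `select_risk_objects`, `draw_marker`, and `make_visual_guide` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class DetectedObject:
    kind: str                        # object type, e.g. "bus"
    box: Tuple[int, int, int, int]   # (left, top, right, bottom), inclusive pixels


def select_risk_objects(objects: List[DetectedObject], risk_kinds: Set[str]) -> List[DetectedObject]:
    """Keep only the objects whose type matches the dangerous condition."""
    return [obj for obj in objects if obj.kind in risk_kinds]


def draw_marker(image: List[List[int]], box: Tuple[int, int, int, int], value: int = 1) -> None:
    """Draw a rectangular outline in place on a 2D grid of pixel values."""
    left, top, right, bottom = box
    for x in range(left, right + 1):
        image[top][x] = value
        image[bottom][x] = value
    for y in range(top, bottom + 1):
        image[y][left] = value
        image[y][right] = value


def make_visual_guide(image, objects, risk_kinds):
    """Return a copy of the image with a marker around every risk object."""
    guide = [row[:] for row in image]   # leave the source frame untouched
    for obj in select_risk_objects(objects, risk_kinds):
        draw_marker(guide, obj.box)
    return guide


frame = [[0] * 8 for _ in range(6)]
objects = [DetectedObject("bus", (1, 1, 4, 3)), DetectedObject("lane", (0, 5, 7, 5))]
guide = make_visual_guide(frame, objects, {"bus", "bus stop"})
```

In this toy frame only the bus is outlined; the lane line is detected but does not meet the dangerous condition, so it is left unmarked.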

Next, the dangerous situation determination module may use a predetermined deep learning algorithm to compute its predicted values from the surrounding environment information, and to this end the dangerous situation determination module may be trained in advance on the basis of that deep learning algorithm. This is described in detail below with reference to separate drawings (FIGS. 6A and 6B).

FIG. 6A is a flowchart schematically showing a process in which the dangerous situation determination module is trained using surrounding video images for learning as learning data, according to an embodiment of the present invention, and FIG. 6B is a flowchart schematically showing a process in which the dangerous situation determination module is trained using separately prepared surrounding environment information for learning as learning data, according to an embodiment of the present invention.

Referring to FIG. 6A, as one embodiment of the invention, a predetermined dangerous situation determination module learning device may use, as learning data, each of at least one first surrounding video image for learning that is obtained from the camera mounted on the vehicle or prepared separately. The learning device causes the image analysis module to receive each first surrounding video image for learning (S601) and to output first surrounding environment information for learning as the analysis result (S602), and then trains the dangerous situation determination module using the output first surrounding environment information for learning. Here, each first surrounding video image for learning may include, as a first GT (Ground Truth), information on the specific correct dangerous situation category, among the preset dangerous situation categories, that corresponds to that image. In this state, the dangerous situation determination module is trained as follows: the learning device causes the dangerous situation determination module to receive the first surrounding environment information for learning (S603) and perform a predetermined deep learning operation (S604), and compares the resulting information on first predicted values for learning (S605), which are the probabilities of corresponding to each of the plurality of preset dangerous situation categories, with the first GT, optimizing the plurality of parameters included in the dangerous situation determination module so that the difference is minimized. The difference between the first predicted values for learning and the first GT may be calculated as a loss value using a predetermined formula (S606), and backpropagation may be performed repeatedly to minimize that loss value, thereby optimizing the plurality of parameters included in the dangerous situation determination module 40 (S607).
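The loop of steps S603 to S607 (predict, compute a loss against the GT, update the parameters) can be sketched in a few lines. This is a toy under loudly stated assumptions: a single linear layer with softmax stands in for the deep learning operation, cross-entropy stands in for the unspecified "predetermined formula", the parameter update is plain gradient descent (the one-layer case of backpropagation), and the three-element feature vectors are made-up encodings of surrounding environment information. The names `softmax`, `predict`, and `train` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, features):
    """Predicted values: one probability per dangerous situation category."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)


def train(samples, num_categories, num_features, epochs=200, lr=0.5, seed=0):
    """Minimize cross-entropy between predicted values and the GT category."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(num_features)]
               for _ in range(num_categories)]
    losses = []
    for _ in range(epochs):
        total_loss = 0.0
        for features, gt in samples:
            probs = predict(weights, features)
            total_loss += -math.log(probs[gt])          # loss value for this sample
            for c in range(num_categories):
                # Gradient of cross-entropy w.r.t. the score of category c.
                grad = probs[c] - (1.0 if c == gt else 0.0)
                for j in range(num_features):
                    weights[c][j] -= lr * grad * features[j]
        losses.append(total_loss / len(samples))
    return weights, losses


# Toy features: [bus stop seen, bicycle seen, parked cars seen]; GT is a category index.
samples = [([1.0, 0.0, 0.0], 0), ([0.0, 1.0, 0.0], 1), ([0.0, 0.0, 1.0], 2)]
weights, losses = train(samples, num_categories=3, num_features=3)
```

A real dangerous situation determination module would have many layers and far richer inputs, but the structure of the loop is the same: the loss shrinks as the parameters are adjusted toward the GT.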

As another embodiment of the invention, referring to FIG. 6B, a predetermined dangerous situation determination module learning device may train the dangerous situation determination module using at least one separately prepared piece of second surrounding environment information for learning as learning data. Here, each piece of second surrounding environment information for learning may include, as a second GT (Ground Truth), information on the specific correct dangerous situation category that corresponds to it. In this state, the dangerous situation determination module may be trained as follows: the learning device causes the dangerous situation determination module to receive the second surrounding environment information for learning (S611) and perform a predetermined deep learning operation (S612), and compares the resulting information on second predicted values for learning (S613), which are the probabilities of corresponding to each of the plurality of preset dangerous situation categories, with the second GT, optimizing the plurality of parameters included in the dangerous situation determination module so that the difference is minimized. The difference between the second predicted values for learning and the second GT may be calculated as a loss value using a predetermined formula (S614), and backpropagation may be performed repeatedly to minimize that loss value, thereby optimizing the plurality of parameters included in the dangerous situation determination module 40 (S615).
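The description leaves the "predetermined formula" for the loss value open. One common choice for comparing per-category probabilities with a single correct category is the cross-entropy loss, shown here purely as an assumed example. The symbols are introduced for illustration only: $K$ is the number of preset dangerous situation categories, $p_k$ is the predicted value for category $k$, $y_k$ is 1 for the GT category and 0 otherwise, $\theta$ denotes the parameters of the dangerous situation determination module, and $\eta$ is a learning rate.

```latex
% Assumed loss: cross-entropy between predicted values p_k and the one-hot GT y_k
L(\theta) = -\sum_{k=1}^{K} y_k \log p_k(\theta)

% Assumed parameter update, repeated until the loss stops decreasing;
% the gradient is obtained by backpropagation
\theta \leftarrow \theta - \eta \, \nabla_{\theta} L(\theta)
```

With a one-hot GT, the sum reduces to $-\log p_{k^*}$ for the correct category $k^*$, so the loss is zero only when the module assigns that category a probability of 1.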

The dangerous situation determination module 40 may be trained by the dangerous situation determination module learning device as described above. Alternatively, the learning device may train a separate dangerous situation determination module for learning and apply the learning results, such as the optimized parameters, to the dangerous situation determination module 40, or the driving situation determination and guide device 10 may itself function as the learning device and train the dangerous situation determination module 40. Which approach is used may vary depending on the implementation conditions of the invention.

Next, as yet another embodiment of the invention, when the predetermined algorithm that the image analysis module 30 uses to analyze the surrounding situation image is a deep learning algorithm, namely the predetermined real-time object detection algorithm or the predetermined real-time instance segmentation algorithm, the image analysis module 30 may be trained on the basis of that algorithm. This is described in detail below with reference to a separate drawing (FIG. 7).

FIG. 7 is a flowchart schematically showing a process in which the image analysis module is trained using surrounding video images for learning as learning data, according to an embodiment of the present invention.

Referring to FIG. 7, a predetermined image analysis module learning device may train the image analysis module 30 using, as learning data, at least one second surrounding video image for learning that is obtained from the camera 20 mounted on the vehicle 1 or prepared separately. Here, each second surrounding video image for learning may include, as a third GT (Ground Truth), correct object information for each of the at least one learning object contained in it. In this state, the image analysis module 30 may be trained as follows: the learning device causes the image analysis module 30 to receive the second surrounding video image for learning (S701) and analyze it using the predetermined real-time object detection algorithm or the predetermined real-time instance segmentation algorithm (S702), and compares the resulting information on each learning object (S703) with the third GT, optimizing the plurality of parameters included in the image analysis module so that the difference is minimized. The difference between the information on each learning object and the third GT may be calculated as a loss value using a predetermined formula (S704), and backpropagation may be performed repeatedly to minimize that loss value, thereby optimizing the plurality of parameters included in the image analysis module 30 (S705).
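The description does not say how the detected object information is compared with the third GT. As one illustration, the localization part of such a comparison is often based on intersection over union (IoU) between a predicted box and its ground-truth box. The sketch below assumes axis-aligned boxes given as (left, top, right, bottom), assumes that predictions have already been paired with their ground-truth boxes, and ignores the object-type part of the comparison; the names `iou` and `box_loss` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, right - left) * max(0.0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def box_loss(predicted, ground_truth):
    """Mean (1 - IoU) over already-paired predicted and ground-truth boxes."""
    if len(predicted) != len(ground_truth):
        raise ValueError("boxes must be paired one-to-one")
    if not predicted:
        return 0.0
    return sum(1.0 - iou(p, g) for p, g in zip(predicted, ground_truth)) / len(predicted)
```

A loss of 0 means every predicted box coincides with its ground-truth box, and a loss of 1 means none of them overlap. Practical detectors usually combine a localization term like this with a classification term.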

As with the training of the dangerous situation determination module 40, the image analysis module learning device may train a separate image analysis module for learning and apply the learning results to the image analysis module 30, or the driving situation determination and guide device 10 may itself function as the image analysis module learning device and train the image analysis module 30. Which approach is used may vary depending on the implementation conditions of the invention.

The embodiments of the present invention described above may be implemented in the form of program instructions that can be executed through various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware device may be configured to operate as one or more software modules in order to perform the processing according to the present invention, and vice versa.

Although the present invention has been described above with reference to specific matters such as specific components, and to limited embodiments and drawings, these are provided only to aid a more general understanding of the present invention. The present invention is not limited to the above embodiments, and those of ordinary skill in the art to which the present invention pertains can devise various modifications and variations from this description.

Therefore, the spirit of the present invention should not be limited to the embodiments described above, and not only the claims set out below but also everything modified equally or equivalently to those claims falls within the scope of the spirit of the present invention.

1: vehicle 10: driving situation determination and guide device
20: camera 30: image analysis module
40: dangerous situation determination module 50: driving guide generation module
60: driver terminal
150: backbone block 200: bottleneck block
300: feature pyramid network 400: instance segmentation network

Claims

In the method of determining whether the surrounding situation of the vehicle is a dangerous situation, and generating a driving guide to give an alarm,
(a) when the surrounding video image of the vehicle is obtained from at least one camera mounted on the vehicle, the driving condition determination and guide device inputs the surrounding video image to the video analysis module to cause the video analysis module to cause the surrounding image analyzing the image to output surrounding environment information including information on at least one object existing in the vicinity of the vehicle;
(b) the driving situation determination and guide device inputs the surrounding environment information to the dangerous situation determination module, and allows the dangerous situation determination module to perform deep learning operations on the surrounding environment information to correspond to each of a plurality of preset dangerous situation categories outputting predicted values as a probability of becoming the vehicle, and determining whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category with reference to the predicted values; and
(c) when the driving situation determination and guide device determines that the surrounding situation of the vehicle corresponds to the specific dangerous situation category among the plurality of preset dangerous situation categories, causes the driving guide generation module to cause the specific dangerous situation category generating at least one of specific visual driving guide information and specific voice driving guide information corresponding to , so that the driver of the vehicle can recognize a specific dangerous situation corresponding to the specific dangerous situation category;
including,
A plurality of voice guide templates in a predetermined database - Each of the voice guide templates has at least one slot, which is a part whose contents can be changed and input according to circumstances, and each of the slots includes the information about the surrounding environment. Any one of location information, size information, movement status information, and type information corresponding to each of the at least one object included in the . characterized,
Each of the plurality of preset dangerous situation categories is characterized in that it further includes information on at least one dangerous condition that can be determined as each dangerous situation,
In step (c),
A method characterized in that the driving situation determination and guide device causes the driving guide generation module to refer to (i) information on the specific dangerous situation category, (ii) information on a specific dangerous condition corresponding to the specific dangerous situation category, and (iii) information on a specific voice guide template corresponding to the specific dangerous situation category, and to fill in each of the slots included in the specific voice guide template, thereby generating the completed specific voice driving guide information, and the generated specific voice driving guide information is provided directly to the driver or transmitted to a predetermined driver's terminal to support its provision.
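The slot-filling described in this claim can be illustrated with a minimal sketch. The template wording, the category names, and the helper `fill_voice_guide` are hypothetical and do not come from the claims; the claims only require that each slot receives location, size, movement status, or type information of an object.

```python
from string import Formatter

# Hypothetical template database keyed by dangerous situation category.
# Each {name} placeholder is a "slot" in the sense of the claim.
VOICE_GUIDE_TEMPLATES = {
    "pedestrian_crossing": "Caution: a {type} is {movement} {location}.",
    "sudden_cut_in": "Caution: a {size} {type} is cutting in {location}.",
}


def fill_voice_guide(category, obj_info):
    """Fill every slot of the template for `category` from `obj_info`."""
    template = VOICE_GUIDE_TEMPLATES[category]
    # Formatter().parse yields (literal, field_name, spec, conversion) tuples;
    # field_name is None for trailing literal text, so it is filtered out.
    slots = [name for _, name, _, _ in Formatter().parse(template) if name]
    missing = [s for s in slots if s not in obj_info]
    if missing:
        raise KeyError(f"missing slot values: {missing}")
    return template.format(**{s: obj_info[s] for s in slots})


# Assumed object information taken from the surrounding environment information.
guide = fill_voice_guide(
    "pedestrian_crossing",
    {"type": "pedestrian", "movement": "moving", "location": "ahead on the right"},
)
print(guide)
```

The completed string would then be handed to whatever output path is used (direct playback or transmission to the driver's terminal).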

delete

3. The method of claim 1,
The driving guide generation module is characterized in that it is linked with a predetermined TTS (Text-To-Speech) engine,
In step (c),
A method characterized in that the driving situation determination and guide device further performs a process of causing the driving guide generation module to generate specific TTS data by applying the TTS engine to the generated specific voice driving guide information, and specific voice information reproduced from the specific TTS data is provided directly to the driver or transmitted to a predetermined driver's terminal to support its provision.

4. The method of claim 1,
Each of the plurality of preset dangerous situation categories is characterized in that it further includes information on at least one dangerous condition that can be determined as the respective dangerous situation,
In step (b),
A method characterized in that the driving situation determination and guide device causes the dangerous situation determination module to further perform a process of referring to (i) information on the specific dangerous situation category, (ii) information on a specific dangerous condition corresponding to the specific dangerous situation category, and (iii) the surrounding environment information, specifying, among the at least one object included in the surrounding video image, a risk object corresponding to the specific dangerous condition, and generating risk factor information including information on the risk object.
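As a rough illustration of specifying risk objects, the sketch below filters detected objects with a per-category condition. The `DetectedObject` fields mirror the location, size, movement status, and type information named in the claims, but the concrete condition and all names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    type: str        # type information, e.g. "pedestrian"
    location: tuple  # (x, y) centre in image pixels
    size: tuple      # (width, height) in pixels
    moving: bool     # movement status information


# Hypothetical dangerous conditions, one predicate per dangerous situation category.
DANGER_CONDITIONS = {
    "pedestrian_crossing": lambda o: o.type == "pedestrian" and o.moving,
}


def specify_risk_objects(category, objects):
    """Return risk factor information: the objects that satisfy the category's condition."""
    condition = DANGER_CONDITIONS[category]
    return {"category": category,
            "risk_objects": [o for o in objects if condition(o)]}


surroundings = [
    DetectedObject("pedestrian", (410, 220), (40, 90), True),
    DetectedObject("pedestrian", (80, 230), (35, 85), False),
    DetectedObject("car", (300, 200), (120, 80), True),
]
risk_factor_info = specify_risk_objects("pedestrian_crossing", surroundings)
print(len(risk_factor_info["risk_objects"]))
```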

5. The method of claim 4,
In step (c),
A method characterized in that the driving situation determination and guide device causes the driving guide generation module to receive the risk factor information and generate, as the specific visual driving guide information, an image in which a preset danger guide signal is additionally displayed on the surrounding video image at the coordinates corresponding to each risk object, and the generated specific visual driving guide information is provided directly to the driver or transmitted to a predetermined driver's terminal to support its provision.

In the method of determining whether the surrounding situation of the vehicle is a dangerous situation, and generating a driving guide to give an alarm,
(a) when a surrounding video image of the vehicle is obtained from at least one camera mounted on the vehicle, the driving situation determination and guide device inputs the surrounding video image to the video analysis module, causing the video analysis module to analyze the surrounding video image and output surrounding environment information including information on at least one object existing in the vicinity of the vehicle;
(b) the driving situation determination and guide device inputs the surrounding environment information to the dangerous situation determination module, causing the dangerous situation determination module to perform a deep learning operation on the surrounding environment information to output predicted values as probabilities of corresponding to each of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category; and
(c) when the surrounding situation of the vehicle is determined to correspond to the specific dangerous situation category among the plurality of preset dangerous situation categories, the driving situation determination and guide device causes the driving guide generation module to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle can recognize a specific dangerous situation corresponding to the specific dangerous situation category;
including,
Before step (a),
A predetermined dangerous situation determination module learning device performs learning of the dangerous situation determination module either (i) by using, as learning data, at least one first surrounding video image for learning that is obtained from the camera mounted on the vehicle or separately prepared, wherein each first surrounding video image for learning includes, as a first GT (Ground Truth), information on the specific correct-answer dangerous situation category corresponding to it among the preset dangerous situation categories, causing the image analysis module to receive and analyze each first surrounding video image for learning to output first surrounding environment information for learning, and using the output first surrounding environment information for learning, or (ii) by using, as learning data, at least one separately prepared second surrounding environment information for learning, wherein each second surrounding environment information for learning includes, as a second GT (Ground Truth), information on the specific correct-answer dangerous situation category corresponding to it,
A method characterized in that, in the learning of the dangerous situation determination module, the dangerous situation determination module learning device (i) causes the dangerous situation determination module to receive the first surrounding environment information for learning and perform the predetermined deep learning operation, compares the resulting first prediction values for learning, which are probabilities of corresponding to each of the plurality of preset dangerous situation categories, with the first GT, and optimizes a plurality of parameters included in the dangerous situation determination module so that the difference is minimized, or (ii) causes the dangerous situation determination module to receive the second surrounding environment information for learning and perform the predetermined deep learning operation, compares the resulting second prediction values for learning, which are probabilities of corresponding to each of the plurality of preset dangerous situation categories, with the second GT, and optimizes a plurality of parameters included in the dangerous situation determination module so that the difference is minimized.
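The learning described here (predict per-category probabilities, compare with the GT, adjust parameters to reduce the difference) can be sketched with a toy model. The claims do not specify the network or the loss; the linear softmax classifier, the cross-entropy gradient, the feature vectors, and the hyperparameters below are all assumptions chosen to keep the example self-contained.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def predict(W, b, x):
    """Return one probability per dangerous situation category."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)


def train(samples, num_categories, lr=0.5, epochs=200):
    """Stochastic gradient descent on cross-entropy between predictions and GT."""
    dim = len(samples[0][0])
    W = [[0.0] * dim for _ in range(num_categories)]
    b = [0.0] * num_categories
    for _ in range(epochs):
        for x, gt in samples:
            p = predict(W, b, x)
            for k in range(num_categories):
                # Gradient of cross-entropy w.r.t. logit k: p_k - onehot(gt)_k
                grad = p[k] - (1.0 if k == gt else 0.0)
                for i in range(dim):
                    W[k][i] -= lr * grad * x[i]
                b[k] -= lr * grad
    return W, b


# Assumed toy "surrounding environment information" vectors with GT category indices.
samples = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 1), ([0.1, 0.9], 1)]
W, b = train(samples, num_categories=2)
probs = predict(W, b, [1.0, 0.0])
print(probs.index(max(probs)))
```

At inference time the same `predict` output would serve as the predicted values from which a specific dangerous situation category is selected.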

delete

In the method of determining whether the surrounding situation of the vehicle is a dangerous situation, and generating a driving guide to give an alarm,
(a) when a surrounding video image of the vehicle is obtained from at least one camera mounted on the vehicle, the driving situation determination and guide device inputs the surrounding video image to the video analysis module, causing the video analysis module to analyze the surrounding video image and output surrounding environment information including information on at least one object existing in the vicinity of the vehicle;
(b) the driving situation determination and guide device inputs the surrounding environment information to the dangerous situation determination module, causing the dangerous situation determination module to perform a deep learning operation on the surrounding environment information to output predicted values as probabilities of corresponding to each of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category; and
(c) when the surrounding situation of the vehicle is determined to correspond to the specific dangerous situation category among the plurality of preset dangerous situation categories, the driving situation determination and guide device causes the driving guide generation module to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle can recognize a specific dangerous situation corresponding to the specific dangerous situation category;
including,
In step (a),
The driving situation determination and guide device causes the image analysis module to detect each of the at least one object included in the surrounding video image by using a predetermined algorithm, and to output, as the surrounding environment information, information including at least part of location information, size information, movement status information, and type information of each detected object,
Before step (a),
A predetermined image analysis module learning device performs learning of the image analysis module based on the predetermined real-time object detection algorithm or the predetermined real-time instance segmentation algorithm,
A method characterized in that, in the learning of the image analysis module, the image analysis module learning device uses, as learning data, at least one second surrounding video image for learning that is obtained from the camera mounted on the vehicle or separately prepared, wherein each second surrounding video image for learning includes, as a third GT (Ground Truth), correct-answer object information for each of at least one learning object included in it, causes the image analysis module to receive the second surrounding video image for learning and analyze it using the predetermined real-time object detection algorithm or the predetermined real-time instance segmentation algorithm, compares the resulting information on each learning object with the third GT, and optimizes a plurality of parameters included in the image analysis module so that the difference is minimized.

In the method of determining whether the surrounding situation of the vehicle is a dangerous situation, and generating a driving guide to give an alarm,
(a) when a surrounding video image of the vehicle is obtained from at least one camera mounted on the vehicle, the driving situation determination and guide device inputs the surrounding video image to the video analysis module, causing the video analysis module to analyze the surrounding video image and output surrounding environment information including information on at least one object existing in the vicinity of the vehicle;
(b) the driving situation determination and guide device inputs the surrounding environment information to the dangerous situation determination module, causing the dangerous situation determination module to perform a deep learning operation on the surrounding environment information to output predicted values as probabilities of corresponding to each of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category; and
(c) when the surrounding situation of the vehicle is determined to correspond to the specific dangerous situation category among the plurality of preset dangerous situation categories, the driving situation determination and guide device causes the driving guide generation module to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle can recognize a specific dangerous situation corresponding to the specific dangerous situation category;
including,
In step (a),
The driving situation determination and guide device causes the image analysis module to detect each of the at least one object included in the surrounding video image by using a predetermined algorithm, and to output, as the surrounding environment information, information including at least part of location information, size information, movement status information, and type information of each detected object,
The driving situation determination and guide device causes the image analysis module to apply a predetermined real-time object detection algorithm or a predetermined real-time instance segmentation algorithm to the surrounding video image, thereby detecting each of the at least one object included in the surrounding video image or performing instance segmentation of the surrounding video image,
The step (a) is,
(a1) when the surrounding video image is acquired, the driving situation determination and guide device causes the image analysis module to input the surrounding video image to a backbone block of ResNet, causing the backbone block to output first to m-th down-sampling feature maps, where m is an integer greater than or equal to 2, obtained by sequentially down-sampling the surrounding video image through convolution operations;
(a2) the driving situation determination and guide device causes the image analysis module to perform (i) a first process of inputting a specific down-sampling feature map to a first (1*1) convolutional layer, causing the first (1*1) convolutional layer to perform a (1*1) convolution operation on the specific down-sampling feature map to generate a first feature map with an adjusted number of channels, and inputting the first feature map to each of a first (k*k) convolutional layer, which includes a (k*k) kernel with an expansion ratio of (1*r), to an n-th (k*k) convolutional layer, which includes a (k*k) kernel with an expansion ratio of (n*r), where r is an integer greater than or equal to 1, k is an integer greater than or equal to 2, and n is an integer greater than or equal to 2, causing each of the first to n-th (k*k) convolutional layers to divide the channels of the first feature map into at least two groups and to perform, on each part of the first feature map corresponding to the at least two groups, the (k*k) convolution operation with the (1*r) expansion ratio to the (k*k) convolution operation with the (n*r) expansion ratio, thereby generating a 2_1-th feature map to a 2_n-th feature map, and (ii) a second process of inputting the 2_1-th to 2_n-th feature maps to a 2_1-th (1*1) convolutional layer to a 2_n-th (1*1) convolutional layer, respectively, causing each of the 2_1-th to 2_n-th (1*1) convolutional layers to perform a (1*1) convolution operation on the corresponding one of the 2_1-th to 2_n-th feature maps to generate a 3_1-th feature map to a 3_n-th feature map with an adjusted number of channels, concatenating the 3_1-th to 3_n-th feature maps and inputting them to a third (1*1) convolutional layer, causing the third (1*1) convolutional layer to perform a (1*1) convolution operation on the concatenated 3_1-th to 3_n-th feature maps to generate a fourth feature map with an adjusted number of channels, and concatenating the specific down-sampling feature map and the fourth feature map to generate a transformed down-sampling feature map,
and applies the first process and the second process to each of the m-th down-sampling feature map to the (m-j)-th down-sampling feature map, where j is an integer greater than or equal to 1 and less than m, thereby generating an m-th transformed down-sampling feature map to an (m-j)-th transformed down-sampling feature map; and
(a3) the driving situation determination and guide device causes the image analysis module to input the m-th transformed down-sampling feature map to the (m-j)-th transformed down-sampling feature map into a feature pyramid network, causing the feature pyramid network to generate an m-th up-sampling feature map to an (m-j)-th up-sampling feature map through deconvolution operations that refer to the m-th to (m-j)-th transformed down-sampling feature maps, and to input the (m-j)-th up-sampling feature map to an object detection network or an instance segmentation network, causing the object detection network to detect objects in the surrounding video image or causing the instance segmentation network to perform instance segmentation of the surrounding video image;
A method comprising the above steps.
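The parallel (k*k) convolutions with expansion ratios (1*r) to (n*r) can only be concatenated if their outputs share one spatial size. Assuming the expansion ratio corresponds to what deep learning libraries call the dilation rate (an interpretation, not stated in the claims), the standard dilated-convolution arithmetic below shows the padding each branch would need. The values k = 3, r = 2, n = 3 and the 32-pixel input are illustrative only, and the padding formula assumes an odd k.

```python
def effective_kernel(k: int, dilation: int) -> int:
    """Span covered by a k*k kernel whose taps are `dilation` pixels apart."""
    return k + (k - 1) * (dilation - 1)


def same_padding(k: int, dilation: int) -> int:
    """Padding per side that preserves spatial size at stride 1 (odd k assumed)."""
    return (effective_kernel(k, dilation) - 1) // 2


def conv_output_size(size: int, k: int, dilation: int, padding: int, stride: int = 1) -> int:
    return (size + 2 * padding - effective_kernel(k, dilation)) // stride + 1


k, r, n = 3, 2, 3  # illustrative values, not taken from the claims
branches = []
for i in range(1, n + 1):
    d = i * r  # expansion ratio of the i-th (k*k) convolutional layer
    branches.append((d, effective_kernel(k, d), same_padding(k, d)))

for d, k_eff, pad in branches:
    print(f"dilation={d} effective_kernel={k_eff} padding={pad} "
          f"out={conv_output_size(32, k, d, pad)}")
```

With these paddings every branch keeps the input's spatial size, so the 3_1-th to 3_n-th feature maps can be concatenated along the channel axis.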

10. The method of claim 9,
In step (a3),
A method characterized in that the image analysis module causes the feature pyramid network to down-sample the m-th up-sampling feature map to generate an (m+j)-th up-sampling feature map, and additionally inputs the (m-j)-th up-sampling feature map to the (m+j)-th up-sampling feature map to the instance segmentation network, causing the instance segmentation network to perform instance segmentation of the surrounding video image with further reference to the (m-j)-th to (m+j)-th up-sampling feature maps.

11. The method of claim 9,
In step (a3),
A method characterized in that the image analysis module causes the feature pyramid network to generate a specific up-sampling feature map by up-sampling an i-th up-sampling feature map, where i is an integer greater than or equal to (m-(j+1)) and less than or equal to m, and to concatenate the specific up-sampling feature map with the corresponding (i-1)-th down-sampling feature map to generate an (i-1)-th up-sampling feature map.
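The up-sample-then-concatenate step of this claim can be shown on tiny feature maps stored as nested lists in channel, row, column order. Nearest-neighbour 2x up-sampling is an assumption made for brevity (claim 9 speaks of deconvolution operations), and the sample values are invented.

```python
def upsample_nearest_2x(fmap):
    """Double height and width of a [channel][row][col] feature map."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def concat_channels(a, b):
    """Concatenate two feature maps along the channel axis."""
    if len(a[0]) != len(b[0]) or len(a[0][0]) != len(b[0][0]):
        raise ValueError("spatial sizes must match before concatenation")
    return a + b


# i-th up-sampling feature map: 1 channel, 2x2 (invented values).
up_i = [[[1, 2], [3, 4]]]
# Corresponding (i-1)-th down-sampling feature map: 1 channel, 4x4.
down_prev = [[[0] * 4 for _ in range(4)]]

specific_up = upsample_nearest_2x(up_i)             # the "specific up-sampling feature map"
up_prev = concat_channels(specific_up, down_prev)   # (i-1)-th up-sampling feature map
print(len(up_prev), len(up_prev[0]), len(up_prev[0][0]))
```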

In a device for determining whether the surrounding situation of a vehicle is a dangerous situation and generating a driving guide to give an alert,
at least one memory storing instructions; and
at least one processor configured to execute the instructions; including,
the processor performs the following processes:
(I) when a surrounding video image of the vehicle is obtained from at least one camera mounted on the vehicle, a process of inputting the surrounding video image to the video analysis module, causing the video analysis module to analyze the surrounding video image and output surrounding environment information including information on at least one object existing in the vicinity of the vehicle;
(II) a process of inputting the surrounding environment information to the dangerous situation determination module, causing the dangerous situation determination module to perform a deep learning operation on the surrounding environment information to output predicted values as probabilities of corresponding to each of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category; and
(III) when the surrounding situation of the vehicle is determined to correspond to the specific dangerous situation category among the plurality of preset dangerous situation categories, a process of causing the driving guide generation module to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle can recognize a specific dangerous situation corresponding to the specific dangerous situation category;
wherein
A plurality of voice guide templates are stored in a predetermined database, wherein each of the voice guide templates has at least one slot, which is a part whose contents can be changed and filled in according to circumstances, and each of the slots receives any one of location information, size information, movement status information, and type information corresponding to each of the at least one object included in the surrounding environment information,
Each of the plurality of preset dangerous situation categories is characterized in that it further includes information on at least one dangerous condition that can be determined as each dangerous situation,
In the process (III) above,
A device characterized in that the processor causes the driving guide generation module to refer to (i) information on the specific dangerous situation category, (ii) information on a specific dangerous condition corresponding to the specific dangerous situation category, and (iii) information on a specific voice guide template corresponding to the specific dangerous situation category, and to fill in each of the slots included in the specific voice guide template, thereby generating the completed specific voice driving guide information, and the generated specific voice driving guide information is provided directly to the driver or transmitted to a predetermined driver's terminal to support its provision.

delete

14. The device of claim 12,
The driving guide generation module is characterized in that it is linked with a predetermined TTS (Text-To-Speech) engine,
In the process (III) above,
A device characterized in that the processor further performs a process of causing the driving guide generation module to generate specific TTS data by applying the TTS engine to the generated specific voice driving guide information, and specific voice information reproduced from the specific TTS data is provided directly to the driver or transmitted to a predetermined driver's terminal to support its provision.

15. The device of claim 12,
Each of the plurality of preset dangerous situation categories is characterized in that it further includes information on at least one dangerous condition that can be determined as the respective dangerous situation,
In the process (II) above,
A device characterized in that the processor causes the dangerous situation determination module to further perform a process of referring to (i) information on the specific dangerous situation category, (ii) information on a specific dangerous condition corresponding to the specific dangerous situation category, and (iii) the surrounding environment information, specifying, among the at least one object included in the surrounding video image, a risk object corresponding to the specific dangerous condition, and generating risk factor information including information on the risk object.

16. The device of claim 15,
In the process (III) above,
A device characterized in that the processor causes the driving guide generation module to receive the risk factor information and generate, as the specific visual driving guide information, an image in which a preset danger guide signal is additionally displayed on the surrounding video image at the coordinates corresponding to each risk object, and the generated specific visual driving guide information is provided directly to the driver or transmitted to a predetermined driver's terminal to support its provision.

In a device for determining whether the surrounding situation of a vehicle is a dangerous situation and generating a driving guide to give an alert,
at least one memory storing instructions; and
at least one processor configured to execute the instructions; including,
the processor performs the following processes:
(I) when a surrounding video image of the vehicle is obtained from at least one camera mounted on the vehicle, a process of inputting the surrounding video image to the video analysis module, causing the video analysis module to analyze the surrounding video image and output surrounding environment information including information on at least one object existing in the vicinity of the vehicle;
(II) a process of inputting the surrounding environment information to the dangerous situation determination module, causing the dangerous situation determination module to perform a deep learning operation on the surrounding environment information to output predicted values as probabilities of corresponding to each of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category; and
(III) when the surrounding situation of the vehicle is determined to correspond to the specific dangerous situation category among the plurality of preset dangerous situation categories, a process of causing the driving guide generation module to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle can recognize a specific dangerous situation corresponding to the specific dangerous situation category;
wherein
Prior to the (I) process,
A predetermined dangerous situation determination module learning device performs learning of the dangerous situation determination module either (i) by using, as learning data, at least one first surrounding video image for learning that is obtained from the camera mounted on the vehicle or separately prepared, wherein each first surrounding video image for learning includes, as a first GT (Ground Truth), information on the specific correct-answer dangerous situation category corresponding to it among the preset dangerous situation categories, causing the image analysis module to receive and analyze each first surrounding video image for learning to output first surrounding environment information for learning, and using the output first surrounding environment information for learning, or (ii) by using, as learning data, at least one separately prepared second surrounding environment information for learning, wherein each second surrounding environment information for learning includes, as a second GT (Ground Truth), information on the specific correct-answer dangerous situation category corresponding to it,
A device characterized in that, in the learning of the dangerous situation determination module, the dangerous situation determination module learning device (i) causes the dangerous situation determination module to receive the first surrounding environment information for learning and perform the predetermined deep learning operation, compares the resulting first prediction values for learning, which are probabilities of corresponding to each of the plurality of preset dangerous situation categories, with the first GT, and optimizes a plurality of parameters included in the dangerous situation determination module so that the difference is minimized, or (ii) causes the dangerous situation determination module to receive the second surrounding environment information for learning and perform the predetermined deep learning operation, compares the resulting second prediction values for learning, which are probabilities of corresponding to each of the plurality of preset dangerous situation categories, with the second GT, and optimizes a plurality of parameters included in the dangerous situation determination module so that the difference is minimized.

delete

In a device for determining whether the surrounding situation of a vehicle is a dangerous situation and generating a driving guide to give an alert,
at least one memory storing instructions; and
at least one processor configured to execute the instructions; including,
the processor performs the following processes:
(I) when a surrounding video image of the vehicle is obtained from at least one camera mounted on the vehicle, a process of inputting the surrounding video image to the video analysis module, causing the video analysis module to analyze the surrounding video image and output surrounding environment information including information on at least one object existing in the vicinity of the vehicle;
(II) a process of inputting the surrounding environment information to the dangerous situation determination module, causing the dangerous situation determination module to perform a deep learning operation on the surrounding environment information to output predicted values as probabilities of corresponding to each of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category; and
(III) when the surrounding situation of the vehicle is determined to correspond to the specific dangerous situation category among the plurality of preset dangerous situation categories, a process of causing the driving guide generation module to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle can recognize a specific dangerous situation corresponding to the specific dangerous situation category;
wherein
In the process (I) above,
The processor causes the image analysis module to detect each of the at least one object included in the surrounding video image by using a predetermined algorithm, and to output, as the surrounding environment information, information including at least part of location information, size information, movement status information, and type information of each detected object,
Prior to the (I) process,
A predetermined image analysis module learning device performs learning of the image analysis module based on the predetermined real-time object detection algorithm or the predetermined real-time instance segmentation algorithm,
A device characterized in that, in the learning of the image analysis module, the image analysis module learning device uses, as learning data, at least one second surrounding video image for learning that is obtained from the camera mounted on the vehicle or separately prepared, wherein each second surrounding video image for learning includes, as a third GT (Ground Truth), correct-answer object information for each of at least one learning object included in it, causes the image analysis module to receive the second surrounding video image for learning and analyze it using the predetermined real-time object detection algorithm or the predetermined real-time instance segmentation algorithm, compares the resulting information on each learning object with the third GT, and optimizes a plurality of parameters included in the image analysis module so that the difference is minimized.

In a device for determining whether the surrounding situation of a vehicle is a dangerous situation and generating a driving guide to give an alert,
at least one memory storing instructions; and
at least one processor configured to execute the instructions; including,
the processor performs the following processes:
(I) when a surrounding video image of the vehicle is obtained from at least one camera mounted on the vehicle, a process of inputting the surrounding video image to the video analysis module, causing the video analysis module to analyze the surrounding video image and output surrounding environment information including information on at least one object existing in the vicinity of the vehicle;
(II) a process of inputting the surrounding environment information to the dangerous situation determination module, causing the dangerous situation determination module to perform a deep learning operation on the surrounding environment information to output predicted values as probabilities of corresponding to each of a plurality of preset dangerous situation categories, and to determine, with reference to the predicted values, whether the surrounding situation of the vehicle corresponds to a specific dangerous situation category; and
(III) when the surrounding situation of the vehicle is determined to correspond to the specific dangerous situation category among the plurality of preset dangerous situation categories, a process of causing the driving guide generation module to generate at least one of specific visual driving guide information and specific voice driving guide information corresponding to the specific dangerous situation category, so that the driver of the vehicle can recognize a specific dangerous situation corresponding to the specific dangerous situation category;
wherein
in the process (I) above,
the processor causes the image analysis module to detect each of the at least one object included in the surrounding video image using a predetermined algorithm, and to output, as the surrounding environment information, information including at least part of the position information, size information, movement status information, and type information of each detected object,
the processor causes the image analysis module to use a predetermined real-time object detection algorithm or a predetermined real-time instance segmentation algorithm to detect each of the at least one object or to instance-segment the surrounding video image, and
the process (I) comprises:
(I-1) a sub-process in which, when the surrounding video image is obtained, the processor causes the image analysis module to input the surrounding video image to a ResNet backbone block, so that the backbone block sequentially applies convolution operations to the surrounding video image and outputs a down-sampled first down-sampling feature map to an m-th down-sampling feature map, where m is an integer greater than or equal to 2;
(I-2) a sub-process in which the processor causes the image analysis module to generate an m-th transformed down-sampling feature map to an (m-j)-th transformed down-sampling feature map through a first process and a second process, wherein, in the first process, a specific down-sampling feature map is input to a first (1*1) convolutional layer so that the first (1*1) convolutional layer performs a (1*1) convolution operation on the specific down-sampling feature map to generate a first feature map with an adjusted number of channels, and the first feature map is input to each of a first (k*k) convolutional layer, which includes a (k*k) kernel having an expansion (dilation) ratio of (1*r), to an n-th (k*k) convolutional layer, which includes a (k*k) kernel having an expansion ratio of (n*r), where r is an integer greater than or equal to 1, k is an integer greater than or equal to 2, and n is an integer greater than or equal to 2, so that each of the first to n-th (k*k) convolutional layers divides the channels of the first feature map into at least two groups and performs, on each of the first feature maps corresponding to the at least two groups, a (k*k) convolution operation at its expansion ratio of (1*r) to (n*r), thereby generating a 2_1-th feature map to a 2_n-th feature map; and, in the second process, the 2_1-th to 2_n-th feature maps are input to a 2_1-th (1*1) convolutional layer to a 2_n-th (1*1) convolutional layer, respectively, so that the 2_1-th to 2_n-th (1*1) convolutional layers perform a (1*1) convolution operation on each of the 2_1-th to 2_n-th feature maps to generate a 3_1-th feature map to a 3_n-th feature map with an adjusted number of channels, the 3_1-th to 3_n-th feature maps are concatenated and input to a third (1*1) convolutional layer so that the third (1*1) convolutional layer performs a (1*1) convolution operation on the concatenated 3_1-th to 3_n-th feature maps to generate a fourth feature map with an adjusted number of channels, and the specific down-sampling feature map and the fourth feature map are concatenated to generate a transformed down-sampling feature map; and
(I-3) a sub-process in which the processor causes the image analysis module to input the m-th transformed down-sampling feature map to the (m-j)-th transformed down-sampling feature map to a feature pyramid network, so that the feature pyramid network generates an m-th up-sampling feature map to an (m-j)-th up-sampling feature map through deconvolution operations that refer to the m-th to (m-j)-th transformed down-sampling feature maps, and to input the m-th up-sampling feature map to the (m-j)-th up-sampling feature map to an object detection network or an instance segmentation network, so that the object detection network detects objects on the surrounding video image or the instance segmentation network instance-segments the surrounding video image. An apparatus characterized by the above.
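The transformation in sub-process (I-2) can be sketched in plain Python as follows: a (1*1) convolution, n parallel (k*k) convolutions at expansion ratios 1*r to n*r, a (1*1) convolution per branch, concatenation, a final (1*1) convolution, and concatenation with the input map. This is a minimal sketch under stated assumptions, not the claimed implementation: feature maps are lists of 2D channels, all weights are fixed averaging weights, k is assumed odd so that the output keeps the input size with zero padding, and the channel grouping is taken to its depthwise extreme (one group per channel). The function names and the `mid_c` parameter are hypothetical.

```python
def conv1x1(fmap, weights):
    """Mix channels pixel-wise; weights[o][c] maps input channel c to output o."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [[[sum(wo[c] * fmap[c][y][x] for c in range(len(fmap)))
              for x in range(w)] for y in range(h)] for wo in weights]

def dilated_conv(channel, kernel, dilation):
    """Zero-padded, size-preserving k*k convolution of one channel (odd k)."""
    h, w, k = len(channel), len(channel[0]), len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spread apart by the dilation rate.
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * channel[yy][xx]
            out[y][x] = acc
    return out

def mean_weights(out_c, in_c):
    """Placeholder (1*1) weights: every output channel averages the inputs."""
    return [[1.0 / in_c] * in_c for _ in range(out_c)]

def transform_block(x, n=3, r=1, k=3, mid_c=2):
    """Return the input map concatenated with the fourth feature map."""
    kernel = [[1.0 / (k * k)] * k for _ in range(k)]
    f1 = conv1x1(x, mean_weights(mid_c, len(x)))             # first feature map
    concat = []
    for i in range(1, n + 1):
        # Assumed depthwise grouping: each channel forms its own group.
        f2 = [dilated_conv(ch, kernel, i * r) for ch in f1]  # 2_i-th feature map
        f3 = conv1x1(f2, mean_weights(mid_c, mid_c))         # 3_i-th feature map
        concat.extend(f3)                                    # channel concatenation
    f4 = conv1x1(concat, mean_weights(mid_c, len(concat)))   # fourth feature map
    return x + f4  # concatenate along the channel axis

x = [[[1.0] * 5 for _ in range(5)] for _ in range(4)]  # 4 channels, 5x5
y = transform_block(x)                                  # 4 + mid_c channels
```

Because every branch preserves the spatial size, the branches differ only in receptive field, which is what allows them to be concatenated channel-wise before the final (1*1) convolution.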

21. The apparatus of claim 20,
in the process (I-3) above,
the image analysis module causes the feature pyramid network to down-sample the m-th up-sampling feature map to generate an (m+j)-th up-sampling feature map, and additionally inputs the (m+j)-th up-sampling feature map to the instance segmentation network, so that the instance segmentation network further refers to the (m-j)-th up-sampling feature map to the (m+j)-th up-sampling feature map when instance-segmenting the surrounding video image. An apparatus characterized by the above.
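One possible reading of this claim is that coarser pyramid levels are derived from the m-th up-sampling feature map by repeated down-sampling, so that the segmentation network sees levels (m-j) through (m+j). The sketch below follows that reading with single-channel maps. The stride-2 subsampling operator, the generation of every intermediate level from (m+1) to (m+j), and the names `downsample` and `extend_pyramid` are assumptions; the claim does not specify the down-sampling operator.

```python
def downsample(channel, stride=2):
    """Keep every `stride`-th pixel in both directions (assumed operator)."""
    return [row[::stride] for row in channel[::stride]]

def extend_pyramid(levels, m, j):
    """levels maps a level index to a 2D map; adds levels m+1 .. m+j."""
    out = dict(levels)
    for t in range(1, j + 1):
        # Each new level is derived from the previous, coarser-by-one level.
        out[m + t] = downsample(out[m + t - 1])
    return out

def make_map(size):
    """Toy single-channel map filled with a simple coordinate pattern."""
    return [[float(y * size + x) for x in range(size)] for y in range(size)]

m, j = 4, 2
levels = {2: make_map(32), 3: make_map(16), 4: make_map(8)}  # levels m-j .. m
pyramid = extend_pyramid(levels, m, j)                        # levels m-j .. m+j
```

With m = 4 and j = 2, the pyramid gains a 4x4 level 5 and a 2x2 level 6 on top of the existing levels 2 to 4.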

21. The apparatus of claim 20,
in the process (I-3) above,
the image analysis module causes the feature pyramid network to generate a specific up-sampling feature map by up-sampling an i-th up-sampling feature map, where i is an integer greater than or equal to (m-(j+1)) and less than or equal to m, and to concatenate the specific up-sampling feature map with the corresponding (i-1)-th down-sampling feature map to generate an (i-1)-th up-sampling feature map.
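The top-down step in this claim (up-sample the i-th up-sampling feature map, then concatenate it with the (i-1)-th down-sampling feature map) can be sketched as below. Note that this sketch swaps in nearest-neighbour 2x up-sampling for the deconvolution operation mentioned in process (I-3), purely to stay dependency-free; the factor of 2 between adjacent levels and the function names are likewise assumptions.

```python
def upsample2x(channel):
    """Nearest-neighbour 2x up-sampling of one 2D channel."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]  # repeat each pixel horizontally
        out.append(wide)
        out.append(list(wide))                      # repeat each row vertically
    return out

def top_down_step(up_i, down_prev):
    """Build the (i-1)-th up-sampling map from the i-th one.

    up_i:      list of channels at the coarser resolution (level i)
    down_prev: list of channels of the (i-1)-th down-sampling map
    """
    specific = [upsample2x(ch) for ch in up_i]      # specific up-sampling map
    same_h = len(specific[0]) == len(down_prev[0])
    same_w = len(specific[0][0]) == len(down_prev[0][0])
    if not (same_h and same_w):
        raise ValueError("spatial sizes must match before concatenation")
    return specific + down_prev                     # channel-wise concatenation

up_i = [[[1.0, 2.0], [3.0, 4.0]]]                   # 1 channel, 2x2
down_prev = [[[0.0] * 4 for _ in range(4)]]         # 1 channel, 4x4
up_prev = top_down_step(up_i, down_prev)            # 2 channels, 4x4
```

Applying the step repeatedly from i = m downward yields the successive finer up-sampling feature maps.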