KR20200101509A - Method and apparatus for detecting traffic light

Info

Publication number: KR20200101509A
Application number: KR1020190011788A
Authority: KR
Inventors: 선우명호; 장철훈
Original assignee: (주)컨트롤웍스
Priority date: 2019-01-30
Filing date: 2019-01-30
Publication date: 2020-08-28
Also published as: KR102178202B1

Abstract

Disclosed is a traffic light detection method. According to one embodiment of the present invention, the traffic light detection method comprises: an image conversion step in which a first learning model converts an input image capturing a traffic light and its surrounding environment into a low-channel converted image composed of at least a first signal color, a second signal color, and a background color; a candidate region selection step in which a second learning model selects at least one traffic light candidate region based on a plurality of first color groups having the first signal color and a plurality of second color groups having the second signal color in the low-channel converted image; and a traffic light state determination step in which a third learning model determines a state of the traffic light based on partial images of the input image corresponding to the traffic light candidate region. According to another embodiment of the present invention, a traffic light detection device is provided which performs traffic light detection by the traffic light detection method.

Description

Method and apparatus for detecting traffic light

Technical field

The present disclosure relates to a traffic light detection method and apparatus.

The content described in this section merely provides background information on the present disclosure and does not constitute prior art.

Deep learning is a kind of machine learning that uses multiple nonlinear processing layers to learn useful representative features directly from data. Meanwhile, there is a growing emphasis on the need to develop technology that uses machine learning or deep learning to recognize and analyze objects based on image data collected from smartphones, CCTV, dashboard cameras (black boxes), satellite imagery, and the like, and to put the results to use.

Technology for recognizing surrounding road conditions in real time is being developed for autonomous driving. In particular, this technology needs to detect in real time the events that occur while a vehicle is driving on the road, for example the driving state of adjacent vehicles, whether an accident has occurred, and the road condition, and to adjust the driving state of the vehicle in response to the detected event or notify the driver of it.

However, the road conditions around a vehicle change continuously with location and driving time, and the data on the external environment captured by a camera also depends on the weather, the direction of the sun, the time of driving (day or night), and so on. Reliably detecting the vehicle's surroundings, in particular the position or state of traffic lights, while taking all of these dynamic variables into account is therefore not an easy task.

Accordingly, attempts are being made to apply the deep learning described above to recognition of the vehicle's surroundings, in particular to traffic light detection.

However, for the detection of a traffic light and the follow-up actions, detection must be reliable even at a considerable distance, for example 50 to 100 meters from the traffic light. Because traffic light detection relies on external environment data perceived through a camera, that is, image data, identifying a traffic light at a considerable distance requires a high-resolution image. As the resolution of the image used for identification increases, the amount of computation for detection increases, which can degrade the real-time performance of traffic light recognition. This also creates the difficulty of having to increase the processing speed or the cost of the system.

In addition, in a high-resolution image of the entire surroundings, the area or number of pixels occupied by a traffic light is relatively very small. This can be interpreted as meaning that changes in the traffic light's image, in particular its color changes, have little effect on the characteristics of the entire image. That is, training a traffic light detection model that is both reliable and sensitive to the position of the traffic light and to changes in its state is not an easy problem.

Accordingly, a main object of the present invention is to provide a traffic light detection method and apparatus that, in recognizing the position and state of traffic lights on the road while a vehicle is driving, optimizes the traffic light recognition model to improve applicability to embedded systems and minimizes the amount of computation to secure real-time performance when applied to an actual vehicle.

Another main object of the present invention is to provide a traffic light detection method and apparatus capable of increasing traffic light recognition accuracy through supervised learning.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

According to an embodiment of the present invention, there is provided a traffic light detection method including: an image conversion step in which a first learning model converts an input image capturing a traffic light and its surrounding environment into a low-channel converted image composed of at least a first signal color, a second signal color, and a background color; a candidate region selection step in which a second learning model selects one or more traffic light candidate regions based on a plurality of first color groups having the first signal color and a plurality of second color groups having the second signal color in the low-channel converted image; and a traffic light state determination step in which a third learning model determines the state of the traffic light based on partial images of the input image corresponding to the traffic light candidate regions.

According to another embodiment of the present invention, there is provided a traffic light detection apparatus that performs traffic light detection by the above traffic light detection method.

FIG. 1 is a flowchart showing each process performed in a traffic light detection method and apparatus according to an embodiment of the present invention.
FIG. 2 is a conceptual diagram illustrating each process performed in a traffic light detection method and apparatus according to an embodiment of the present invention.
FIG. 3 is a flowchart showing the detailed process of the image conversion step according to an embodiment of the present invention.
FIG. 4 shows an exemplary input image according to an embodiment of the present invention.
FIG. 5 shows an exemplary low-channel converted image according to an embodiment of the present invention.
FIG. 6 is a flowchart showing the training process of the first learning model in an embodiment of the present invention.
FIG. 7 is a flowchart showing the detailed process of the candidate region selection step according to an embodiment of the present invention.
FIG. 8 shows color group correspondence regions, corresponding to a plurality of color groups, marked on an input image according to an embodiment of the present invention.
FIG. 9 shows traffic light candidate regions marked on an input image according to an embodiment of the present invention.
FIG. 10 is a flowchart showing the training process of the second learning model in an embodiment of the present invention.
FIG. 11 is a flowchart showing the detailed process of the traffic light state determination step according to an embodiment of the present invention.
FIG. 12 shows the traffic light state of a traffic light candidate region being detected in the traffic light state determination step according to an embodiment of the present invention.
FIG. 13 is a flowchart showing the training process of the third learning model in an embodiment of the present invention.

Hereinafter, some embodiments of the present invention will be described in detail with reference to exemplary drawings. In adding reference numerals to the components of each drawing, it should be noted that the same components are given the same numerals wherever possible, even when they appear in different drawings. In addition, in describing the present invention, a detailed description of a related known configuration or function is omitted when it is judged that it could obscure the gist of the present invention.

In describing the components of the embodiments according to the present invention, designations such as first, second, i), ii), a), and b) may be used. These designations serve only to distinguish one component from another and do not limit the nature, sequence, or order of the components. In the specification, when a part is said to 'include' or 'comprise' a component, this means that it may further include other components rather than excluding them, unless explicitly stated otherwise.

FIG. 1 is a flowchart showing each process performed in a traffic light detection method and apparatus according to an embodiment of the present invention, and FIG. 2 is a conceptual diagram illustrating each of those processes.

Hereinafter, each process performed in the traffic light detection method and apparatus according to an embodiment of the present invention is outlined with reference to FIGS. 1 and 2.

Referring to FIG. 1, the traffic light detection method according to an embodiment of the present invention includes an image conversion step (S1), a candidate region selection step (S2), and a traffic light state determination step (S3).
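The three steps can be read as a simple pipeline. The following is a minimal sketch only: the function names and type aliases are hypothetical, and the three learning models are passed in as plain callables that stand in for the trained networks, whose internals the text has not yet described.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type aliases, introduced only for this illustration.
Image = List[List[Tuple[int, int, int]]]   # rows of RGB pixels
LabelMap = List[List[int]]                 # one channel index per pixel
Box = Tuple[int, int, int, int]            # (x, y, w, h) in input-image pixels


def crop(image: Image, box: Box) -> Image:
    """Cut out the partial image of the input image covered by a candidate region."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def detect_traffic_lights(
    image: Image,
    convert: Callable[[Image], LabelMap],         # stands in for the first learning model (S1)
    select: Callable[[LabelMap], Sequence[Box]],  # stands in for the second learning model (S2)
    judge: Callable[[Image], dict],               # stands in for the third learning model (S3)
) -> List[dict]:
    low_channel = convert(image)      # S1: image conversion
    candidates = select(low_channel)  # S2: candidate region selection
    # S3: the state is judged on partial images of the ORIGINAL input image.
    return [judge(crop(image, box)) for box in candidates]
```

Note that step S3 crops the original input image rather than the low-channel converted image, which matches the statement that the state is determined from partial images of the input image.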

In the image conversion step (S1), the input image 11 is converted into a low-channel converted image 12 having a plurality of channels.

Here, the input image 11 is an image of the outside of the vehicle, in particular an image captured by a camera installed to photograph the area ahead of the vehicle. The size, installation position, and type of the camera are not limited; for example, the image may be one obtained from a dashboard (black box) camera while the vehicle is driving.

By way of example, the input image 11 may be captured in real time while the vehicle is driving.

Meanwhile, in a typical captured image, for example one with Full HD quality, a traffic light color sphere (the lamp of a traffic light that lights up to show the signal) can be validly recognized when it occupies roughly 10 pixels or more.

It was confirmed that the distance between the traffic light and the vehicle in this case is about 103 meters. Thus, by way of example, the traffic light, specifically its color sphere, may become recognizable from the moment the distance between the traffic light and the vehicle falls within about 103 meters.

However, this is merely an example, and the distance at which the vehicle can recognize a traffic light may vary with the image quality of the captured image.
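The dependence of the recognition distance on image quality can be illustrated with a pinhole camera model, in which apparent size in pixels is proportional to real size divided by distance. This is a rough sketch under stated assumptions: the source gives no camera parameters, so the focal length (in pixels) and the lamp diameter below are hypothetical values chosen only so that a 10-pixel lamp lands near the quoted distance.

```python
def lamp_diameter_px(focal_px: float, lamp_diameter_m: float, distance_m: float) -> float:
    """Pinhole projection: size in pixels = focal length (px) * real size (m) / distance (m)."""
    return focal_px * lamp_diameter_m / distance_m


def max_recognition_distance_m(focal_px: float, lamp_diameter_m: float,
                               min_px: float = 10.0) -> float:
    """Largest distance at which the lamp still spans at least min_px pixels."""
    return focal_px * lamp_diameter_m / min_px


# Assumed, illustrative values (NOT taken from the source).
FOCAL_PX = 3433.0        # hypothetical focal length expressed in pixels
LAMP_DIAMETER_M = 0.3    # hypothetical lamp diameter

distance = max_recognition_distance_m(FOCAL_PX, LAMP_DIAMETER_M)
print(f"lamp spans 10 px at about {distance:.1f} m")
```

Under this model, doubling the focal length in pixels (for example by doubling the sensor resolution at the same field of view) doubles the distance at which the lamp still covers 10 pixels.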

Meanwhile, the input image 11 contains not only information about traffic lights but also information about other vehicles on the road and the surrounding environment. It is therefore important to separate out, from the input image 11, only the information about the traffic lights themselves (more precisely, the traffic light color spheres), excluding the information about other vehicles and the surrounding environment.

To separate out this information, one could consider, unlike the embodiment of the present invention, a method that uses no learning model and instead judges each element of the input image 11 to match a color sphere color when the element's color lies within a specific threshold of that color.
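For comparison, the threshold-based approach described here (the one the embodiment does not use) might look like the following sketch. The reference colors, the Euclidean RGB distance, and the threshold value are all assumptions made for illustration.

```python
from typing import Tuple

RGB = Tuple[int, int, int]

# Assumed reference colors and threshold; illustrative values only.
REFERENCE_COLORS = {"red": (255, 0, 0), "green": (0, 255, 0)}
THRESHOLD = 120.0


def classify_by_threshold(pixel: RGB) -> str:
    """Return the nearest signal color if it lies within THRESHOLD, else 'background'."""
    best_name, best_dist = "background", THRESHOLD
    for name, ref in REFERENCE_COLORS.items():
        # Euclidean distance in RGB space between the pixel and the reference color.
        dist = sum((p - r) ** 2 for p, r in zip(pixel, ref)) ** 0.5
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name
```

With these values a bright red pixel such as `(250, 20, 10)` is labeled red, while a dimly lit red pixel such as `(120, 0, 0)` already falls outside the threshold and is labeled background, which hints at why a fixed threshold is fragile under changing illumination.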

However, classifying traffic light colors in this way is sensitive to various environmental changes (illuminance, driving environment), and the detection performance can degrade significantly. On the other hand, methods based on artificial-intelligence learning models take the whole image as input and must extract diverse features with a base network, so the number of model parameters in the base network becomes quite large. In this case roughly 7 million or more parameters are needed, and the amount of computation required to recognize the traffic light state becomes enormous.

This increases the complexity of the network used for the computation and also increases the computation time for recognizing the traffic light state, which makes it unsuitable for determining the state of a traffic light within a short time, in real time, while driving on the road.

Accordingly, for recognizing traffic lights, the present invention proposes a method that does not classify colors in the input image 11 against a specific threshold but instead uses supervised learning, and that reduces the number of parameters used for computation.

That is, in an embodiment of the present invention, the image conversion step (S1) inputs the input image 11 to a first learning model 10 composed of a neural network and converts it into a low-channel converted image 12.

Here, the first learning model 10 is composed of a neural network trained by deep learning, as described in detail later.

The input image 11 is converted into the low-channel converted image 12 by the first learning model 10, which has already been trained on a large amount of training data. This approach has the advantage of reducing the number of traffic light candidates, and because it uses a learning model with few parameters, it can reduce the amount of computation considerably.

By using this approach, an embodiment of the present invention can limit the number of parameters used for computation in the image conversion step (S1) to, by way of example, one hundred or fewer.
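To give a sense of scale for a budget of about one hundred parameters, the sketch below counts the learnable parameters of a single convolution layer. The layer shapes are hypothetical: the source does not disclose the architecture of the first learning model at this point, so this is only arithmetic showing that such a budget is compatible with a very small convolutional layer.

```python
def conv2d_param_count(in_channels: int, out_channels: int,
                       kernel_size: int, bias: bool = True) -> int:
    """Number of learnable parameters in one 2-D convolution layer."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# Hypothetical layers mapping an RGB input to three output channels.
print(conv2d_param_count(3, 3, kernel_size=3))  # 3x3 kernel: 81 weights + 3 biases
print(conv2d_param_count(3, 3, kernel_size=1))  # 1x1 kernel: 9 weights + 3 biases
```

Either of these hypothetical layers stays under one hundred parameters, several orders of magnitude below the roughly 7 million parameters cited for a general-purpose base network.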

Meanwhile, the low-channel converted image 12 is the image obtained when the acquired input image 11 passes through the image conversion step (S1) and is converted so as to consist of only a limited number of channels. Specifically, the low-channel converted image 12 is an image made up of pixels that take only a limited set of colors.

This follows from the fact that the colors of a traffic light are very limited, typically red or green. Because the signal colors of a traffic light are so restricted, the image conversion step (S1) removes all information about colors that do not correspond to a signal color, so that unnecessary computation is not performed in later steps.

The plurality of channels constituting the low-channel converted image 12 may be channels that each correspond to one signal color of the traffic light, as described in more detail later.

By way of example, the channels constituting the low-channel converted image 12 may consist of a first channel 13 corresponding to the first signal color of the traffic light, a second channel 14 corresponding to the second signal color, and a third channel 15 corresponding to other colors.

Here, by way of example, the first signal color of the traffic light may be designated as red and the second signal color as green, and colors other than red and green may be designated as other colors.
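One way to picture the three-channel result is as a per-pixel choice among the first channel, the second channel, and the third channel. The sketch below assumes that the first learning model emits one score per channel for every pixel and that the highest score wins; both the score format and the argmax rule are assumptions for illustration, not details stated in the source.

```python
from typing import List, Sequence

# Channel order assumed for this illustration.
CHANNEL_NAMES = ("first signal color (red)", "second signal color (green)", "other colors")


def to_channel_indices(scores: Sequence[Sequence[Sequence[float]]]) -> List[List[int]]:
    """For every pixel, pick the index of the channel with the highest score.

    scores[y][x] holds one score per channel, in CHANNEL_NAMES order.
    """
    return [
        [max(range(len(px)), key=px.__getitem__) for px in row]
        for row in scores
    ]


# A 1x2 example: the left pixel looks red, the right pixel looks like background.
example = [[[0.90, 0.05, 0.05], [0.10, 0.20, 0.70]]]
labels = to_channel_indices(example)
print([[CHANNEL_NAMES[i] for i in row] for row in labels])
```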

In the candidate region selection step (S2), a plurality of color groups having colors that correspond to the signal colors of a traffic light are detected in the low-channel converted image 12, and traffic light candidate regions (17; see FIG. 9) are selected based on those color groups.

Here, a color group is a cluster of adjacent pixels that have similar colors. For example, a region of the input image 11 in which a traffic light is lit red may appear in the low-channel converted image 12 as a color group formed of red pixels arranged next to one another in a roughly circular shape.
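A color group in this sense can be extracted with an ordinary connected-component search. The source does not say how the groups are found, so the flood fill with 4-connectivity below is an assumed technique, and the label-map representation (one channel index per pixel) is hypothetical.

```python
from collections import deque
from typing import List, Tuple


def find_color_groups(label_map: List[List[int]], channel: int) -> List[List[Tuple[int, int]]]:
    """Group 4-connected pixels whose label equals `channel`; each group is a list of (x, y)."""
    height = len(label_map)
    width = len(label_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    groups: List[List[Tuple[int, int]]] = []
    for y in range(height):
        for x in range(width):
            if label_map[y][x] != channel or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited pixel of this channel.
            seen[y][x] = True
            queue, group = deque([(x, y)]), []
            while queue:
                cx, cy = queue.popleft()
                group.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and not seen[ny][nx] and label_map[ny][nx] == channel:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            groups.append(group)
    return groups
```

Each returned group could then be summarized by its bounding box to obtain the region of the input image that the group corresponds to.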

Here, a traffic light candidate region 17 refers to a region of the input image 11 corresponding to a color group, among the plurality of color groups, whose probability of corresponding to an actual traffic light is at or above a certain level.

In the candidate region selection step (S2), the traffic light candidate regions 17 are selected using a second learning model 20, as described in detail later. Like the first learning model 10, the second learning model 20 may be a learning model composed of a neural network trained in advance by deep learning.

In the traffic light state determination step (S3), the state of the traffic light corresponding to each traffic light candidate region 17 selected in the candidate region selection step (S2) is determined. The state determined here may include the position of the traffic light and its color or direction indication.

Specifically, the traffic light state determination step (S3) may output traffic light state result data 18, which is analysis data on the state of the traffic light. The traffic light state result data 18 may include a detection value for whether an actual traffic light exists (o), detection values for the state of the traffic light (s₁, s₂, s₃, s₄), and detection values for the position and size of the traffic light (x_c, y_c, w, h).

Here, the detection values for the traffic light state may be values relating to the color or shape of the signal lit on the traffic light, and the detection values for position and size may be the coordinates of the traffic light's position (x_c, y_c) and the width and height of the traffic light candidate region (w, h).
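The result data can be pictured as a small record. The sketch below is illustrative: the class and method names are hypothetical, and it assumes that (x_c, y_c) denotes the center of the region and that the largest of s₁ to s₄ marks the detected state, neither of which the source states explicitly.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TrafficLightResult:
    o: float                              # detection value: does an actual traffic light exist?
    s: Tuple[float, float, float, float]  # detection values s1..s4 for the traffic light state
    x_c: float                            # position (assumed here to be the region center)
    y_c: float
    w: float                              # width of the candidate region
    h: float                              # height of the candidate region

    def box_corners(self) -> Tuple[float, float, float, float]:
        """Return (left, top, right, bottom), assuming (x_c, y_c) is the center."""
        half_w, half_h = self.w / 2, self.h / 2
        return (self.x_c - half_w, self.y_c - half_h,
                self.x_c + half_w, self.y_c + half_h)

    def most_likely_state(self) -> int:
        """Index (0..3) of the largest state value, assuming larger means more likely."""
        return max(range(len(self.s)), key=self.s.__getitem__)


result = TrafficLightResult(o=0.9, s=(0.1, 0.7, 0.1, 0.1), x_c=100.0, y_c=50.0, w=20.0, h=40.0)
print(result.box_corners(), result.most_likely_state())
```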

In the traffic light state determination step (S3), the state of the traffic light is determined using a third learning model 30, as described in detail later. Like the first learning model 10 and the second learning model 20, the third learning model 30 may be a learning model composed of a neural network trained in advance by deep learning.

FIG. 3 is a flowchart showing the detailed process of the image conversion step according to an embodiment of the present invention.

FIG. 4 shows an exemplary input image according to an embodiment of the present invention, and FIG. 5 shows an exemplary low-channel converted image according to an embodiment of the present invention.

Hereinafter, the detailed process of the image conversion step (S1) in an embodiment of the present invention is described with reference to FIGS. 3 to 5.

First, an input image 11 capturing a traffic light and its surroundings while the vehicle is driving, as shown by way of example in FIG. 4, is acquired. By way of example, the input image 11 may be captured in real time while the vehicle is driving.

After the input image 11 is acquired, it is input to the first learning model 10. The first learning model 10 is a learning model composed of a CNN (Convolutional Neural Network) that has been trained in advance to convert the input image 11 into the low-channel converted image 12 while the vehicle is driving. The CNN constituting the first learning model 10 is referred to as the first CNN.

The input image 11 input to the first learning model 10 is converted into a low-channel converted image 12 composed of a plurality of channels. The low-channel converted image 12 has a limited number of channels, and each channel corresponds to one of the signal colors of the traffic light.

For example, since the most common traffic lights are configured to show a red or green signal, the low-channel converted image 12 may consist of a first channel 13 corresponding to red image elements, a second channel 14 corresponding to green image elements, and a third channel 15 corresponding to background-color image elements, that is, colors other than red and green.

Also by way of example, if the first signal color of the traffic light is red and the second signal color is green, the low-channel converted image 12 may contain first color groups, in which the first signal color is clustered in a certain area, and second color groups, in which the second signal color is clustered in a certain area.

Meanwhile, the low-channel converted image 12 may be a four-channel image rather than the three-channel image described above.

In this case, the channels constituting the low-channel converted image 12 may include, in addition to the channels corresponding to red, green, and background-color image elements, a channel corresponding to yellow image elements. This configuration reflects the fact that traffic lights on actual roads show yellow signals as well as red and green ones.

As described above, converting the input image 11 into the low-channel converted image 12 through the first learning model 10, which is composed of the first CNN trained in advance by supervised learning, requires very little computation, so this part of the traffic light state detection process can be carried out very quickly.

FIG. 6 is a flowchart showing the training process of the first learning model in an embodiment of the present invention. Hereinafter, the training process of the first learning model 10 in an embodiment of the present invention is described with reference to FIG. 6.

The first learning model 10 is composed of the first CNN and may be trained by supervised learning.

First, a plurality of images is collected. By way of example, these may be a large number of images like the input image 11 of an embodiment of the present invention, capturing traffic lights on roads in various places together with their surroundings.

In addition, data marking the positions of the traffic lights in the collected images is accumulated.

By way of example, this traffic light information is position information for the traffic lights that a person has marked directly while referring to the images.

The set of data marking the traffic light information in the collected images is collectively referred to as the first training data.

또한, 저채널 변환영상(12)을 확보함으로써 저채널 변환영상(12) 상에 표현된 복수의 신호색 영역 및 배경색 영역을 확보한다. 이때 저채널 변환영상(12)은 예를 들어 본 발명의 일 실시예에서의 영상변환단계(S1)를 수행하여 획득한 저채널 변환영상(12)일 수 있다.In addition, by securing the low-channel converted image 12, a plurality of signal color regions and background color regions expressed on the low-channel converted image 12 are secured. In this case, the low-channel converted image 12 may be, for example, the low-channel converted image 12 obtained by performing the image conversion step S1 in the embodiment of the present invention.

The first learning model 10 is then trained based on the signal color regions and background color region expressed in the obtained low-channel converted image 12 and on the first training data. That is, the low-channel converted image 12 generated by the first learning model 10 is compared with the first training data, which serves as a kind of correct answer, and the first learning model 10 is trained by evaluating against the first training data whether a traffic light is present in the low-channel converted image 12.

Error optimization may be performed during the training of the first learning model 10, so that the region corresponding to the traffic light position is reflected faithfully and without error in the low-channel converted image 12 produced in the image conversion step S1.

For example, when the low-channel converted image 12 output during training of the first learning model 10 is compared with the first training data, there may be a plurality of errors where the low-channel converted image 12 does not correspond to the first training data.

In that case, for example, error optimization that applies a weight only to the error values associated with the region corresponding to the traffic light region in the low-channel converted image 12 can reduce the likelihood that the first learning model 10 misses that region.

Specifically, the weight applied only to the error values associated with the region corresponding to the traffic light region may be set to 1000 times or more, because that region occupies a very small proportion of the whole image. Applying a weight of 1000 times or more in this way allows the first learning model 10 to be trained effectively.
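The weighting described above can be sketched as follows. This is only an illustrative sketch: the squared-error form, the averaging over all pixels, and the names `weighted_pixel_loss`, `light_mask`, and `light_weight` are assumptions made for the example, since the description specifies only that a weight of 1000 times or more is applied to errors in the traffic light region.

```python
def weighted_pixel_loss(predicted, target, light_mask, light_weight=1000.0):
    """Average squared error in which traffic-light pixels are up-weighted.

    predicted, target: 2D lists of per-pixel values (hypothetical encoding).
    light_mask: 2D list, truthy where the labeled traffic light region lies.
    light_weight: weight for errors inside that region (assumed default 1000).
    """
    total = 0.0
    count = 0
    for p_row, t_row, m_row in zip(predicted, target, light_mask):
        for p, t, m in zip(p_row, t_row, m_row):
            # Only errors inside the traffic light region receive the weight.
            weight = light_weight if m else 1.0
            total += weight * (p - t) ** 2
            count += 1
    return total / count


predicted = [[0, 0], [0, 0]]
target = [[1, 0], [0, 1]]
light_mask = [[1, 0], [0, 0]]
loss = weighted_pixel_loss(predicted, target, light_mask)
```

In this toy case a single missed traffic-light pixel contributes 1000 times as much to the loss as a single wrong background pixel, which is the effect the weighting is meant to achieve when the traffic light occupies very few pixels.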

Training may also proceed with a suitably formed loss function so as to raise the recognition rate of traffic light color spheres.

FIG. 7 is a flowchart showing the detailed process of the candidate region selection step S2 according to an embodiment of the present invention.

FIG. 8 shows color group correspondence regions, corresponding to a plurality of color groups, marked on the input image in an embodiment of the present invention, and FIG. 9 shows traffic light candidate regions marked on the input image in an embodiment of the present invention.

Here, a color group correspondence region 16 is a region of the input image 11 that corresponds to a color group formed in the low-channel converted image 12, and a traffic light candidate region 17 is a color group correspondence region 16 that is determined to correspond to a traffic light position in the input image 11.

For convenience, FIG. 8 labels only some of the color group correspondence regions 16 with the reference numeral, but note that every separately marked part of the input image 11 is a color group correspondence region 16.

The candidate region selection step S2 in an embodiment of the present invention is described in detail below with reference to FIGS. 7 to 9.

First, information on the low-channel converted image 12 produced in the image conversion step S1 is input to the second learning model 20.

The information input to the second learning model 20 is, for example, geometric information such as the positions and sizes of the plurality of color groups in the low-channel converted image 12. The information may also be input classified by channel or by color group of the low-channel converted image 12.

As described above, the plurality of color groups may consist of, for example, a first color group composed of a first signal color that corresponds to the color of a first signal of the traffic light, and a second color group composed of a second signal color that corresponds to the color of a second signal of the traffic light.

In this case, for example, the first color group may be a color group whose color corresponds to red, and the second color group may be a color group whose color corresponds to green.

The description assumes a traffic light consisting only of a red first signal and a green second signal. Note, however, that where the traffic light also includes a yellow third signal, the plurality of color groups may include, in addition to the first and second color groups, a further color group corresponding to the third signal.

In the candidate region selection step S2, geometric information on the plurality of color groups may first be extracted for input to the second learning model 20. Since every color group differs from the others in at least one of position or size, the position and size information can serve to distinguish the color groups from one another.

The position of each color group may be based, for example, on the center point of the color group. Alternatively, it may be determined from a rectangle obtained by taking the x-coordinates of the leftmost and rightmost pixels of the color group and the y-coordinates of its topmost and bottommost pixels, and passing the rectangle through those coordinates.

The position information of the color groups may also be defined in ways other than these examples.

The center point of a color group may be extracted, for example, as coordinates on a coordinate plane defined on the low-channel converted image 12. The size information of each color group may be, for example, its horizontal and vertical lengths.
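The bounding-rectangle approach to position and size can be sketched as below. The function name `color_group_geometry`, the `(x, y)` pixel-tuple representation, and the choice of the rectangle's midpoint as the center are assumptions made for illustration; the description leaves the exact definition open.

```python
def color_group_geometry(pixels):
    """Derive center, width, and height of one color group.

    pixels: list of (x, y) coordinates belonging to the color group
            (a hypothetical representation chosen for this sketch).
    """
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    # Leftmost/rightmost x and topmost/bottommost y define the rectangle.
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)
    return {
        "box": (left, top, right, bottom),
        # Center taken as the midpoint of the rectangle (an assumption).
        "center": ((left + right) / 2.0, (top + bottom) / 2.0),
        "width": right - left + 1,
        "height": bottom - top + 1,
    }


geometry = color_group_geometry([(2, 3), (3, 3), (2, 4), (3, 4), (4, 4)])
```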

This geometric information may be extracted separately before the information on the color groups is input to the second learning model 20. Alternatively, the low-channel converted image 12 may be input directly to the second learning model 20, which then extracts the geometric information of each color group from it.

The extracted geometric information of each color group is input to the second learning model 20. Before being input, it may be normalized, which reduces the likelihood of errors caused by variations in screen size.
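One simple way to normalize the geometric information is to divide by the image dimensions so that all values fall in the range 0 to 1 regardless of resolution. The description does not specify the normalization used, so the scheme and the name `normalize_geometry` below are assumptions for illustration.

```python
def normalize_geometry(center, width, height, image_width, image_height):
    """Scale a color group's geometry by the image size (assumed scheme)."""
    cx, cy = center
    return (
        cx / image_width,       # normalized center x
        cy / image_height,      # normalized center y
        width / image_width,    # normalized horizontal length
        height / image_height,  # normalized vertical length
    )


normalized = normalize_geometry((320, 120), 32, 24, 640, 480)
```

With this scheme, the same traffic light captured at 640x480 and at 1280x960 yields the same input values for the second learning model 20.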

Based on what it has learned in prior training, the second learning model 20 determines whether each color group corresponds to an actual traffic light color sphere. To do so, it calculates, for each color group, the probability that the group corresponds to an actual traffic light color sphere.

This correspondence probability may be calculated, for example, from whether the position and size of each color group in the low-channel converted image 12 match the position and size of an actual traffic light color sphere.

On a road, traffic lights are generally mounted at a certain height above the roadway, so their position varies little. Their form is also generally three or four color spheres arranged in a rectangular frame, and their size is largely standardized and varies little.

Because the position, form, and size of traffic lights on real roads are constrained in this way, this determination process is particularly useful and accurate.

Through this process, a traffic light correspondence probability, that is, the probability that a color group corresponds to a traffic light color sphere, is calculated for each color group, and a color group whose traffic light correspondence probability is below a certain probability may be treated as meaningless data.

That is, a color group whose traffic light correspondence probability is below that probability can be confidently regarded as not being a traffic light, so it is not used as data for determining the traffic light state in the traffic light state determination step S3.

The reference probability used as the criterion may be set freely to a value that gives adequate reliability. A color group whose traffic light correspondence probability is higher than the reference probability may be provisionally determined to be a color group corresponding to a traffic light color sphere.

In addition, the region of the input image 11 corresponding to the region of the low-channel converted image 12 that contains a color group determined to represent a traffic light color sphere may be designated as a traffic light candidate region 17. A traffic light candidate region 17 is a region of the input image 11 determined to correspond to a traffic light.
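The selection by reference probability can be sketched as a simple filter. The dictionary keys `box` and `probability`, the function name, and the default reference probability of 0.5 are hypothetical; the description states only that the reference probability may be set freely. The sketch also assumes the low-channel converted image and the input image share the same coordinates, so a box can be reused directly.

```python
def select_candidate_regions(color_groups, reference_probability=0.5):
    """Keep the boxes of color groups likely to be traffic light color spheres.

    color_groups: list of dicts with a 'box' (left, top, right, bottom) and a
                  'probability' produced by the second learning model
                  (hypothetical structure for this sketch).
    """
    return [
        group["box"]
        for group in color_groups
        # Groups at or below the reference probability are discarded.
        if group["probability"] > reference_probability
    ]


groups = [
    {"box": (10, 5, 14, 9), "probability": 0.92},     # likely a signal lamp
    {"box": (200, 150, 230, 170), "probability": 0.08},  # e.g. a tail light
    {"box": (18, 5, 22, 9), "probability": 0.50},     # not above the reference
]
candidates = select_candidate_regions(groups)
```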

FIG. 10 is a flowchart showing the training process of the second learning model 20 in an embodiment of the present invention. The training process is described in detail below with reference to FIG. 10.

Like the first learning model 10, the second learning model 20 may be trained by supervised learning, and it may likewise be composed of a CNN. To distinguish it from the CNN of the first learning model 10, the CNN of the second learning model 20 is referred to as the second CNN.

To train the second learning model 20, a plurality of low-channel converted images 12, each containing one or more color groups, is first obtained. As described above, a color group is a region of a low-channel converted image 12, produced through the image conversion step S1, in which pixels of the same color form a cluster.
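A color group in this sense can be found with ordinary connected-component labeling. The sketch below is an assumption-laden illustration: the description does not state how clusters are formed, so 4-connectivity, the integer color codes, and the name `find_color_groups` are all hypothetical choices.

```python
from collections import deque


def find_color_groups(label_image, color):
    """Return clusters of 4-connected pixels that share the given color.

    label_image: 2D list of color codes (hypothetical encoding, e.g.
                 0 = background, 1 = first signal color, 2 = second).
    Each cluster is a list of (x, y) pixel coordinates.
    """
    height = len(label_image)
    width = len(label_image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    groups = []
    for y in range(height):
        for x in range(width):
            if label_image[y][x] != color or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            group = []
            while queue:
                cx, cy = queue.popleft()
                group.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and label_image[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            groups.append(group)
    return groups


image = [
    [1, 1, 0, 2],
    [0, 0, 0, 2],
    [1, 0, 0, 0],
]
first_groups = find_color_groups(image, 1)
second_groups = find_color_groups(image, 2)
```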

Next, training data marking whether each of the color groups corresponds to an actual traffic light color sphere is created and collected.

The marking may be, for example, information produced by a person directly selecting and marking, among the color groups in the low-channel converted image 12, those that correspond to an actual traffic light color sphere. The set of training data obtained in this way is referred to as the second training data.

In addition, a plurality of probability data output in the candidate region selection step S2 is obtained.

As described above, the probability data is the set of information on the probability that each color group corresponds to a traffic light color sphere, calculated in the candidate region selection step S2 from the geometric information of the color groups in the low-channel converted image 12.

Supervised learning is then performed based on the probability data and the second training data. It may be carried out by checking whether each item of probability information in the probability data agrees or disagrees with the second training data.

Error optimization may be achieved through this supervised learning; that is, the likelihood of error may decrease as the number of training iterations increases. Consequently, the more training the second learning model 20 has undergone, the more accurately the traffic light candidate regions 17 selected in the candidate region selection step S2 can correspond to actual traffic light positions.

FIG. 11 is a flowchart showing the detailed process of the traffic light state determination step S3 according to an embodiment of the present invention, and FIG. 12 illustrates the traffic light state of a traffic light candidate region 17 being detected in the traffic light state determination step S3.

The traffic light state determination step S3 according to an embodiment of the present invention is described in detail below with reference to FIGS. 11 and 12.

First, the traffic light candidate region 17 selected in the candidate region selection step S2 is input to the third learning model 30.

The third learning model 30 analyzes the information on the input traffic light candidate region 17 to determine the traffic light state. The traffic light state determined here means the signal state and the position in the image, from which the color state of the traffic light color sphere can be determined, that is, whether the signal shown on the color sphere is red or green.

The traffic light state determined here may also include recognition of a sign shown on the traffic light color sphere. The sign may be a directional indication for the vehicle, such as a left turn or a right turn.

More specifically, the value output first in the traffic light state determination step S3 may be the probability that each traffic light state is present. That is, the step may first calculate the probability that the input traffic light candidate region 17 is in a particular signal state and output that probability.

If the output probability is higher than a preset reference probability, the signal shown in the traffic light candidate region 17 may be determined to be a signal showing the particular color or sign.
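The comparison against the preset reference probability might look like the following sketch. The state names, the dictionary layout, the default reference probability of 0.5, and the function name `decide_states` are hypothetical; the description does not prescribe an output format for the third learning model 30.

```python
def decide_states(state_probabilities, reference_probability=0.5):
    """Return the signal states judged to be shown in one candidate region.

    state_probabilities: dict mapping a state name (color or direction sign)
                         to the probability output for that state.
    """
    return sorted(
        state
        for state, probability in state_probabilities.items()
        # A state is accepted only when it exceeds the reference probability.
        if probability > reference_probability
    )


outputs = {"red": 0.05, "green": 0.90, "left_turn": 0.70}
states = decide_states(outputs)
```

Treating each state independently, rather than picking a single most probable class, allows a color and a direction sign to be reported together for the same candidate region.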

The output of the third learning model 30 may further include a second traffic light presence probability. Calculating the probability that a traffic light is present again in the traffic light state determination step S3 makes it possible to check once more, following the candidate region selection step S2, whether the traffic light candidate region 17 actually corresponds to a traffic light.

Among the plurality of traffic light candidate regions 17, one found not to correspond to an actual traffic light may then be excluded when determining the traffic light state.

Through this process, traffic light state result data 18 can be derived as shown in FIG. 12. In FIG. 12, the traffic light state result data 18 represents the determination, from analysis of the detected traffic light candidate region 17, that a green signal is lit on the color sphere of the actual traffic light.

FIG. 13 is a flowchart showing the training process of the third learning model 30 in the traffic light detection method according to an embodiment of the present invention. The training process is described in detail below with reference to FIG. 13.

Like the first learning model 10 and the second learning model 20, the third learning model 30 may be trained by supervised learning.

To train the third learning model 30, data containing a plurality of traffic light candidate regions 17 is first obtained. The data may be data on input images 11 containing the traffic light candidate regions 17.

These traffic light candidate regions 17 may be ones selected in the candidate region selection step S2 of an embodiment of the present invention based on geometric information from a plurality of low-channel converted images 12.

Next, based on these traffic light candidate regions 17, training data marking the position and state of each traffic light candidate region 17 is accumulated.

The position and state may be marked directly by a person, but the data may be accumulated in any other way that yields data able to serve as the correct answers for the third learning model 30.

The set of training data obtained in this way, marking the positions and states of the traffic light candidate regions 17, is referred to as the third training data.

In addition, a plurality of traffic light state information is obtained. This is the set of information on traffic light states output by performing the traffic light state determination step S3 of an embodiment of the present invention, and it may be understood to include the traffic light state result data described above.

The third learning model 30 is then trained based on the obtained traffic light state information and the third training data. Specifically, it may be trained by repeatedly judging, against the third training data, whether the traffic light state information is correct.

As the third learning model 30 repeats training, it becomes progressively less likely to make a wrong determination in the traffic light state determination step S3, so error optimization may proceed through repeated training.

The above description merely illustrates the technical idea of the present embodiment, and a person of ordinary skill in the art to which the embodiment belongs can make various modifications and variations without departing from its essential characteristics. The embodiments are therefore intended to explain, not to limit, the technical idea of the present embodiment, and the scope of that technical idea is not limited by them. The scope of protection of the present embodiment should be construed according to the claims below, and all technical ideas within an equivalent scope should be construed as falling within the scope of rights of the present embodiment.

10: first learning model
20: second learning model
30: third learning model
11: input image
12: low-channel converted image
13: first channel
14: second channel
15: third channel
16: color group correspondence region
17: traffic light candidate region
18: traffic light state result data

Claims

A traffic light detection method comprising:
an image conversion step in which a first learning model converts an input image capturing a traffic light and its surrounding environment into a low-channel converted image composed of at least a first signal color, a second signal color, and a background color;
a candidate region selection step in which a second learning model selects one or more traffic light candidate regions based on a plurality of first color groups having the first signal color and a plurality of second color groups having the second signal color in the low-channel converted image; and
a traffic light state determination step in which a third learning model determines the state of the traffic light based on partial images of the input image corresponding to the traffic light candidate regions.

The method of claim 1,
wherein, in the candidate region selection step, the traffic light candidate regions are selected based on a probability that the first color groups and the second color groups correspond to positions of traffic light color spheres on the traffic light.

The method of claim 1,
The traffic light detection method, characterized in that the low-channel converted image further includes a third signal color.

The method of claim 1,
wherein the first learning model comprises a first CNN that performs supervised learning based on first training data, in which information on the traffic light is accumulated based on a plurality of images, and on a plurality of the low-channel converted images.

The method of claim 4,
wherein the first learning model outputs a first error value, which is the difference between the information on the traffic light region in the first training data and the information on the region corresponding to the traffic light region in the plurality of low-channel converted images, and performs error optimization in which a weight is applied only to the first error value among the error values.

The method of claim 4,
The traffic light detection method, characterized in that the information on the traffic light is position information of the traffic light displayed by a person.

The method of claim 1,
wherein the second learning model comprises a second CNN that performs supervised learning based on second training data created based on the plurality of color groups and on a probability that the plurality of color groups match the traffic light color spheres of the traffic light.

The method of claim 7,
wherein the second training data is training data marking whether the plurality of color groups match the color spheres of the traffic light.

The method of claim 1,
wherein the third learning model performs supervised learning based on third training data created based on the traffic light candidate regions and on the determined state information of the traffic light.

The method of claim 9,
wherein the third training data is training data in which a person has marked the position information and state of the traffic light based on the traffic light candidate regions.

The method of claim 10,
The traffic light detection method, characterized in that the location information of the traffic light is a center, a horizontal length, and a vertical length of the traffic light.

The method of claim 10,
The traffic light detection method, characterized in that the status information of the traffic light comprises a color or direction indication of a color sphere of the traffic light.

A traffic light detection device that performs traffic light detection by the traffic light detection method according to claim 1.