KR101908481B1

KR101908481B1 - Device and method for pedestrian detection

Info

Publication number: KR101908481B1
Application number: KR1020170093740A
Authority: KR
Inventors: 박강령; 강진규
Original assignee: 동국대학교 산학협력단
Priority date: 2017-07-24
Filing date: 2017-07-24
Publication date: 2018-12-10

Abstract

The present invention relates to a pedestrian detecting device and a method thereof and, more specifically, to a pedestrian detecting device and a method thereof which adaptively select an image advantageous in detecting a pedestrian in an image obtained by a visible ray camera and an infrared ray camera so as to increase pedestrian detection accuracy. The present invention analyzes features of a visible ray image and an infrared ray image from a real-time input image based on a fuzzy logic so as to select an image advantageous in detection and applies a convolutional neural network to the selected image so as to increase pedestrian detection performance.

Description

Pedestrian detection apparatus and method

The present invention relates to a pedestrian detection apparatus and method and, more particularly, to a pedestrian detection apparatus and method that improve pedestrian detection accuracy by adaptively selecting, from images acquired by a visible light camera and an infrared camera, the image more favorable for pedestrian detection.

Detecting pedestrians in images is used for various purposes such as security and surveillance. However, in a visible light image acquired by a visible light camera, it is difficult to detect pedestrians accurately in constrained environments such as heavily shadowed areas, on dark nights, or under non-uniform illumination. Likewise, in an infrared image acquired by an infrared camera, it is difficult to distinguish a pedestrian when the temperature of the pedestrian and that of the surroundings become similar.

The background art of the present invention is disclosed in Korean Patent Application Publication No. 2010-0010734 (published February 2, 2010, "Platform monitoring system using a stereo camera and a thermal imaging camera and method thereof").

The present invention is intended to provide a pedestrian detection apparatus and method that improve pedestrian detection accuracy by analyzing the features of a visible light image and an infrared image based on fuzzy logic, selecting the image favorable for pedestrian detection according to the analysis result, and verifying the pedestrian detection result with a convolutional neural network (CNN) model.

According to one aspect of the present invention, a pedestrian detection apparatus is provided.

A pedestrian detection apparatus according to an embodiment of the present invention may include: an image input unit that receives a visible light image and an infrared image; an image analysis unit that detects candidate regions in the input visible light image and infrared image and extracts feature values and weights from the detected candidate regions; an image selection unit that selects the candidate region favorable for pedestrian detection using the feature values and weights of the candidate regions, based on two preset fuzzy logics; and a pedestrian detection unit that applies a pre-trained convolutional neural network model to the selected candidate region to verify whether the selected image corresponds to a pedestrian, and detects the pedestrian in the visible light image and the infrared image according to the verification result.

According to another aspect of the present invention, a pedestrian detection method and a computer program for executing the method are provided.

A pedestrian detection method according to an embodiment of the present invention, and a computer program for executing it, may include: receiving a visible light image and an infrared image; detecting candidate regions in the input visible light image and infrared image and extracting feature values and weights from the detected candidate regions; selecting the candidate region favorable for pedestrian detection using the feature values and weights of the candidate regions, based on two preset fuzzy logics; and applying a pre-trained convolutional neural network model to the selected candidate region to verify whether the selected image corresponds to a pedestrian, and detecting the pedestrian in the visible light image and the infrared image according to the verification result.

A pedestrian detection apparatus according to an embodiment of the present invention can improve pedestrian detection accuracy by analyzing the features of a visible light image and an infrared image based on fuzzy logic, selecting the image favorable for pedestrian detection according to the analysis result, and verifying the pedestrian detection result with a convolutional neural network model.

FIGS. 1 to 3 are diagrams for explaining a pedestrian detection apparatus according to an embodiment of the present invention.
FIGS. 4 to 7 are diagrams illustrating the fuzzy rule tables, input membership functions, and output membership functions of the two fuzzy logics according to an embodiment of the present invention.
FIGS. 8 and 9 are diagrams illustrating a convolutional neural network model according to an embodiment of the present invention.
FIG. 10 is a diagram showing a final pedestrian detection result obtained through convolutional neural network based learning according to an embodiment of the present invention.
FIG. 11 is a diagram for explaining a pedestrian detection method according to an embodiment of the present invention.
FIGS. 12 to 14 are diagrams showing the measured detection accuracy of the pedestrian detection method according to an embodiment of the present invention.

The present invention can be modified in various ways and can have various embodiments; specific embodiments are illustrated in the drawings and explained in the detailed description. This is not intended to limit the present invention to particular embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. In describing the present invention, detailed descriptions of related known art are omitted where they are judged to obscure the gist of the invention unnecessarily. In addition, singular expressions used in this specification and the claims should generally be interpreted to mean "one or more" unless otherwise stated.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In the description, identical or corresponding components are given the same reference numerals, and redundant descriptions of them are omitted.

FIGS. 1 to 3 are diagrams for explaining a pedestrian detection apparatus according to an embodiment of the present invention.

Referring to FIG. 1, the pedestrian detection apparatus 100 includes a dual camera 10, an image input unit 110, an image analysis unit 120, an image selection unit 130, and a pedestrian detection unit 140.

The dual camera 10 acquires images using at least one of a visible light camera and an infrared camera that are combined with their optical axes parallel.

The image input unit 110 receives a visible light image and an infrared image.

The image analysis unit 120 detects candidate regions in the input visible light image and infrared image, and extracts feature values and weights from the detected candidate regions.

Referring to FIG. 2, the image analysis unit 120 includes a candidate region detection unit 121, a feature value extraction unit 122, and a weight extraction unit 123.

The candidate region detection unit 121 generates a difference image between a visible light background image and the input visible light image to detect candidate regions of the visible light image, and generates a difference image between an infrared background image and the input infrared image to detect candidate regions of the infrared image.

Referring to FIG. 3, the candidate region detection unit 121 receives an infrared image and a visible light image as in (a) and (b) of FIG. 3, generates difference images as in (c) and (d), and detects candidate regions as in (e) and (f). The detected candidate regions may contain not only pedestrians, such as 320 and 340 in FIG. 3, but also non-pedestrians, such as 310 and 330. To distinguish pedestrians from non-pedestrians, the present invention therefore analyzes the features of each region through the feature value extraction unit 122 and the weight extraction unit 123.
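The difference-image step can be sketched as follows. This is a minimal illustration only: the source does not state how the difference image is converted into regions, so the fixed threshold `diff_threshold`, the single bounding box, and the function name `detect_candidate_box` are all assumptions introduced here.

```python
def detect_candidate_box(background, frame, diff_threshold=30):
    """Return (top, left, bottom, right) of pixels that differ from the background.

    Images are lists of rows of grayscale values. The threshold and the
    single-box output are illustrative assumptions, not taken from the patent.
    """
    rows, cols = [], []
    for r, (bg_row, fr_row) in enumerate(zip(background, frame)):
        for c, (bg, fr) in enumerate(zip(bg_row, fr_row)):
            # A pixel belongs to the candidate region if the absolute
            # background difference exceeds the threshold.
            if abs(fr - bg) > diff_threshold:
                rows.append(r)
                cols.append(c)
    if not rows:
        return None  # nothing differs from the background
    return (min(rows), min(cols), max(rows), max(cols))


# Toy example: a bright 3x2 blob on a uniform background.
background = [[10] * 6 for _ in range(5)]
frame = [row[:] for row in background]
for r in range(1, 4):
    for c in range(2, 4):
        frame[r][c] = 200
box = detect_candidate_box(background, frame)
```

The same function would be applied once to the visible light pair and once to the infrared pair, giving one candidate box per modality.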

The feature value extraction unit 122 calculates the mean pixel value in the candidate region of the visible light image and in the candidate region of the infrared image to extract a first feature value and a second feature value, respectively.

Referring to FIG. 5, the first and second feature values behave differently depending on whether the candidate region is a pedestrian or a non-pedestrian. FIG. 5(a) shows the correlation between the feature values of candidate regions extracted from the visible light image and the infrared image when the candidate region is a pedestrian. FIG. 5(b) shows the same correlation when the candidate region is a non-pedestrian. Analysis of the correlation shows that the feature value extracted from the visible light image is 0.5 or more for a pedestrian and 0.5 or less for a non-pedestrian. Conversely, the feature value extracted from the infrared image is 0.5 or less for a pedestrian and 0.5 or more for a non-pedestrian. Through this analysis, the present invention can detect pedestrian and non-pedestrian regions more accurately.

The weight extraction unit 123 applies Gaussian filtering to the candidate regions of the visible light image and the infrared image and calculates the gradient magnitude of each pixel's brightness to extract a first weight and a second weight.
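A minimal sketch of the two per-region measurements, under stated assumptions: pixel values are scaled to [0, 1] by dividing by 255 (suggested by the 0.5 thresholds above but not stated), the Gaussian filter is a fixed 3x3 kernel, gradients are central differences, and the per-pixel gradient magnitudes are averaged into one weight. The function names are hypothetical.

```python
import math

GAUSS_3X3 = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # assumed kernel; entries sum to 16


def mean_feature(region, max_value=255.0):
    """Mean pixel value of a region, scaled to [0, 1] (scaling is an assumption)."""
    pixels = [p for row in region for p in row]
    return sum(pixels) / (len(pixels) * max_value)


def gaussian_smooth(region):
    """Apply the 3x3 Gaussian kernel, replicating border pixels."""
    h, w = len(region), len(region[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    acc += GAUSS_3X3[dr + 1][dc + 1] * region[rr][cc]
            out[r][c] = acc / 16.0
    return out


def mean_gradient_magnitude(region):
    """Average gradient magnitude over interior pixels of the smoothed region."""
    smooth = gaussian_smooth(region)
    h, w = len(smooth), len(smooth[0])
    total, count = 0.0, 0
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = (smooth[r][c + 1] - smooth[r][c - 1]) / 2.0
            gy = (smooth[r + 1][c] - smooth[r - 1][c]) / 2.0
            total += math.hypot(gx, gy)
            count += 1
    return total / count if count else 0.0
```

A uniform region yields a weight of zero, while a region containing an edge yields a positive weight, which is the property the weights rely on to separate textured pedestrian regions from flat ones.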

The image selection unit 130 selects the candidate region favorable for pedestrian detection using the feature values and weights of the candidate regions, based on two preset fuzzy logics. The fuzzy logic is configured differently for the case where the candidate region is a pedestrian and the case where it is a non-pedestrian.

FIGS. 4 to 7 are diagrams illustrating the fuzzy rule tables, input membership functions, and output membership functions of the two fuzzy logics according to an embodiment of the present invention.

Referring to FIG. 4, a separate fuzzy rule table is defined for the case where the feature values extracted from the candidate regions of the visible light image and the infrared image correspond to a pedestrian and for the case where they correspond to a non-pedestrian. FIG. 4(a) is the fuzzy rule table for the pedestrian case: it outputs 'MEDIUM' when the first and second feature values are both 'LOW' or both 'HIGH', 'HIGH' when the first feature value is 'LOW' and the second is 'HIGH', and 'LOW' when the first feature value is 'HIGH' and the second is 'LOW'. FIG. 4(b) is the fuzzy rule table for the non-pedestrian case: it outputs 'MEDIUM' when the first and second feature values are both 'LOW' or both 'HIGH', 'LOW' when the first feature value is 'LOW' and the second is 'HIGH', and 'HIGH' when the first feature value is 'HIGH' and the second is 'LOW'.
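The two rule tables can be written down directly as lookup tables. The sketch below is a simplification: real fuzzy inference uses graded membership degrees, whereas here each feature value is mapped to a single crisp label at 0.5, and a value of exactly 0.5 is treated as LOW. Both choices, and all names, are assumptions made for illustration.

```python
# Rule tables keyed by (first feature label, second feature label),
# transcribed from the description of FIG. 4(a) and FIG. 4(b).
PEDESTRIAN_RULES = {
    ("LOW", "LOW"): "MEDIUM",
    ("LOW", "HIGH"): "HIGH",
    ("HIGH", "LOW"): "LOW",
    ("HIGH", "HIGH"): "MEDIUM",
}
NON_PEDESTRIAN_RULES = {
    ("LOW", "LOW"): "MEDIUM",
    ("LOW", "HIGH"): "LOW",
    ("HIGH", "LOW"): "HIGH",
    ("HIGH", "HIGH"): "MEDIUM",
}


def input_label(feature_value, split=0.5):
    """Crisp stand-in for the input membership function (tie goes to LOW)."""
    return "LOW" if feature_value <= split else "HIGH"


def apply_rules(rules, first_feature, second_feature):
    """Look up the rule output for a pair of feature values in [0, 1]."""
    return rules[(input_label(first_feature), input_label(second_feature))]
```

For example, `apply_rules(PEDESTRIAN_RULES, 0.2, 0.8)` returns `"HIGH"`, while the same inputs under `NON_PEDESTRIAN_RULES` return `"LOW"`, which shows how the two tables mirror each other on the mixed cases.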

Referring to FIG. 5, the input membership functions are defined as (a) an input membership function for the case where the candidate region is a pedestrian and (b) an input membership function for the case where it is a non-pedestrian. Both input membership functions receive the feature value extracted from the candidate region of the visible light image (the first feature value) and the feature value extracted from the candidate region of the infrared image (the second feature value), and output a membership value of L (Low) or H (High). For example, an input membership function may output L (Low) as the membership value when the feature value is 0.5 or less, and H (High) when the feature value is 0.5 or more.

Referring to FIG. 6, the output membership function classifies the membership values output by the input membership functions as L (Low), M (Medium), or H (High) and outputs a fuzzy result value. The output membership function may output the fuzzy result value according to the correlation sum of the first feature value (visible light image) and the second feature value (infrared image). For example, when the correlation sum of the first and second feature values is 0.5 or less, a value of 0 may be assigned if the membership value is L and 0.5 if it is H. When the correlation sum of the feature values is less than 0.5, the output membership function may assign 0 if the membership value is L and 0.25 if it is H. In addition, the output membership function may classify the fuzzy result value as L (Low) if the correlation sum of the first and second feature values is less than 0.75, as M (Medium) if the sum is 0.75 or more and less than 1.0, and as H (High) if the sum is 1.0 or more.
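The three-way classification by correlation sum reduces to two threshold comparisons. A minimal sketch, using only the 0.75 and 1.0 boundaries stated above; the function name is hypothetical and the shape of the actual membership curves in FIG. 6 is not modelled.

```python
def classify_fuzzy_output(correlation_sum):
    """Map the correlation sum of the two feature values to L, M, or H."""
    if correlation_sum < 0.75:
        return "L"  # below 0.75
    if correlation_sum < 1.0:
        return "M"  # 0.75 or more and less than 1.0
    return "H"      # 1.0 or more
```

Note that the boundaries are half-open, so a sum of exactly 0.75 falls in M and a sum of exactly 1.0 falls in H.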

The image selection unit 130 according to an embodiment of the present invention applies the fuzzy result values from the two fuzzy logics described above to Equation (1) to calculate a final fuzzy result value, and uses the final fuzzy result value in Equation (2) to select either the candidate region of the visible light image or the candidate region of the infrared image as the candidate region favorable for pedestrian detection.

(1) [equation image not reproduced in this text]

In Equation (1), the terms denote, in order: the final fuzzy result value, the fuzzy result value for the pedestrian case, the weight for the pedestrian case, the fuzzy result value for the non-pedestrian case, and the weight for the non-pedestrian case.

Here, the weights are the gradient magnitude values extracted by the weight extraction unit 123 described above. Referring to FIG. 7, the distribution of gradient magnitude values differs between the non-pedestrian case and the pedestrian case, so applying the gradient magnitude values as the two weight terms of Equation (1) can improve pedestrian detection accuracy.

(2) [equation image not reproduced in this text]

Referring to Equation (2), the pedestrian detection apparatus 100 according to an embodiment of the present invention compares the final fuzzy result value with a preset threshold. If the final fuzzy result value is less than or equal to the threshold, the candidate region detected in the visible light image is selected; otherwise, the candidate region detected in the infrared image is selected.
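The selection rule described for Equation (2) can be sketched as a single comparison. The final fuzzy result value is taken as an input here because the form of Equation (1) is not available in this text. The function name, the returned tuple, and the threshold value 0.5 in the example are assumptions; the source only says the threshold is preset.

```python
def select_candidate(final_fuzzy_value, threshold, visible_candidate, infrared_candidate):
    """Pick the candidate region to pass on for verification.

    A final fuzzy result value at or below the threshold selects the visible
    light candidate; any larger value selects the infrared candidate.
    """
    if final_fuzzy_value <= threshold:
        return "visible", visible_candidate
    return "infrared", infrared_candidate


# Hypothetical boxes (top, left, bottom, right) and a hypothetical threshold.
visible_box = (10, 20, 90, 60)
infrared_box = (12, 22, 88, 58)
choice = select_candidate(0.3, 0.5, visible_box, infrared_box)
```

Only the selected candidate is passed to the convolutional neural network stage, so the comparison also determines which modality the verification sees.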

The pedestrian detection unit 140 applies a pre-trained convolutional neural network model to the selected candidate region to verify whether the selected image corresponds to a pedestrian, and detects the pedestrian in the visible light image and the infrared image according to the verification result.

FIGS. 8 and 9 are diagrams illustrating a convolutional neural network model according to an embodiment of the present invention.

Referring to FIGS. 8 and 9, the convolutional neural network model includes an image input layer, feature extraction layers, and a classification layer.

The image input layer receives a pedestrian candidate image at a preset scale. The image input layer may use a color image with a size of 119×183×3 pixels.

The feature extraction layers may include (1) five convolution layers, (2) a rectified linear unit (hereinafter 'ReLU') layer following each convolution layer, (3) local response normalization (hereinafter 'LRN') layers, and (4) max pooling layers.

(1) Among the five convolution layers, the first convolution layer receives an image of 119×183×3 pixels and convolves it with 96 filters of size 11×11×3 at a stride of 2 pixels. In this case, the number of weights per filter is 11×11×3 = 363, and the total number of parameters in the first convolution layer is (363+1)×96 = 34,944, where the 1 represents the bias. The feature map of the first convolution layer has a size of 55×87×96, where 55 and 87 are the output height and width. The output height (or width) is calculated as (input height (or width) - filter height (or width) + 2 × padding) / stride + 1. The outputs pass through a ReLU layer and an LRN layer. As shown in Equation (3), the ReLU layer performs a threshold operation that sets every input value less than 0 to 0. The ReLU layer thus simplifies computation and helps increase the training speed of a deep network.

$$f(x) = \max(0, x) \qquad (3)$$
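The size formula and parameter count for the first convolution layer can be checked with a few lines of arithmetic. This is a sketch: the function names are hypothetical, and integer floor division is assumed where the formula divides by the stride (the division is exact for the values used here).

```python
def conv_output_size(input_size, filter_size, padding, stride):
    """(input - filter + 2 * padding) / stride + 1, with floor division assumed."""
    return (input_size - filter_size + 2 * padding) // stride + 1


def conv_parameter_count(filter_h, filter_w, in_channels, num_filters):
    """Weights per filter plus one bias, times the number of filters."""
    return (filter_h * filter_w * in_channels + 1) * num_filters


def relu(x):
    """Threshold operation: negative inputs become 0, others pass through."""
    return max(0.0, x)


# First convolution layer: 119x183x3 input, 96 filters of 11x11x3, stride 2, no padding.
conv1_height = conv_output_size(119, 11, 0, 2)
conv1_width = conv_output_size(183, 11, 0, 2)
conv1_params = conv_parameter_count(11, 11, 3, 96)
```

These reproduce the 55×87 feature map and the 34,944 parameters stated for the first convolution layer.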

The LRN layer can be implemented through Equation (4). In the feature extraction layers according to an embodiment of the present invention, the three parameters of Equation (4) are set to 1, 0.0001, and 0.75, respectively, to obtain the normalized output value.

(4) [equation image not reproduced in this text]
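Because Equation (4) is not available in this text, the sketch below uses the widely known cross-channel LRN form, in which each activation is divided by (k + alpha × sum of squared neighbouring-channel activations) raised to the power beta. Mapping the three stated values 1, 0.0001, and 0.75 to `k`, `alpha`, and `beta`, and the window of 5 channels, are assumptions and may differ from the patent's actual equation.

```python
def local_response_normalize(channel_values, k=1.0, alpha=0.0001, beta=0.75, window=5):
    """Cross-channel LRN at one spatial position.

    channel_values holds the activations of all channels at that position.
    The parameter mapping and window size are assumptions, not from the source.
    """
    n = len(channel_values)
    half = window // 2
    normalized = []
    for i, a in enumerate(channel_values):
        lo = max(0, i - half)
        hi = min(n - 1, i + half)
        # Sum of squares over the neighbouring channels, clipped at the ends.
        sq_sum = sum(v * v for v in channel_values[lo:hi + 1])
        normalized.append(a / (k + alpha * sq_sum) ** beta)
    return normalized
```

With these parameter values the denominator is always at least 1, so the normalization can only shrink an activation, and it shrinks it more when neighbouring channels respond strongly.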

The output map of the LRN layer is downsampled by the first max pooling layer, which applies one 3×3 filter at a stride of 2 pixels. Using the relation output height (or width) = (input height (or width) - filter height (or width) + 2 × padding) / stride + 1, the output of the first max pooling layer is 27 (= (55 - 3 + 2 × 0) / 2 + 1) × 43 (= (87 - 3 + 2 × 0) / 2 + 1) × 96.

Next, the second convolution layer convolves its input with 128 filters of size 5×5×96 at a stride of 1 pixel with a padding of 2. The result passes through a ReLU layer and an LRN layer, and a second max pooling layer with a 3×3 filter and a stride of 2 pixels reduces the size of the feature map.

The third and fourth convolution layers convolve with 256 filters of size 3×3×128 at a stride of 1 pixel with a padding of 1, each followed by a ReLU layer.

The fifth convolution layer then convolves with 128 filters of size 3×3×256 at a stride of 1 pixel with a padding of 1. A ReLU layer and a third max pooling layer with a 3×3 filter and a stride of 2 pixels follow, reducing the feature map to an output map of size 6×10×128.
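The layer sizes quoted above can be traced end to end with the same output-size formula. In this sketch the layer table is transcribed from the description; zero padding for the second and third pooling layers is an assumption (it is stated only for the first), and the ReLU and LRN layers are omitted because they do not change the feature map size. All names are hypothetical.

```python
def out_size(n, k, p, s):
    """(input - filter + 2 * padding) / stride + 1, with floor division assumed."""
    return (n - k + 2 * p) // s + 1


# (name, filter size, padding, stride, number of filters or None for pooling)
LAYERS = [
    ("conv1", 11, 0, 2, 96),
    ("pool1", 3, 0, 2, None),
    ("conv2", 5, 2, 1, 128),
    ("pool2", 3, 0, 2, None),
    ("conv3", 3, 1, 1, 256),
    ("conv4", 3, 1, 1, 256),
    ("conv5", 3, 1, 1, 128),
    ("pool3", 3, 0, 2, None),
]


def trace_shapes(height=119, width=183, channels=3):
    """Return (name, height, width, channels) after each layer."""
    shapes = []
    for name, k, p, s, filters in LAYERS:
        height = out_size(height, k, p, s)
        width = out_size(width, k, p, s)
        if filters is not None:
            channels = filters  # pooling keeps the channel count unchanged
        shapes.append((name, height, width, channels))
    return shapes


shapes = trace_shapes()
flattened = shapes[-1][1] * shapes[-1][2] * shapes[-1][3]
```

The trace ends at 6×10×128, matching the stated output map, and flattening that map gives 7,680 values for the layers that follow.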

In addition, the feature extraction layers may further include three sets of fully connected layers after the convolution layers. A fully connected layer connects the result of the previous layer to a plurality of nodes so that it forms a one-dimensional array, thereby reducing the features. The first fully connected layer set may include a fully connected layer and a ReLU layer; the second set may include a fully connected layer, a ReLU layer, and a dropout layer; and the third set may include a softmax layer after the fully connected layer. The first fully connected layer may have 7,680 (= 6 × 10 × 128) and 4,096 nodes as input and output, respectively. The second fully connected layer may have 4,096 and 1,024 nodes as input and output, respectively. Dropout is applied before the third fully connected layer; it sets the output of each hidden node to 0 at random according to a preset probability. The third fully connected layer may then have 1,024 input nodes and 2 output nodes.

The classification layer classifies the features of the candidate region output by the third fully connected layer as pedestrian or non-pedestrian through the softmax layer.

The softmax function is given by Equation (5). In Equation (5), the output for each class is calculated by normalizing the S_j values corresponding to the K classes.

(5) [equation image not reproduced in this text]

The pedestrian detection unit 140 classifies the selected candidate region as a pedestrian or a non-pedestrian according to the class for which the value given by Equation (5) is largest, and detects the pedestrian in the input visible light image and infrared image using the candidate regions classified as pedestrians.
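The final classification step can be sketched with the standard softmax, exp(s_i) divided by the sum of exp(s_j) over the K classes, followed by taking the class with the largest value. Equation (5) itself is not available in this text, so the standard form is an assumption, as are the class order in `labels` and the function names.

```python
import math


def softmax(scores):
    """Standard softmax; subtracting the maximum keeps exp() from overflowing."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels=("pedestrian", "non-pedestrian")):
    """Return the label with the largest softmax value and that value."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

With the two output nodes of the third fully connected layer as `scores`, `classify` returns the winning label together with its normalized value, which could also serve as a confidence score for the detection.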

FIG. 10 is a diagram showing a final pedestrian detection result obtained through convolutional neural network based learning according to an embodiment of the present invention.

Referring to FIG. 10, when (a) the candidate region 1010 selected from the visible light image is the finally selected candidate region, the pedestrian detection apparatus 100 according to an embodiment of the present invention can detect (b) the corresponding candidate region 1020 in the infrared image as the final pedestrian. Through the method described above, the pedestrian detection apparatus 100 detects pedestrian candidate regions in the visible light image and the infrared image and analyzes the correlation of candidate region features between the images, thereby effectively removing non-pedestrian regions and detecting pedestrians more accurately.

도 11은 본 발명의 일 실시 예에 따른 보행자 검출 방법을 설명하기 위한 도면이다. 11 is a view for explaining a pedestrian detection method according to an embodiment of the present invention.

도 11을 참조하면, 단계 S1110 및 S1115에서 보행자 검출 장치(100)는 광 축이 평행하게 결합된 가시광 카메라 및 적외선 카메라를 통해 취득한 가시광 영상 및 적외선 영상을 입력한다.Referring to FIG. 11, in steps S1110 and S1115, the pedestrian detection apparatus 100 inputs visible light images and infrared images acquired through a visible light camera and an infrared camera in which optical axes are coupled in parallel.

단계 S1120에서 보행자 검출 장치(100)는 입력한 가시광 배경영상과 입력된 가시광 영상의 차영상을 생성하여 가시광 영상의 후보영역을 검출한다. In step S1120, the pedestrian detection apparatus 100 generates a difference image between the inputted visible light background image and the input visible light image to detect a candidate region of the visible light image.

단계 S1125에서 보행자 검출 장치(100)는 입력한 적외선 배경영상과 입력된 적외선 영상의 차영상을 생성하여 적외선 영상의 후보영역을 검출한다. In step S1125, the pedestrian detection apparatus 100 generates a difference image between the inputted infrared background image and the input infrared image to detect a candidate region of the infrared image.

단계 S1130 및 단계 S1135에서 보행자 검출 장치(100)는 가시광 영상의 후보영역 및 적외선 영상의 후보영역에서 픽셀 평균값을 산출하여 제1 특징값 및 제2 특징값을 추출한다. In steps S1130 and S1135, the pedestrian detection apparatus 100 calculates a pixel average value in the candidate region of the visible light image and the candidate region of the infrared image, and extracts the first feature value and the second feature value.

단계 S1140 및 단계 S1145에서 보행자 검출 장치(100)는 상기 가시광 영상의 후보영역 및 상기 적외선 영상의 후보영역에 가우시안 필터링을 적용하고 각 픽셀 명도의 기울기 크기값을 산출하여 제1 가중치 및 제2 가중치를 추출한다. In step S1140 and step S1145, the pedestrian detection apparatus 100 applies Gaussian filtering to the candidate region of the visible light image and the candidate region of the infrared image, calculates the slope magnitude value of each pixel brightness, and outputs the first weight and the second weight .

In step S1150, the pedestrian detection device 100 applies the first feature value, the second feature value, the first weight, and the second weight to two preset fuzzy logics and outputs a fuzzy result value.

In step S1155, the pedestrian detection device 100 selects either the candidate region of the visible light image or the candidate region of the infrared image according to the output fuzzy result value.

In step S1160, the pedestrian detection device 100 applies the selected candidate region to a pre-trained convolutional neural network model and classifies it as a pedestrian or a non-pedestrian.

In step S1165, the pedestrian detection device 100 detects the pedestrian in the input visible light image and infrared image using the candidate region classified as a pedestrian.

FIGS. 12 to 14 show the results of measuring the detection accuracy of the pedestrian detection method according to an embodiment of the present invention.

Three data sets in total were used to evaluate detection performance. Referring to Table 1, the first data set was built in-house and consists of 4,080 images in total, captured under varied environmental conditions such as changes in weather, time of day, and temperature. The second data set consists of images produced by artificially adding Gaussian noise and blur to the first data set. The third data set is the OSU color-thermal database, which is one of the open databases.

| | Sub-database 1 | Sub-database 2 | Sub-database 3 | Sub-database 4 |
|---|---|---|---|---|
| Number of images | 598 | 651 | 2364 | 467 |
| Number of pedestrian candidates | 1123 | 566 | 2432 | 169 |
| Number of non-pedestrian candidates | 763 | 734 | 784 | 347 |
| Pedestrian: (range of width) × (range of height) (pixels) | (27~91) × (63~142) | (24~93) × (49~143) | (31~105) × (55~147) | (60~170) × (50~110) |
| Non-pedestrian candidate: (range of width) × (range of height) (pixels) | (51~161) × (63~142) | (29~93) × (49~143) | (53~83) × (55~147) | (60~170) × (50~110) |
| Weather conditions: surface temperature | 30.4 ℃ | 25.5 ℃ | 20 ℃ | 16 ℃ |
| Weather conditions: air temperature | 22.5 ℃ | 24 ℃ | 21 ℃ | 20.5 ℃ |
| Weather conditions: wind speed | 10 km/h | 5 km/h | 6.1 km/h | 2.5 km/h |
| Weather conditions: sensory temperature | 21.3 ℃ | 23.5 ℃ | 21 ℃ | 20.8 ℃ |

Training and validation were performed on a computing device with an Intel® Core™ i7-3770K CPU @ 3.40 GHz (4 CPUs), 16 GB of memory, and an NVIDIA GeForce GTX 1070 graphics card (1,920 CUDA cores).

The pedestrian detection device 100 detects candidate regions in the visible light image and the infrared image and, based on the preset fuzzy logic, selects the candidate region that is advantageous for pedestrian detection. The fuzzy logic is the same as described above with reference to FIGS. 4 to 6.

The pedestrian detection device 100 trained the convolutional neural network model on pedestrian images and non-pedestrian images, which were normalized to 119 × 183 pixels for use in training and validation. The training of the convolutional neural network model is the same as described above with reference to FIGS. 8 and 9.
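The feature-map sizes recited in the claims can be checked against the 119 × 183 normalized size with simple window arithmetic. The sketch below is only a consistency check under stated assumptions: the kernel sizes, strides, paddings, and filter counts are taken from the claims where they are legible, while the stride of 2 for the first max pooling layer, the absence of padding in the first convolution and in all pooling layers, and the 3-channel input (inferred from the 11 × 11 × 3 filter size) are assumptions. The names `out_size`, `LAYERS`, and `trace` are hypothetical.

```python
def out_size(size, kernel, stride, padding=0):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding, output channels; None keeps the channel count)
LAYERS = [
    ("conv1", 11, 2, 0, 96),
    ("pool1", 3, 2, 0, None),   # stride of 2 is assumed for this layer
    ("conv2", 5, 1, 2, 128),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 256),
    ("conv4", 3, 1, 1, 256),
    ("conv5", 3, 1, 1, 128),
    ("pool3", 3, 2, 0, None),
]


def trace(dim_a, dim_b, channels):
    """Return the (dim_a, dim_b, channels) shape after each layer."""
    shapes = {}
    for name, kernel, stride, padding, out_channels in LAYERS:
        dim_a = out_size(dim_a, kernel, stride, padding)
        dim_b = out_size(dim_b, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels
        shapes[name] = (dim_a, dim_b, channels)
    return shapes


shapes = trace(119, 183, 3)
for name, shape in shapes.items():
    print(name, shape)

last = shapes["pool3"]
flattened = last[0] * last[1] * last[2]  # inputs to the first fully connected layer
print("flattened:", flattened)
```

Under these assumptions the trace reproduces the 27 × 43 × 96, 13 × 21 × 128, and 6 × 10 × 128 feature maps recited in the claims, and the last map flattens to 7,680 values.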

For validation, the data set was divided into two groups and cross-validation was performed four times.

To compare performance with previously studied pedestrian detection methods, the TPR (true positive rate), TNR (true negative rate), FNR (false negative rate), and FPR (false positive rate) of the pedestrian detection device 100 according to an embodiment of the present invention were measured on the three data sets. In addition, the average accuracy of pedestrian detection was calculated using Equations 6 to 9 below.

(6)

(7)

(8)

(9)

Here, #TN, #TP, #FN, and #FP denote the numbers of true negatives (TN), true positives (TP), false negatives (FN), and false positives (FP), respectively.
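Equations 6 to 9 are not reproduced in this text, so the following sketch does not attempt to restate them. It only shows the textbook definitions of the four rates named above, plus overall accuracy, computed from the counts #TP, #TN, #FP, and #FN. The function name `rates` and the example counts are hypothetical.

```python
def rates(tp, tn, fp, fn):
    """Standard confusion-matrix rates, each in the range [0, 1]."""
    positives = tp + fn   # actual pedestrians
    negatives = tn + fp   # actual non-pedestrians
    return {
        "TPR": tp / positives,
        "FNR": fn / positives,
        "TNR": tn / negatives,
        "FPR": fp / negatives,
        "accuracy": (tp + tn) / (positives + negatives),
    }


# Made-up counts for illustration; they are not results from the document.
r = rates(tp=95, tn=90, fp=10, fn=5)
for name, value in r.items():
    print(f"{name}: {value:.3f}")
```

By construction TPR + FNR = 1 and TNR + FPR = 1, so reporting all four rates is redundant but makes both error types visible at a glance.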

Referring to FIGS. 12 to 14, the average detection accuracy of the pedestrian detection device 100 of the present invention, measured on the three data sets, was 99.61%, 97.02%, and 99.03%, which is higher than the accuracy of previously studied pedestrian detection methods.

The pedestrian detection method according to an embodiment of the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the computer-readable medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the field of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. The medium may also be a transmission medium, such as an optical or metal line or a waveguide, that includes a carrier wave transmitting a signal specifying program instructions, data structures, and the like. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

The present invention has been described above with a focus on its embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that the invention may be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in an illustrative rather than a restrictive sense. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within the scope of equivalents thereof should be construed as being included in the present invention.

100: Pedestrian detection device
110: Image input unit
120: Image analysis unit
130: Image selection unit
140: Pedestrian detection unit

Claims

1. A pedestrian detection device comprising:
an image input unit for receiving a visible light image and an infrared image;
an image analysis unit for detecting candidate regions in the input visible light image and infrared image and extracting a feature value and a weight from each detected candidate region;
an image selection unit for selecting a candidate region advantageous for pedestrian detection by using the feature values and weights of the candidate regions based on two preset fuzzy logics; and
a pedestrian detection unit for verifying whether the selected candidate region corresponds to a pedestrian by applying a pre-trained convolutional neural network model to the selected candidate region, and for detecting the pedestrian in the visible light image and the infrared image according to the verification result,
wherein the image selection unit
sets the fuzzy logic differently for the pedestrian case and the non-pedestrian case of the feature values extracted from the candidate regions of the visible light image and the infrared image, inputs the feature values of the candidate regions into the fuzzy logic, and applies the output fuzzy result values and the weights to Equation (1) to calculate a final fuzzy result value,

(1)
In Equation (1),

Is the final fuzzy result value,

Is a fuzzy result value in the case of a pedestrian,

Is a weight in the case of a pedestrian,

Is a fuzzy result value in the case of a non-pedestrian,

Means a weight in the case of a non-pedestrian.

2. The pedestrian detection device according to claim 1,
wherein the image analysis unit comprises:
a candidate region detection unit for detecting a candidate region of the visible light image by generating a difference image between a visible light background image and the input visible light image, and for detecting a candidate region of the infrared image by generating a difference image between an infrared background image and the input infrared image;
a feature value extraction unit for calculating a mean pixel value in the candidate region of the visible light image and in the candidate region of the infrared image to extract a first feature value and a second feature value; and
a weight extraction unit for applying Gaussian filtering to the candidate regions of the visible light image and the infrared image and calculating the gradient magnitude of each pixel's brightness to extract a first weight and a second weight.

3. (Deleted)

4. The pedestrian detection device according to claim 1,
wherein the image selection unit,
based on a preset threshold value, if the final fuzzy result value

selects the candidate region detected from the visible light image, and otherwise selects the candidate region detected from the infrared image.

5. The pedestrian detection device according to claim 1,
wherein the convolutional neural network model of the pedestrian detection unit comprises:
an image input layer for receiving a pedestrian candidate image of a preset size;
a feature extraction layer including five convolution layer sets and three fully connected layer sets; and
a classification layer for classifying the features of the candidate region output from the three fully connected layers as a pedestrian or a non-pedestrian through a softmax layer.

6. The pedestrian detection device according to claim 5,
wherein, in the feature extraction layer,
each convolution layer is coupled with a rectified linear unit layer, and the feature extraction layer includes a local response normalization layer and a max pooling layer.

7. The pedestrian detection device according to claim 5,
wherein the five convolution layer sets comprise:
a first convolution layer set consisting of a first convolution layer that receives an image of 115 × 51 × 1 pixels and convolves it at intervals of 2 pixels using 96 filters of size 11 × 11 × 3, a rectified linear unit layer, a local response normalization layer, and a first max pooling layer that applies a 3 × 3 filter to output a feature map of 27 × 43 × 96 pixels;
a second convolution layer set consisting of a second convolution layer that convolves the feature map of 27 × 43 × 96 pixels output from the first convolution layer set at intervals of 1 pixel with a padding of 2 using 128 filters of size 5 × 5 × 96, a rectified linear unit layer, a local response normalization layer, and a second max pooling layer that applies a 3 × 3 filter at intervals of 2 pixels to output a feature map of 13 × 21 × 128 pixels;
a third convolution layer set consisting of a third convolution layer that convolves the feature map of 13 × 21 × 256 pixels output from the second convolution layer set at intervals of 1 pixel with a padding of 1 using 256 filters of size 3 × 3 × 128, and a rectified linear unit layer;
a fourth convolution layer set consisting of a fourth convolution layer that convolves the feature map of 13 × 21 × 256 pixels output from the third convolution layer set at intervals of 1 pixel with a padding of 1 using 256 x 3 x 128 filters, and a rectified linear unit layer; and
a fifth convolution layer set consisting of a fifth convolution layer that convolves the feature map of 13 × 21 × 256 pixels output from the fourth convolution layer set at intervals of 1 pixel with a padding of 1 using 128 filters of size 3 × 3 × 256, a rectified linear unit layer, and a third max pooling layer that applies a 3 × 3 filter at intervals of 2 pixels to output a feature map of 6 × 10 × 128 pixels.

8. The pedestrian detection device according to claim 5,
wherein the three fully connected layer sets comprise:
a first fully connected layer having 6 × 10 × 128 and 4096 nodes as inputs and outputs, respectively;
a second fully connected layer having 4096 and 1024 nodes as inputs and outputs, respectively; and
a third fully connected layer having 1024 and 2 nodes as inputs and outputs, respectively.

9. A pedestrian detection method comprising:
receiving a visible light image and an infrared image;
detecting candidate regions in the input visible light image and infrared image, and extracting a feature value and a weight from each detected candidate region;
selecting a candidate region advantageous for pedestrian detection by using the feature values and weights of the candidate regions based on two preset fuzzy logics; and
verifying whether the selected candidate region corresponds to a pedestrian by applying a pre-trained convolutional neural network model to the selected candidate region, and detecting the pedestrian in the visible light image and the infrared image according to the verification result,
wherein the selecting of a candidate region advantageous for pedestrian detection by using the feature values and weights of the candidate regions based on the two preset fuzzy logics comprises
setting the fuzzy logic differently for the pedestrian case and the non-pedestrian case of the feature values extracted from the candidate regions of the visible light image and the infrared image, inputting the feature values of the candidate regions into the fuzzy logic, and applying the output fuzzy result values and the weights to Equation (1) to calculate a final fuzzy result value,

(1)
In Equation (1),

Is the final fuzzy result value,

Is a fuzzy result value in the case of a pedestrian,

Is a weight in the case of a pedestrian,

Is a fuzzy result value in the case of a non-pedestrian,

Means a weight in the case of a non-pedestrian.

10. The method of claim 9,
wherein the detecting of candidate regions in the input visible light image and infrared image and the extracting of a feature value and a weight from each detected candidate region comprise:
detecting a candidate region of the visible light image by generating a difference image between a visible light background image and the input visible light image, and detecting a candidate region of the infrared image by generating a difference image between an infrared background image and the input infrared image;
calculating a mean pixel value in the candidate region of the visible light image and in the candidate region of the infrared image to extract a first feature value and a second feature value; and
applying Gaussian filtering to the candidate region of the visible light image and the candidate region of the infrared image, and calculating the gradient magnitude of each pixel's brightness to extract a first weight and a second weight.

11. The method of claim 10,
wherein the selecting of a candidate region advantageous for pedestrian detection by using the feature values and weights of the candidate regions based on the two preset fuzzy logics comprises:
applying the first feature value, the second feature value, the first weight, and the second weight to the two preset fuzzy logics to output a final fuzzy result value; and
selecting either the candidate region of the visible light image or the candidate region of the infrared image according to the final fuzzy result value.

12. The method of claim 11,
wherein the fuzzy logic is set differently depending on whether the feature values extracted from the candidate regions of the visible light image and the infrared image correspond to a pedestrian or a non-pedestrian, and
the final fuzzy result value is calculated by inputting the feature values of the candidate regions into the fuzzy logic and applying the output fuzzy result values and the weights to Equation (1),

(1)
In Equation (1),

Is the final fuzzy result value,

Is a fuzzy result value in the case of a pedestrian,

Is a weight in the case of a pedestrian,

Is a fuzzy result value in the case of a non-pedestrian,

Means a weight in the case of a non-pedestrian.

13. The method of claim 9,
wherein the verifying of whether the selected candidate region corresponds to a pedestrian by applying the pre-trained convolutional neural network model to the selected candidate region and the detecting of the pedestrian in the visible light image and the infrared image according to the verification result comprise
classifying the selected candidate region as a pedestrian or a non-pedestrian by using a convolutional neural network model trained on the pedestrian regions and non-pedestrian regions included in a data set, and detecting the pedestrian in the visible light image and the infrared image using the candidate region classified as a pedestrian.

14. A computer program recorded on a computer-readable recording medium for executing the pedestrian detection method according to any one of claims 9 to 13.