KR102260405B1

KR102260405B1 - Method and apparatus for recognizing passengers using artificial neural network

Info

Publication number: KR102260405B1
Application number: KR1020190152228A
Authority: KR
Inventors: 김현덕; 손명규; 이상헌
Original assignee: 재단법인대구경북과학기술원
Priority date: 2019-11-25
Filing date: 2019-11-25
Publication date: 2021-06-03
Also published as: KR20210063745A

Abstract

A passenger recognition method according to an embodiment of the present invention may include: obtaining a plurality of frames including at least one head; inputting the plurality of frames into a neural network including a plurality of convolutional layers and a plurality of head detection networks; inputting a first feature map output from a first convolutional layer into a first head detection network, and inputting a second feature map output from a second convolutional layer, together with the first feature map, into a second head detection network; upsampling the first feature map in the second head detection network, then performing a convolution operation with the second feature map, and detecting the heads included in the plurality of frames based on output data of the second head detection network; and determining that a passenger has boarded when a detected head passes a first line and then a second line, and determining that the passenger has alighted when the head passes the second line and then the first line.

Description

Passenger recognition method and device using artificial neural network { METHOD AND APPARATUS FOR RECOGNIZING PASSENGERS USING ARTIFICIAL NEURAL NETWORK }

The present invention relates to a method and an apparatus for detecting a head using an artificial neural network and recognizing the boarding and alighting of passengers using the detected head.

Various deep learning-based head detection methods are being developed, but most head detectors require a large amount of computation by design. As a result, they are limited in that GPU-based parallel processing is indispensable.

Accordingly, there is a need for a technology capable of detecting heads in real time on embedded devices such as mobile devices.

Meanwhile, the boarding and alighting of passengers must be identified accurately in order to gauge demand for a means of transportation. However, passengers do not move along a designated path, so boarding/alighting determinations suffer from a high misrecognition rate.

An object of the present invention is to provide a method and an apparatus for accurately detecting a head using CPU resources.

Another object of the present invention is to provide a method and an apparatus for accurately determining whether a passenger boards or alights.

However, these objects are exemplary, and the scope of the present invention is not limited by them.

In an embodiment, the step of inputting the first feature map output from the first convolutional layer into the first head detection network and inputting the second feature map output from the second convolutional layer and the first feature map into the second head detection network may include inputting a third feature map output from a third convolutional layer, the second feature map, and the first feature map into a third head detection network. The step of upsampling the first feature map in the second head detection network, performing a convolution operation with the second feature map, and detecting the heads included in the plurality of frames based on the output data of the second head detection network may further include performing, in the third head detection network, a convolution operation between the result data of the convolution operation of the second head detection network and the third feature map, and detecting the heads included in the plurality of frames based on output data of the third head detection network.

In an embodiment, the neural network may not perform batch normalization.

In an embodiment, the plurality of convolutional layers may further include a group convolutional layer, the first head detection network may perform a group convolution operation based on the first feature map, and the second head detection network may upsample the first feature map and then perform a group convolution operation with the second feature map.
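To illustrate why a group convolution can lower the computational load compared with a standard convolution, the following minimal sketch counts weights and multiply-accumulate operations. The function names, channel counts, kernel size, group count, and feature-map size are illustrative assumptions and are not values taken from this disclosure.

```python
def conv_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights (bias excluded) in a 2-D convolution split into `groups` groups."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees in_channels / groups input channels.
    return out_channels * (in_channels // groups) * kernel_size * kernel_size


def conv_mac_count(in_channels, out_channels, kernel_size, out_h, out_w, groups=1):
    """Multiply-accumulate operations needed to produce one out_h x out_w output map."""
    return conv_weight_count(in_channels, out_channels, kernel_size, groups) * out_h * out_w


# Hypothetical layer: 256 -> 256 channels, 3x3 kernel, 13x13 output map.
standard = conv_mac_count(256, 256, 3, 13, 13, groups=1)
grouped = conv_mac_count(256, 256, 3, 13, 13, groups=4)
print(standard // grouped)  # 4: cost falls in proportion to the number of groups
```

Under these assumptions the cost shrinks by the group count, which is one plausible reason a grouped layer suits a CPU-only target.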

In an embodiment, the method may further include withholding the boarding/alighting determination for the passenger when the detected head is located in a designated space between the first line and the second line.
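The two-line rule described above can be sketched as a small per-head state machine. This is only an illustration under stated assumptions: the lines are modeled as two horizontal image coordinates with `line1 < line2`, a head is reduced to the y-coordinate of its center, and the names `zone`, `HeadTrack`, `OUTSIDE`, `BETWEEN`, and `INSIDE` are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional

OUTSIDE, BETWEEN, INSIDE = 0, 1, 2


def zone(y: float, line1: float, line2: float) -> int:
    """Zone of a head center; assumes line1 < line2 along the image y-axis."""
    if y < line1:
        return OUTSIDE
    if y > line2:
        return INSIDE
    return BETWEEN


@dataclass
class HeadTrack:
    line1: float
    line2: float
    origin: Optional[int] = None  # last zone that was not BETWEEN
    pending: bool = False         # True while the head is between the lines
    events: List[str] = field(default_factory=list)

    def update(self, y: float) -> Optional[str]:
        z = zone(y, self.line1, self.line2)
        if z == BETWEEN:
            self.pending = True   # withhold the decision
            return None
        self.pending = False
        event = None
        if self.origin == OUTSIDE and z == INSIDE:
            event = "boarded"     # passed line 1, then line 2
        elif self.origin == INSIDE and z == OUTSIDE:
            event = "alighted"    # passed line 2, then line 1
        self.origin = z
        if event:
            self.events.append(event)
        return event


track = HeadTrack(line1=100, line2=200)
print([track.update(y) for y in (50, 120, 180, 230)])
```

A head that enters the space between the lines and then turns back produces no event in this sketch, because its last decided zone is unchanged.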

A passenger recognition apparatus according to another embodiment of the present invention includes a processor. The processor may obtain a plurality of frames including at least one head; input the plurality of frames into a neural network including a plurality of convolutional layers and a plurality of head detection networks; input a first feature map output from a first convolutional layer into a first head detection network; input a second feature map output from a second convolutional layer, together with the first feature map, into a second head detection network; upsample the first feature map in the second head detection network and then perform a convolution operation with the second feature map; detect the heads included in the plurality of frames based on output data of the second head detection network; determine that a passenger has boarded when a detected head passes a first line and then a second line; and determine that the passenger has alighted when the head passes the second line and then the first line.

In an embodiment, the processor may input a third feature map output from a third convolutional layer, the second feature map, and the first feature map into a third head detection network, perform, in the third head detection network, a convolution operation between the result data of the convolution operation of the second head detection network and the third feature map, and detect the heads included in the plurality of frames based on output data of the third head detection network.

In an embodiment, the processor may withhold the boarding/alighting determination for the passenger when the detected head is located in a designated space between the first line and the second line.

Aspects, features, and advantages other than those described above will become apparent from the following detailed description, claims, and drawings.

FIG. 1 is a diagram for explaining the configuration and operation of a passenger recognition apparatus according to an embodiment of the present invention.
FIGS. 2 and 3 are flowcharts of a passenger recognition method according to an embodiment of the present invention.
FIG. 4 is a diagram for explaining a neural network that detects heads according to an embodiment of the present invention.
FIG. 5 is an exemplary diagram of a neural network that detects heads according to an embodiment of the present invention.
FIGS. 6 and 7 are diagrams for explaining a method of determining whether a passenger boards or alights according to an embodiment of the present invention.

Hereinafter, various embodiments of the present disclosure are described with reference to the accompanying drawings. The various embodiments of the present disclosure may be modified in various ways and may take several forms; specific embodiments are illustrated in the drawings and described in detail. However, this is not intended to limit the various embodiments of the present disclosure to particular forms, and the disclosure should be understood to include all modifications, equivalents, and substitutes that fall within the spirit and technical scope of the various embodiments. In the description of the drawings, like reference numerals are used for like components.

Expressions such as "includes" or "may include" used in various embodiments of the present disclosure indicate the existence of the disclosed function, operation, or component, and do not limit one or more additional functions, operations, or components. In addition, in various embodiments of the present disclosure, terms such as "include" or "have" are intended to designate that a feature, number, step, operation, component, part, or combination thereof described in the specification exists, and should be understood not to exclude in advance the existence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In various embodiments of the present disclosure, expressions such as "or" include any and all combinations of the words listed together. For example, "A or B" may include A, may include B, or may include both A and B.

Expressions such as "first" and "second" used in various embodiments of the present disclosure may modify various components of the embodiments but do not limit those components. For example, these expressions do not limit the order and/or importance of the components. They may be used to distinguish one component from another. For example, a first user device and a second user device are both user devices and denote different user devices. For example, without departing from the scope of rights of the various embodiments of the present disclosure, a first component may be termed a second component, and similarly a second component may be termed a first component.

When a component is referred to as being "connected" or "coupled" to another component, it may be directly connected or coupled to the other component, but it should be understood that a further component may exist between them. In contrast, when a component is referred to as being "directly connected" or "directly coupled" to another component, it should be understood that no further component exists between them.

The terms used in various embodiments of the present disclosure are used only to describe a specific embodiment and are not intended to limit the various embodiments of the present disclosure. A singular expression includes the plural expression unless the context clearly indicates otherwise.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the various embodiments of the present disclosure pertain.

Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in various embodiments of the present disclosure.

FIG. 1 is a diagram for explaining the configuration and operation of a passenger recognition apparatus according to an embodiment of the present invention.

The passenger recognition apparatus 100 according to some embodiments of the present invention may perform the passenger recognition method in a CPU environment without a separate GPU. In addition, the passenger recognition apparatus 100 according to the present embodiment may detect heads in real time in a CPU environment using a neural network having the structure described below. The hardware configuration and operation of the passenger recognition apparatus 100 are described in detail below.

The passenger recognition apparatus 100 according to an embodiment may include an input/output interface 101, a memory 102, a processor 103, and a communication module 104.

The memory 102 is a computer-readable recording medium and may include random access memory (RAM), read only memory (ROM), and a permanent mass storage device such as a disk drive. The memory 102 may also store an operating system and at least one program code. These software components may be loaded from a computer-readable recording medium separate from the memory 102 using a drive mechanism. Such a separate computer-readable recording medium may include a floppy drive, a disk, a tape, a DVD/CD-ROM drive, a memory card, and the like. In another embodiment, the software components may be loaded into the memory 102 through the communication module 104 rather than from a computer-readable recording medium. For example, at least one program may be loaded into the memory 102 based on a program installed from files provided over a network by developers or by a file distribution system that distributes application installation files.

The processor 103 may be configured to process instructions of a computer program by performing basic arithmetic, logic, and input/output operations. The instructions may be provided to the processor 103 by the memory 102 or the communication module 104. For example, the processor 103 may be configured to execute received instructions according to program code stored in a recording device such as the memory 102.

In an embodiment, the processor 103 may obtain a plurality of frames including at least one head; input the plurality of frames into a neural network including a plurality of convolutional layers and a plurality of head detection networks; input a first feature map output from a first convolutional layer into a first head detection network; input a second feature map output from a second convolutional layer, together with the first feature map, into a second head detection network; upsample the first feature map in the second head detection network and then perform a convolution operation with the second feature map; detect the heads included in the plurality of frames based on output data of the second head detection network; determine that a passenger has boarded when a detected head passes a first line and then a second line; and determine that the passenger has alighted when the head passes the second line and then the first line.

The communication module 104 may provide a function for the passenger recognition apparatus 100 to communicate with an external server over a network. For example, a request generated by the processor 103 of the passenger recognition apparatus 100 according to program code stored in a recording device such as the memory 102 may be transmitted to the external server over the network under the control of the communication module 104. Conversely, control signals, commands, content, files, and the like provided under the control of a processor of the external server may pass through a communication module and the network and be received by the passenger recognition apparatus 100 through its communication module 104. For example, control signals or commands of the external server received through the communication module 104 may be transferred to the processor 103 or the memory 102, and content or files may be stored in a storage medium that the passenger recognition apparatus 100 may further include.

The communication method is not limited and may include not only communication methods that use a communication network the network may include (for example, a mobile communication network, wired Internet, wireless Internet, or a broadcasting network) but also short-range wireless communication between devices. For example, the network may include any one or more of a personal area network (PAN), a local area network (LAN), a campus area network (CAN), a metropolitan area network (MAN), a wide area network (WAN), a broadband network (BBN), the Internet, and the like. The network may also include any one or more network topologies, including but not limited to a bus network, a star network, a ring network, a mesh network, a star-bus network, and a tree or hierarchical network.

The input/output interface 101 may be a means for interfacing with input/output devices. For example, an input device may include a keyboard or a mouse, and an output device may include a display for showing a communication session of an application. As another example, the input/output interface 101 may be a means for interfacing with a device in which input and output functions are integrated, such as a touch screen. As a more specific example, when the processor 103 of the passenger recognition apparatus 100 processes instructions of a computer program loaded into the memory 102, a service screen or content composed using data provided by an external server may be displayed on a display through the input/output interface 101.

Although not shown, in an embodiment the passenger recognition apparatus 100 may further include a camera module. The camera module may include an interface to a camera built into the passenger recognition apparatus 100, and may also include an interface for connecting to an external camera device. In an embodiment, the camera module may acquire an image including at least one head, and the image may include a plurality of frames.

In other embodiments, the passenger recognition apparatus 100 may include more components than those shown in FIG. 1. However, most conventional components need not be explicitly illustrated. For example, the passenger recognition apparatus 100 may further include other components such as a transceiver, a Global Positioning System (GPS) module, various sensors, and a database.

Hereinafter, a passenger recognition method according to an embodiment of the present invention is described in detail with reference to FIG. 2. The passenger recognition method described below may be performed by the processor of the passenger recognition apparatus 100 described above.

In step S110, the passenger recognition apparatus 100 may obtain a plurality of frames including at least one head. In an embodiment, the passenger recognition apparatus 100 may obtain a plurality of frames included in a video. For example, the plurality of frames may be frames of a color video captured by a camera installed in a vehicle.

In step S120, the passenger recognition apparatus 100 may input the plurality of frames into a neural network including a plurality of convolutional layers and a plurality of head detection networks. In an embodiment, the passenger recognition apparatus 100 may skip any separate preprocessing of the obtained frames. Feeding the frames into the neural network without preprocessing saves computing resources. In this way, the passenger recognition method according to some embodiments of the present invention can detect heads in real time using only CPU resources. The neural network that detects heads according to some embodiments of the present invention may include a plurality of convolutional layers and a plurality of head detection networks. In an embodiment, the number of convolutional layers and the number of head detection networks in the neural network may be limited in order to bound the amount of computation. For example, the neural network may include eight convolutional layers, five max pooling layers, and a detection network, and the detection network may include three convolutional layers and a detection layer. However, this is only one example configuration, and the number of each type of layer in the neural network according to some embodiments of the present invention is not limited to these values.
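As a rough illustration of how the five max pooling layers in the example configuration shrink the feature map, the sketch below traces the spatial size through the pooling stages. It assumes that every pooling layer uses a stride of 2 and that the convolutional layers preserve spatial size; neither assumption is stated in the disclosure, and the 416-pixel input and the function name are hypothetical.

```python
def spatial_size_after_pooling(input_size: int, num_pool_layers: int, stride: int = 2) -> int:
    """Side length of a square feature map after repeated strided pooling."""
    size = input_size
    for _ in range(num_pool_layers):
        size //= stride  # floor division models pooling without padding
    return size


# Hypothetical 416x416 input passed through five stride-2 pooling layers.
print(spatial_size_after_pooling(416, 5))  # 13
```

Under these assumptions the deepest feature map is 32 times smaller per side than the input frame, which keeps the later layers cheap to evaluate.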

The plurality of convolutional layers described above may include the first convolutional layer and the second convolutional layer described below, and the plurality of head detection networks described above may include the first head detection network and the second head detection network described below.

In step S130, the passenger recognition apparatus 100 may input the first feature map output from the first convolutional layer into the first head detection network. In an embodiment, the passenger recognition apparatus 100 may detect each head included in the plurality of frames using the first head detection network, based on the head features detected by the first convolutional layer. In an embodiment, the passenger recognition apparatus 100 may detect heads and assign an identification value to each head. It may also track the movement of the detected heads using the neural network described above.
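The disclosure does not specify how identification values are carried from frame to frame. Purely for illustration, the sketch below substitutes a simple greedy nearest-center matcher: a detection keeps the identifier of the closest head from the previous frame if it lies within a distance threshold, and otherwise receives a new identifier. The class name `HeadIdAssigner`, the `max_distance` threshold, and the use of head center points are all assumptions.

```python
import math
from itertools import count


class HeadIdAssigner:
    """Greedy nearest-center matching (an assumed stand-in, not the patented tracker)."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance
        self.tracks = {}            # head_id -> last known center (x, y)
        self._next_id = count(1)

    def update(self, centers):
        """Assign an identifier to each detected head center in the current frame."""
        assigned = {}
        unmatched = dict(self.tracks)
        for center in centers:
            best_id, best_dist = None, self.max_distance
            for head_id, last in unmatched.items():
                d = math.dist(center, last)
                if d <= best_dist:
                    best_id, best_dist = head_id, d
            if best_id is None:
                best_id = next(self._next_id)   # unseen head: issue a new identifier
            else:
                del unmatched[best_id]          # each previous head is matched at most once
            assigned[best_id] = center
        self.tracks = assigned                  # heads not seen in this frame are dropped
        return assigned


assigner = HeadIdAssigner()
print(assigner.update([(10, 10), (100, 100)]))
print(assigner.update([(104, 98), (12, 14)]))
```

In this sketch the identifiers follow the heads even though the second frame lists the detections in a different order.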

In step S140, the passenger recognition apparatus 100 may input the second feature map output from the second convolutional layer, together with the first feature map, into the second head detection network. The passenger recognition apparatus 100 may detect each head included in the plurality of frames using the second head detection network, based on the head features detected by the second convolutional layer and the head features detected by the first convolutional layer. In an embodiment, the passenger recognition apparatus 100 may detect heads and assign an identification value to each head. In this case, with respect to the input order of the neural network, the first convolutional layer may be positioned after the second convolutional layer. Accordingly, the head features detected by the first convolutional layer may differ from those detected by the second convolutional layer.

In operation S150, the passenger recognition apparatus 100 may upsample the first feature map in the second head detection network and then perform a convolution operation with the second feature map. The first feature map may be output from the first convolutional layer, and the second feature map may be output from the second convolutional layer. As described above, because the first convolutional layer is positioned after the second convolutional layer with respect to the input order of the neural network, the size of the first feature map may differ from the size of the second feature map. Accordingly, the passenger recognition apparatus 100 according to an embodiment may upsample the first feature map to increase its size and then perform the convolution operation with the second feature map. When heads are detected using a plurality of feature maps output from a plurality of convolutional layers in this way, the detection can draw on a variety of head features, which increases the accuracy of head detection.
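The upsample-then-convolve step of operation S150 can be illustrated with a minimal sketch. The specification does not state how the two feature maps are combined before the convolution, so the channel stacking followed by a 1x1 convolution used here is an assumption, as are the function names, the single-channel maps, and the weights.

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of one channel given as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def conv1x1(channels, weights):
    """1x1 convolution with one output channel: a weighted sum over channels per pixel."""
    height, width = len(channels[0]), len(channels[0][0])
    return [[sum(w * ch[y][x] for w, ch in zip(weights, channels))
             for x in range(width)]
            for y in range(height)]


def fuse(first_map, second_map, weights):
    """Upsample the smaller (deeper) first map to the second map's size, then convolve.

    Assumption: the two maps are stacked as channels before the convolution.
    """
    factor = len(second_map) // len(first_map)
    upsampled = upsample_nearest(first_map, factor)
    return conv1x1([upsampled, second_map], weights)


first_map = [[1.0, 2.0], [3.0, 4.0]]          # hypothetical 2x2 map from the deeper layer
second_map = [[1.0] * 4 for _ in range(4)]    # hypothetical 4x4 map from the shallower layer
fused = fuse(first_map, second_map, weights=[1.0, 0.5])
```

The fused map has the spatial size of the second feature map, which is the property the upsampling is meant to guarantee.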

In operation S160, the passenger recognition apparatus 100 may detect the heads included in the plurality of frames based on the output data of the second head detection network. In another embodiment, when the neural network includes three convolutional layers and three head detection networks, the passenger recognition apparatus 100 may detect heads by inputting the third feature map output from the third convolutional layer, together with the second feature map and the first feature map, to the third head detection network. In this case, in the third head detection network, the passenger recognition apparatus 100 may upsample the convolution result data of the second head detection network to increase its size, and may perform a convolution on that result data and the third feature map.

A method of detecting heads using a neural network according to some embodiments of the present invention is described later with reference to FIGS. 4 and 5.

Hereinafter, a method of recognizing the boarding and alighting of passengers according to an embodiment of the present invention is described in detail with reference to FIG. 3.

In operation S170, the passenger recognition apparatus 100 may determine that a passenger has boarded when the detected head passes through the first line and then passes through the second line. To determine accurately whether a passenger has boarded or alighted, the passenger recognition apparatus 100 according to some embodiments of the present invention may use at least two lines and a designated space between those two lines. Along the direction in which passengers board, the line encountered first may be the first line and the line encountered next may be the second line. Accordingly, the passenger recognition apparatus 100 according to the present embodiment may determine that a passenger has boarded only when the detected head has passed through both the first line and the second line, and has passed through the first line before the second line.

In operation S180, the passenger recognition apparatus 100 may determine that a passenger has alighted when the head passes through the second line and then passes through the first line. The passenger recognition apparatus 100 according to the present embodiment may determine that a passenger has alighted only when the detected head has passed through both the first line and the second line, and has passed through the second line before the first line. In another embodiment, when the boarding area and the alighting area of the means of transportation are different, boarding and alighting may each be determined using two lines and the space between the two lines in the respective area.

In operation S190, the passenger recognition apparatus 100 may withhold the determination of whether a passenger has boarded or alighted when the detected head is located in the designated space between the first line and the second line. When the detected head has passed through the first line but not yet the second line, or has passed through the second line but not yet the first line, the passenger recognition apparatus 100 may withhold the determination. The passenger recognition apparatus 100 may then determine whether the passenger has boarded or alighted based on information about the line that the detected head subsequently passes through.
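The decision rule of operations S170 to S190 can be sketched as a small per-head state machine. This is only an illustration under stated assumptions: each head is reduced to a one-dimensional coordinate along the boarding direction, line 1 lies at a smaller coordinate than line 2, and the class name, method names, and coordinates are hypothetical rather than taken from the specification.

```python
class LineCrossingCounter:
    """Counts boarding and alighting events from per-head positions.

    Assumption: positions are 1-D coordinates along the boarding direction,
    with line 1 at a smaller coordinate than line 2.
    """

    def __init__(self, line1, line2):
        self.lines = {1: line1, 2: line2}
        self.last_pos = {}   # head id -> previous coordinate
        self.history = {}    # head id -> line ids crossed since the last decision
        self.boarded = 0
        self.alighted = 0

    def update(self, head_id, pos):
        """Feed one observation and return 'boarded', 'alighted' or 'pending'."""
        prev = self.last_pos.get(head_id)
        self.last_pos[head_id] = pos
        history = self.history.setdefault(head_id, [])
        if prev is None:
            return "pending"  # first sighting: nothing to compare against
        # A line is crossed when the head changes side between two frames.
        crossed = [(abs(line - prev), line_id)
                   for line_id, line in self.lines.items()
                   if (prev >= line) != (pos >= line)]
        # If both lines are crossed in one step, the nearer one was crossed first.
        history.extend(line_id for _, line_id in sorted(crossed))
        if history[-2:] == [1, 2]:
            history.clear()
            self.boarded += 1
            return "boarded"
        if history[-2:] == [2, 1]:
            history.clear()
            self.alighted += 1
            return "alighted"
        return "pending"  # head is between the lines or has not crossed both


counter = LineCrossingCounter(line1=100, line2=200)
boarding = [counter.update(7, p) for p in (50, 120, 150, 230)]
alighting = [counter.update(9, p) for p in (260, 180, 150, 90)]
turned_back = [counter.update(3, p) for p in (50, 150, 60)]
```

In this sketch, head 3 enters the space between the lines and turns back, so its determination stays pending and neither counter changes, which mirrors the withholding behaviour of operation S190.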

Hereinafter, the structure and operation of a neural network for detecting heads according to an embodiment of the present invention are described in detail with reference to FIGS. 4 and 5.

Referring to FIG. 4, in an embodiment the neural network for detecting heads may include a convolutional network 210 including a plurality of convolutional layers, and a plurality of head detection networks 250. The passenger recognition apparatus may receive input data 10 including a plurality of frames and input it to the convolutional network 210 to extract features of the heads included in the plurality of frames. The passenger recognition apparatus may then detect the heads using the plurality of head detection networks, based on the head features extracted by the convolutional network 210. In an embodiment, the passenger recognition apparatus may pass each feature map output from the plurality of convolutional layers to a respective one of the plurality of head detection networks. For example, the feature map output from the first convolutional layer may be input to the first head detection network, and the feature map output from the second convolutional layer may be input to the second head detection network. In addition, a feature map output from a convolutional layer included in the first head detection network may also be input to the second head detection network. In an embodiment, each head detected through the neural network may be assigned an identification value, and output data 20 may be output that includes a plurality of frames in which a box corresponding to the position of each head is displayed.

In addition, the passenger recognition apparatus according to an embodiment of the present invention may omit batch normalization in the neural network in order to improve network performance. In a typical convolutional network, batch normalization is performed after the convolution operation in a convolutional layer to improve the convergence and stability of training. However, because the passenger recognition apparatus according to some embodiments of the present invention detects heads using limited computing resources, it does not perform batch normalization.

In addition, in an embodiment, the passenger recognition apparatus may perform a group convolution operation in at least one convolutional layer included in the neural network. By performing group convolution, in which the convolution is applied to the input channels separately, the passenger recognition apparatus according to some embodiments of the present invention may reduce the overall number of parameters and thereby enable real-time head detection in a CPU environment.
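The parameter saving from group convolution follows from simple arithmetic: with g groups, each output channel sees only 1/g of the input channels. The sketch below counts weights (bias terms omitted) for a 3 x 3 layer with 128 input and 128 output channels; the group counts are illustrative assumptions, since the specification does not state how many groups are used.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Number of weights in a 2-D convolution layer, ignoring bias terms."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by the number of groups")
    # Each output channel is connected to c_in / groups input channels.
    return (c_in // groups) * c_out * kernel * kernel


standard = conv_params(128, 128, 3)                # ordinary convolution
grouped = conv_params(128, 128, 3, groups=8)       # hypothetical group count
per_channel = conv_params(128, 128, 3, groups=128) # one group per input channel
```

Here `standard` is 147456, `grouped` is 18432, and `per_channel` is 1152, so the weight count shrinks by exactly the number of groups.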

An example of a neural network for detecting heads is described with reference to FIG. 5.

According to an embodiment, the neural network for detecting heads may include a convolutional network 210 including a plurality of convolutional layers, and a plurality of head detection networks 220, 230, and 240. At least one head detection network 220 may be located after the convolutional network 210, and at least one head detection network 230, 240 may be located separately from the convolutional network 210. In an embodiment, the convolutional network 210 may include a first convolutional layer (Conv. 128 x 1 x 1) 213, a second convolutional layer (Conv. 128 x 1 x 1) 212, and a third convolutional layer (Conv. 128 x 1 x 1) 211, and in the example shown in FIG. 5 the first head detection network 220 may be included in the convolutional network 210. In an embodiment, the first head detection network 220 may detect heads based on the first feature map estimated by the first convolutional layer (Conv. 128 x 1 x 1) 213. The first head detection network 220 may include at least one group convolution layer (Group Conv. 128 x 3 x 3) to speed up the computation of the neural network.

In an embodiment, the second head detection network 230 may detect heads based on the second feature map estimated by the second convolutional layer (Conv. 128 x 1 x 1) 212 and on the first feature map. In an embodiment, the size of the first feature map may differ from the size of the second feature map because of the convolution operations performed earlier. Accordingly, the second head detection network according to the present embodiment may upsample the first feature map in an upsample layer (Upsample) so that it has the same size as the second feature map. Like the first head detection network, the second head detection network may also include at least one group convolution layer (Group Conv. 192 x 3 x 3) to reduce the number of parameters.

In an embodiment, the third head detection network 240 may detect heads based on the third feature map estimated by the third convolutional layer (Conv. 128 x 1 x 1) 211 and on the feature map output from the group convolution layer included in the second head detection network 230. In an embodiment, the third head detection network 240 may upsample the feature map output from the group convolution layer included in the second head detection network 230 to increase its size. The third head detection network 240 may then detect heads based on the upsampled feature map and the third feature map estimated by the third convolutional layer (Conv. 128 x 1 x 1) 211.

A method of recognizing the boarding and alighting of passengers according to some embodiments of the present invention is described in detail with reference to FIG. 6.

The passenger recognition apparatus 100 according to an embodiment may determine whether a passenger has boarded or alighted using at least two lines 310 and 320 and a designated space 330 between those two lines. In an embodiment, along the direction in which passengers board, the line encountered first may be the first line 310 and the line encountered next may be the second line 320. Accordingly, the passenger recognition apparatus 100 according to the present embodiment may determine that a passenger has boarded only when the detected head has passed through both the first line 310 and the second line 320, and has passed through the first line 310 before the second line 320. Likewise, the passenger recognition apparatus 100 may determine that a passenger has alighted only when the detected head has passed through both the first line 310 and the second line 320, and has passed through the second line 320 before the first line 310. In another embodiment, when the boarding area and the alighting area of the means of transportation are different, boarding and alighting may be determined using two lines located in each area and the space between the two lines. In addition, when the detected head is located in the designated space between the first line 310 and the second line 320, the passenger recognition apparatus 100 may withhold the determination of whether the passenger has boarded or alighted. When the detected head has passed through the first line 310 but not yet the second line 320, or has passed through the second line 320 but not yet the first line 310, the passenger recognition apparatus 100 may withhold the determination. The passenger recognition apparatus 100 may then determine whether the passenger has boarded or alighted based on information about the line that the head subsequently passes through.

The passenger recognition apparatus 100 according to another embodiment may determine whether a passenger has boarded or alighted using three or more lines and the designated spaces located between those lines. In an embodiment, when the route that must be traversed to board or alight from the means of transportation is long, or when passengers must go through many procedures to board or alight, the passenger recognition apparatus 100 may use three or more lines to determine whether a passenger has boarded or alighted. In this case, the determination may be more accurate than when two lines are used.
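One way the ordering rule might generalize to three or more lines is sketched below. The specification does not give a rule for this case, so the approach is an assumption: lines are numbered 1 to N along the boarding direction, crossing the same line twice in a row is treated as a step forward and back that cancels out, and the function name is hypothetical.

```python
def classify_crossings(crossings, num_lines):
    """Classify one head's sequence of crossed line ids as boarded, alighted or pending.

    Assumption: lines are numbered 1..num_lines along the boarding direction.
    """
    forward = list(range(1, num_lines + 1))
    net = []  # crossings that remain after cancelling back-and-forth movement
    for line_id in crossings:
        if net and net[-1] == line_id:
            net.pop()            # re-crossing the same line undoes the previous crossing
        else:
            net.append(line_id)
    if net == forward:
        return "boarded"
    if net == forward[::-1]:
        return "alighted"
    return "pending"


hesitant_boarding = classify_crossings([1, 2, 2, 2, 3], num_lines=3)
partial = classify_crossings([1, 2], num_lines=3)
```

In this sketch a passenger who steps back across line 2 and then continues is still classified as boarded, while one who has crossed only two of three lines remains pending. In practice the function would be evaluated after each new crossing and the history reset once a determination is made.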

For example, in the means of transportation shown in FIG. 7, passengers may board and alight through a single doorway. As illustrated, in an embodiment, when a passenger boards the means of transportation (302, 303), the head corresponding to that passenger necessarily passes through the first line 310 and then the second line 320. Likewise, in an embodiment, when a passenger alights from the means of transportation (301, 304), the head corresponding to that passenger necessarily passes through the second line 320 and then the first line 310. Accordingly, the method of recognizing boarding and alighting according to some embodiments of the present invention may determine whether a passenger has boarded or alighted based on the order in which the detected head passes through the first line 310 and the second line 320.

In another embodiment, when a passenger remains in the designated space 330 between the first line 310 and the second line 320, the passenger recognition apparatus 100 may withhold the determination of whether the passenger has boarded or alighted, and may then make the determination based on information about the line that the head subsequently passes through.

The embodiments of the present invention described above may be implemented in the form of a computer program that can be executed on a computer through various components, and such a computer program may be recorded on a computer-readable medium. The medium may include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.

Meanwhile, the computer program may be specially designed and configured for the present invention, or may be known and available to those skilled in the field of computer software. Examples of the computer program include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

The specific implementations described in the present invention are examples and do not limit the scope of the present invention in any way. For brevity, descriptions of conventional electronic components, control systems, software, and other functional aspects of those systems may be omitted. In addition, the connecting lines or connecting members between the components shown in the drawings illustrate functional connections and/or physical or circuit connections by way of example; in an actual device they may be replaced by, or supplemented with, various other functional, physical, or circuit connections. Furthermore, unless a component is specifically described with terms such as "essential" or "important", it may not be a necessary component for the application of the present invention.

Although the present invention has been described with reference to the embodiments shown in the drawings, these are merely examples, and those of ordinary skill in the art will understand that various modifications and equivalent other embodiments are possible. Therefore, the true technical scope of protection of the present invention should be determined by the technical spirit of the appended claims.

Claims

A passenger recognition method performed by a computing device, comprising:
acquiring a plurality of frames including at least one head;
inputting the plurality of frames into a neural network including a plurality of convolutional layers and a plurality of head detection networks;
inputting a first feature map output from a first convolutional layer to a first head detection network, and inputting a second feature map output from a second convolutional layer and the first feature map to a second head detection network;
upsampling the first feature map in the second head detection network and then performing a convolution operation with the second feature map, and detecting a head included in the plurality of frames based on output data of the second head detection network; and
determining that a passenger has boarded when the detected head passes through a first line and then passes through a second line, and determining that the passenger has alighted when the head passes through the second line and then passes through the first line,
wherein the plurality of convolutional layers further includes a group convolutional layer,
wherein the first head detection network performs a group convolution operation based on the first feature map, and
wherein the second head detection network upsamples the first feature map and then performs a group convolution operation with the second feature map.

The method of claim 1,
wherein the inputting of the first feature map to the first head detection network and of the second feature map and the first feature map to the second head detection network further comprises inputting a third feature map output from a third convolutional layer, the second feature map, and the first feature map to a third head detection network, and
wherein the detecting of the head further comprises performing, in the third head detection network, a convolution operation between result data of the convolution operation of the second head detection network and the third feature map, and detecting the head included in the plurality of frames based on output data of the third head detection network.

The method of claim 1,
wherein batch normalization is not performed in the neural network.

(Deleted)

The method of claim 1,
further comprising withholding the determination of whether the passenger has boarded or alighted when the detected head is located in a designated space between the first line and the second line.

A passenger recognition apparatus comprising a processor,
wherein the processor is configured to:
acquire a plurality of frames including at least one head;
input the plurality of frames into a neural network including a plurality of convolutional layers and a plurality of head detection networks;
input a first feature map output from a first convolutional layer to a first head detection network, and input a second feature map output from a second convolutional layer and the first feature map to a second head detection network;
upsample the first feature map in the second head detection network and then perform a convolution operation with the second feature map, and detect a head included in the plurality of frames based on output data of the second head detection network; and
determine that a passenger has boarded when the detected head passes through a first line and then passes through a second line, and determine that the passenger has alighted when the head passes through the second line and then passes through the first line,
wherein the plurality of convolutional layers further includes a group convolutional layer,
wherein the first head detection network performs a group convolution operation based on the first feature map, and
wherein the second head detection network upsamples the first feature map and then performs a group convolution operation with the second feature map.

The apparatus of claim 6,
wherein the processor is configured to input a third feature map output from a third convolutional layer, the second feature map, and the first feature map to a third head detection network, perform, in the third head detection network, a convolution operation between result data of the convolution operation of the second head detection network and the third feature map, and detect the head included in the plurality of frames based on output data of the third head detection network.

The apparatus of claim 6,
wherein batch normalization is not performed in the neural network.

(Deleted)

The apparatus of claim 6,
wherein the processor is configured to withhold the determination of whether the passenger has boarded or alighted when the detected head is located in a designated space between the first line and the second line.

A computer program stored in a computer-readable recording medium for executing the method according to any one of claims 1 to 3 and 5.