KR102580715B1

KR102580715B1 - Apparatus and method for counting number of visitors based on deep learning model

Info

Publication number: KR102580715B1
Application number: KR1020210109684A
Authority: KR
Inventors: 김정준; 김주현; 김민규; 이만기; 김경호; 손동섭
Original assignee: 한국로봇융합연구원
Priority date: 2021-08-19
Filing date: 2021-08-19
Publication date: 2023-09-19
Also published as: KR20230027655A

Abstract

A method for counting the number of visitors based on an artificial neural network includes: a detection unit sequentially inputting a plurality of frames of a captured video into a learning model; the learning model computing, for the plurality of frames and through a plurality of operations to which weights learned between a plurality of layers are applied, an output value that includes the center coordinates, width, and height of an area box containing an object, the confidence of the area box, and the probability that the object in the area box is a human object; the detection unit selecting, in the plurality of frames, area boxes whose confidence is at or above a predetermined value and whose human-object probability is at or above a predetermined value, thereby detecting one or more human objects in the frames by means of the area boxes; a count unit setting a counter line that separates an entrance area from an exit area in the video comprising the plurality of frames; and the count unit counting the number of visitors according to the position, relative to the counter line, of each human object detected through an area box in the plurality of frames.

Description

Apparatus and method for counting number of visitors based on deep learning model

The present invention relates to technology for counting visitors and, more specifically, to an apparatus for counting the number of visitors based on a learning model and a method therefor.

Korea has a variety of exhibition halls, theme parks, and similar venues, but analysis of how interested visitors are in the exhibits and content actually on display is largely limited to surveys.

Korean Patent Laid-Open Publication No. 2015-0018121 (published February 23, 2015)

An object of the present invention is to provide an apparatus for counting the number of visitors based on a learning model and a method therefor.

To achieve the above object, a method for counting the number of visitors based on an artificial neural network according to a preferred embodiment of the present invention includes: a detection unit sequentially inputting a plurality of frames of a captured video into a learning model; the learning model computing, for the plurality of frames and through a plurality of operations to which weights learned between a plurality of layers are applied, an output value that includes the center coordinates, width, and height of an area box containing an object, the confidence of the area box, and the probability that the object in the area box is a human object; the detection unit selecting, in the plurality of frames, area boxes whose confidence is at or above a predetermined value and whose human-object probability is at or above a predetermined value, thereby detecting one or more human objects in the frames by means of the area boxes; a count unit setting a counter line that separates an entrance area from an exit area in the video comprising the plurality of frames; and the count unit counting the number of visitors according to the position, relative to the counter line, of each human object detected through an area box in the plurality of frames.

The step of counting the number of visitors includes: the count unit assigning an identifier and an initial flag value to a human object according to the area, among the entrance area and the exit area separated by the counter line (CL), in which the human object is first detected in a preceding frame that is relatively earlier in time; the count unit assigning a changed flag value to the human object to which the identifier has been assigned according to the area, among the entrance area and the exit area separated by the counter line, in which that human object is located in a subsequent frame that is relatively later in time; and the count unit comparing the initial flag value with the changed flag value and counting, as visitors, the human objects that have crossed the counter line from the exit area into the entrance area.

In the step of counting the number of visitors, the count unit derives the area in which the human object is located by comparing the center coordinates of the area box with the coordinates of two or more points on the counter line.
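One way to read the comparison described above is as a side-of-line test. The sketch below is only an illustration under stated assumptions: the counter line is taken to be straight and defined by two points, the sign of a 2D cross product decides the side, and the names `side_of_line`, `area_of`, and `entrance_sign` are hypothetical. The patent does not specify the actual comparison.

```python
from typing import Tuple

Point = Tuple[float, float]


def side_of_line(center: Point, p1: Point, p2: Point) -> int:
    """Return +1, -1, or 0 for the side of the line p1 -> p2 on which center lies."""
    # 2D cross product of (p2 - p1) and (center - p1); its sign gives the side.
    cross = (p2[0] - p1[0]) * (center[1] - p1[1]) - (p2[1] - p1[1]) * (center[0] - p1[0])
    if cross > 0:
        return 1
    if cross < 0:
        return -1
    return 0


def area_of(center: Point, p1: Point, p2: Point, entrance_sign: int = 1) -> str:
    """Map the side to an area name; which side is the entrance is an assumption.

    A center lying exactly on the line is treated as the exit area here.
    """
    return "entrance" if side_of_line(center, p1, p2) == entrance_sign else "exit"


# Counter line drawn horizontally at y = 100 in image coordinates.
line_start, line_end = (0.0, 100.0), (200.0, 100.0)
print(area_of((50.0, 150.0), line_start, line_end))
print(area_of((50.0, 50.0), line_start, line_end))
```

Because only the sign is used, the test works for a counter line at any angle, and flipping `entrance_sign` swaps which side counts as the entrance.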

To achieve the above object, an apparatus for counting the number of visitors based on an artificial neural network according to a preferred embodiment of the present invention includes a detection unit and a count unit. When a learning model computes, for a plurality of frames of a captured video and through a plurality of operations to which weights learned between a plurality of layers are applied, an output value that includes the center coordinates, width, and height of an area box containing an object, the confidence of the area box, and the probability that the object in the area box is a human object, the detection unit selects, in the plurality of frames, area boxes whose confidence is at or above a predetermined value and whose human-object probability is at or above a predetermined value, thereby detecting one or more human objects in the frames by means of the area boxes. The count unit sets a counter line that separates an entrance area from an exit area in the video comprising the plurality of frames, and counts the number of visitors according to the position, relative to the counter line, of each human object detected through an area box in the plurality of frames.

The count unit assigns an identifier and an initial flag value to a human object according to the area, among the entrance area and the exit area separated by the counter line (CL), in which the human object is first detected in a preceding frame that is relatively earlier in time; assigns a changed flag value to the human object to which the identifier has been assigned according to the area, among the entrance area and the exit area separated by the counter line, in which that human object is located in a subsequent frame that is relatively later in time; and compares the initial flag value with the changed flag value to count, as visitors, the human objects that have crossed the counter line from the exit area into the entrance area.
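The flag comparison described above can be sketched as a small state machine. This is a minimal illustration, not the patented implementation: it assumes that some tracker (not described in the text) already supplies a stable identifier per person across frames, and it counts each identifier at most once, which is also an assumption. The class and method names are hypothetical.

```python
ENTRANCE, EXIT = "entrance", "exit"


class VisitorCounter:
    """Counts objects whose initial flag is EXIT and whose changed flag is ENTRANCE."""

    def __init__(self) -> None:
        self.initial_flag = {}   # identifier -> area at first detection
        self.counted = set()     # identifiers already counted (assumption: count once)
        self.visitors = 0

    def update(self, object_id: int, area: str) -> None:
        # First detection: store the initial flag value and stop.
        if object_id not in self.initial_flag:
            self.initial_flag[object_id] = area
            return
        # Later frames: the current area plays the role of the changed flag value.
        changed_flag = area
        crossed_in = self.initial_flag[object_id] == EXIT and changed_flag == ENTRANCE
        if crossed_in and object_id not in self.counted:
            self.counted.add(object_id)
            self.visitors += 1


counter = VisitorCounter()
for object_id, area in [(1, EXIT), (1, EXIT), (1, ENTRANCE), (1, ENTRANCE),
                        (2, ENTRANCE), (2, EXIT)]:
    counter.update(object_id, area)
print(counter.visitors)
```

Object 1 starts in the exit area and later appears in the entrance area, so it is counted; object 2 moves in the opposite direction and is not.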

The count unit derives the area in which the human object is located by comparing the center coordinates of the area box with the coordinates of two or more points on the counter line.

The present invention designates the exhibits, content, and the like to be surveyed by means of a counting line that can specify the entry and exit direction within the video, and presents the paths visitors took and the number of visitors as quantitative data. This makes it possible to identify exhibits or spaces that attract little visitor interest and, when exhibits need to be replaced, provides data on which exhibits should be changed.

Figure 1 is a diagram illustrating the configuration of a system for counting the number of visitors based on a learning model according to an embodiment of the present invention.
Figure 2 is a diagram for explaining the configuration of an edge device for counting the number of visitors based on a learning model according to an embodiment of the present invention.
Figure 3 is a diagram for explaining the detailed configuration of an edge device for counting the number of visitors based on a learning model according to an embodiment of the present invention.
Figure 4 is a diagram for explaining a learning model for counting the number of visitors according to an embodiment of the present invention.
Figure 5 is an example screen to explain how a learning model detects a human object according to an embodiment of the present invention.
Figure 6 is a flowchart illustrating a learning method for a learning model according to an embodiment of the present invention.
Figure 7 is a flow chart to explain a method for counting the number of visitors based on a learning model.
Figures 8 and 9 are example screens to explain a method for counting the number of visitors based on a learning model.
Figure 10 is a diagram showing a computing device according to an embodiment of the present invention.

Prior to the detailed description of the present invention, the terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. They should be interpreted with meanings and concepts consistent with the technical idea of the present invention, based on the principle that an inventor may appropriately define the concepts of terms in order to describe his or her invention in the best way. Accordingly, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent the whole of its technical idea, so it should be understood that various equivalents and modifications capable of replacing them may exist at the time of filing this application.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. Note that identical components are denoted by identical reference numerals in the drawings wherever possible. Detailed descriptions of well-known functions and configurations that could obscure the gist of the present invention are omitted. For the same reason, some components in the accompanying drawings are exaggerated, omitted, or shown schematically, and the size of each component does not entirely reflect its actual size.

First, the configuration of a system for counting the number of visitors based on a learning model according to an embodiment of the present invention will be described. Figure 1 is a diagram illustrating the configuration of a system for counting the number of visitors based on a learning model according to an embodiment of the present invention. Referring to Figure 1, the system for counting the number of visitors based on a learning model (hereinafter abbreviated as the 'counting system') includes a plurality of edge devices 10 and a management server 20.

The plurality of edge devices 10 are installed in different locations. Each edge device counts the number of visitors at its installed location based on the learning model and transmits the counting result to the management server 20. The management server 20 can then provide statistical data on the number of visitors based on the counting results.

Next, the edge device for counting the number of visitors based on a learning model according to an embodiment of the present invention will be described in more detail. Figure 2 is a diagram for explaining the configuration of an edge device for counting the number of visitors based on a learning model according to an embodiment of the present invention. Figure 3 is a diagram for explaining the detailed configuration of an edge device for counting the number of visitors based on a learning model according to an embodiment of the present invention. Figure 4 is a diagram for explaining a learning model for counting the number of visitors according to an embodiment of the present invention. Figure 5 is an example screen for explaining how a learning model detects a human object according to an embodiment of the present invention.

Referring to Figure 2, the edge device 10 includes a communication unit 11, a camera unit 12, a storage unit 13, and a control unit 14.

The communication unit 11 is for communicating with the management server 20 and with other devices, for example a remote device (not shown) connected to the edge device 10. The communication unit 11 may include an RF (Radio Frequency) transmitter (Tx) that up-converts and amplifies the frequency of a transmitted signal, and an RF receiver (Rx) that low-noise amplifies a received signal and down-converts its frequency. The communication unit 11 may also include a modem that modulates the transmitted signal and demodulates the received signal. The communication unit 11 may receive, from the management server 20 or the remote device (not shown), an input for setting a counting line according to an embodiment of the present invention. Under the control of the control unit 14, the communication unit 11 may transmit the counting result according to an embodiment of the present invention to the management server 20.

The camera unit 12 is for capturing an image comprising a plurality of frames, that is, a video. The camera unit 12 may include a lens and an image sensor. The image sensor receives light reflected from a subject and converts it into an electrical signal. The image sensor may be implemented based on a CCD (Charge-Coupled Device), CMOS (Complementary Metal-Oxide Semiconductor), or the like. The camera unit 12 may further include one or more analog-to-digital converters, which convert the electrical signal output from the image sensor into a digital sequence and output it to the control unit 14.

The storage unit 13 stores the programs and data necessary for the operation of the edge device 10. In particular, the storage unit 13 can store the temporary data needed to count human objects according to an embodiment of the present invention, the resulting counting data, and the like. The various data stored in the storage unit 13 can be deleted, changed, or added to according to the operation of a user of the edge device 10.

The control unit 14 controls the overall operation of the edge device 10 and the signal flow between the internal blocks of the edge device 10, and may perform a data processing function. The control unit 14 also basically controls the various functions of the edge device 10. Examples of the control unit 14 include a CPU (Central Processing Unit), a BP (baseband processor), an AP (application processor), a GPU (Graphics Processing Unit), and a DSP (Digital Signal Processor). Referring to Figure 3, the control unit 14 includes a learning unit 100, a detection unit 200, and a count unit 300. The operation of the control unit 14, including the learning unit 100, the detection unit 200, and the count unit 300, is described in more detail below.

As shown in Figure 4, the learning model (LM: machine learning model or deep learning model) includes a plurality of layers, each of which performs a plurality of operations. Each operation result of the plurality of operation modules in one layer is weighted and passed to the next layer. That is, weights are applied to the operation results of the current layer, and the weighted results are input to the operations of the next layer. In other words, the learning model (LM) performs a plurality of operations to which the weights of a plurality of layers are applied.

The plurality of layers includes a convolution layer (CVL) that performs a convolution operation, a pooling layer (PLL) that performs a down-sampling or up-sampling operation, a fully connected layer (FCL) that performs an operation by means of an activation function, and the like. The convolution, down-sampling, and up-sampling operations each use a kernel consisting of a predetermined matrix, and the values of the elements of that matrix are the weights (w). Examples of the activation function include Sigmoid, hyperbolic tangent (tanh), ELU (Exponential Linear Unit), ReLU (Rectified Linear Unit), Leaky ReLU, Maxout, Minout, and Softmax. Examples of the learning model (LM) include models to which algorithms such as YOLO (You Only Look Once), YOLOv2, YOLO9000, and YOLOv3 are applied.

When a frame of the video is input, the learning model (LM) performs a plurality of operations to which the weights of the plurality of layers are applied and, as shown in Figure 5, divides the frame into a plurality of cells. For each cell it computes and outputs, as the output value (out): one or more area boxes (BB: Bounding Box), each having center coordinates (x, y), a width (w), and a height (h); a confidence indicating the probability that the area box (BB) contains an object and that the region of the area box (BB) coincides with the region occupied by the human object (obj); and the probability that the object in the area box (BB) is a human object (obj) (for example, human=0.9161).

The present invention counts the number of visitors based on a learning model, which requires that the learning model be trained. The learning unit 100 is for training the learning model (LM). The learning unit 100 trains the learning model (LM) so that it can detect and identify a human object (obj) by means of an area box (BB), as shown in Figure 5. This method is described below. Figure 6 is a flowchart for explaining a training method for a learning model according to an embodiment of the present invention.

Referring to Figure 6, the learning unit 100 prepares training data in step S110. The training data includes a plurality of training frames. Each training frame includes a ground-truth box that specifies the region occupied by a human object (obj). Once the training data is prepared, the learning unit 100 inputs a training frame to the initialized learning model (LM) in step S120. The learning model (LM) then computes and outputs an output value in step S130 through a plurality of operations to which the weights of the plurality of layers are applied. The output value includes, for each of the plurality of cells of the image: a plurality of area boxes (BB), each having center coordinates (x, y), a width (w), and a height (h); a confidence (0 to 1) indicating the degree to which the region of the area box (BB) matches the ground-truth box that fully (100%) contains the human object (obj); and the probability that the object in the area box (BB) is a human object (obj) (for example, human = 0.8711).
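The text does not say how the degree of match between an area box and the ground-truth box is measured. In YOLO-style detectors it is commonly measured by intersection over union (IoU), so the following sketch assumes IoU; the function name `iou` and the box layout `(x_center, y_center, width, height)` are illustrative choices.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_center, y_center, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two center-format boxes; 1.0 means a perfect match."""
    # Convert center format to corner coordinates.
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    # The overlap is empty when the boxes do not intersect along either axis.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


predicted = (1.0, 0.0, 2.0, 2.0)
ground_truth = (0.0, 0.0, 2.0, 2.0)
print(iou(predicted, ground_truth))
```

Under this assumption, a confidence near 1 corresponds to a predicted box that almost exactly overlaps the ground-truth box, and a confidence near 0 to a box with little or no overlap.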

Based on the output value of the learning model (LM), the learning unit 100 may derive a loss value according to a loss function in step S140. For example, the loss function is given by Equation 1 below.
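The image of Equation 1 did not survive extraction. The symbol definitions that follow (a coordinate weight of 5, a no-object weight of 0.5, per-cell and per-box object indicators, and five terms grouped into coordinate, confidence, and classification losses) match the sum-of-squared-errors loss of the original YOLO formulation, so a plausible reconstruction is given here. It is a sketch, not the patent's verbatim equation: S is taken as the number of cells, as the text defines it, and a hat marks the ground-truth counterpart of a predicted quantity, which is a notational assumption.

```latex
\begin{aligned}
L ={}& \lambda_{\text{coord}} \sum_{i=1}^{S} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
      \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
  &+ \lambda_{\text{coord}} \sum_{i=1}^{S} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
      \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
  &+ \sum_{i=1}^{S} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
  &+ \lambda_{\text{noobj}} \sum_{i=1}^{S} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
  &+ \sum_{i=1}^{S} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The original YOLO formulation writes the outer sums over the S² cells of an S×S grid; the index range above follows the patent's own definition of S instead.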

S denotes the number of cells; Figure 5 shows 12 cells. C denotes the confidence score. B denotes the number of area boxes (BB) in one cell. $p_i(c)$ denotes the probability that the object in the i-th cell belongs to class c; the present invention uses only one class, namely the probability of being a human object. Here, i is a parameter indicating the cell in which an object exists, and j is a parameter indicating the predicted area box (BB). In addition, x and y denote the center coordinates of the area box (BB), and w and h denote the width and height of the area box, respectively.

$\lambda_{\text{coord}}$ is intended to give more weight to the values of the area box (BB) variables; it is a parameter for balancing the loss on the center coordinates, width, and height (x, y, w, h) of the area box against the other losses.

$\lambda_{\text{noobj}}$ is intended to give more weight to the values of the area box (BB) variables and less weight to the values of regions without objects. That is, $\lambda_{\text{noobj}}$ is a parameter for balancing area boxes (BB) that contain an object against area boxes (BB) that do not. Here, $\lambda_{\text{coord}} = 5$ and $\lambda_{\text{noobj}} = 0.5$ may be used.

$\mathbb{1}_{i}^{\text{obj}}$ is 1 if there is an object in cell i, and 0 otherwise.

$\mathbb{1}_{ij}^{\text{obj}}$ is 1 if there is an object in area box j of cell i, and 0 otherwise.

$\mathbb{1}_{ij}^{\text{noobj}}$ is 1 if there is no object in area box j of cell i, and 0 if there is one.

Looking at the loss function of Equation 1, its first and second terms are given by Equation 2 below.
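The image of Equation 2 is missing from this text. Assuming the standard YOLO sum-of-squared-errors form, with $\lambda_{\text{coord}}$ as the coordinate weight, $\mathbb{1}_{ij}^{\text{obj}}$ as the indicator that area box j of cell i contains an object, and hats marking ground-truth values (a notational assumption), the coordinate terms would read:

```latex
\lambda_{\text{coord}} \sum_{i=1}^{S} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
  \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right]
+ \lambda_{\text{coord}} \sum_{i=1}^{S} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
  \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right]
```

The square roots on width and height make a given absolute error weigh more for small boxes than for large ones.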

Equation 2 computes the coordinate loss, which represents the difference between the center coordinates, width, and height (x, y, w, h) of the area box and the region occupied by the human object (obj).

In addition, the third and fourth terms of Equation 1 are given by Equation 3 below.
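The image of Equation 3 is also missing. Under the same assumptions (standard YOLO form, C as the confidence score, $\lambda_{\text{noobj}}$ as the no-object weight, $\mathbb{1}_{ij}^{\text{obj}}$ and $\mathbb{1}_{ij}^{\text{noobj}}$ as the object and no-object indicators for area box j of cell i, hats marking ground-truth values), the confidence terms would read:

```latex
\sum_{i=1}^{S} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2
+ \lambda_{\text{noobj}} \sum_{i=1}^{S} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2
```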

Equation 3 computes the confidence loss, which represents the difference between the region of the area box (BB) and the ground-truth box that fully (100%) contains the region occupied by the human object (obj).

Finally, the last term of Equation 1 is given by Equation 4 below.
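The image of Equation 4 is missing as well. Under the same assumptions (standard YOLO form, $\mathbb{1}_{i}^{\text{obj}}$ as the indicator that cell i contains an object, $p_i(c)$ as the class probability, hats marking ground-truth values), the classification term would read:

```latex
\sum_{i=1}^{S} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
```

Since the text states that only one class (human object) is used, the inner sum reduces to a single squared difference per cell.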

Equation 4 computes the classification loss, which represents the difference between the class predicted for the object in the area box (BB), that is, a human object (obj), and the class of the actual object.

After the loss values, that is, the coordinate loss, confidence loss, and classification loss, are computed through the loss function in step S140, the learning unit 100 optimizes the weights of the learning model (LM) in step S150 so that the coordinate loss, confidence loss, and classification loss are minimized.

Next, the learning unit 100 determines in step S160 whether a training completion condition is satisfied. The learning unit 100 runs the entire learning model (LM) on an evaluation data set drawn from the training data and, if the loss value computed from the output of the learning model (LM) is within a predetermined value, may determine that the training completion condition is satisfied.

If the determination in step S160 is that the training completion condition is not satisfied, the learning unit 100 repeats steps S120 through S160 described above. If the determination in step S160 is that the training completion condition is satisfied, training ends in step S170. This completes a learning model (LM) with learned parameters, that is, weights.
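The control flow of steps S120 to S170 can be sketched as a loop with an evaluation-loss stopping condition. This is an illustration only: `train_until_converged`, its parameters, and the `max_rounds` safety cap are hypothetical, and a one-parameter toy problem stands in for the learning model so that the example runs on its own.

```python
from typing import Callable, Tuple


def train_until_converged(
    train_step: Callable[[], None],
    eval_loss: Callable[[], float],
    loss_threshold: float,
    max_rounds: int = 1000,
) -> Tuple[int, bool]:
    """Repeat training until the evaluation loss is within the threshold.

    Returns (rounds_run, converged). The max_rounds cap is an added safeguard,
    not something the text describes.
    """
    for round_index in range(1, max_rounds + 1):
        train_step()                        # S120-S150: forward pass, loss, weight update
        if eval_loss() <= loss_threshold:   # S160: completion condition
            return round_index, True        # S170: end of training
    return max_rounds, False


# Toy stand-in for the model: one weight w, loss (w - 3)^2, plain gradient descent.
state = {"w": 0.0}


def toy_step() -> None:
    state["w"] -= 0.1 * 2.0 * (state["w"] - 3.0)


def toy_loss() -> float:
    return (state["w"] - 3.0) ** 2


rounds, converged = train_until_converged(toy_step, toy_loss, loss_threshold=0.01)
print(rounds, converged)
```

In a real setting `train_step` would run the detector on training frames and update its weights, and `eval_loss` would compute the combined coordinate, confidence, and classification loss on the evaluation set.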

Next, a method for counting the number of visitors based on the learning model trained as described above is described. FIG. 7 is a flowchart illustrating the method for counting the number of visitors based on the learning model. FIGS. 8 and 9 are example screens illustrating the method for counting the number of visitors based on the learning model.

Referring to FIG. 7, in step S210 the detection unit 200 of the edge device 10 captures an image through the camera unit 12 and sequentially inputs a plurality of frames of the captured image into the learning model (LM).

Then, in step S220, the learning model (LM) calculates an output value for the input frames through a plurality of operations in which the weights learned between the plurality of layers are applied. The output value includes the center coordinates (x, y), width (w), and height (h) of an area box (BB) containing an object, the reliability of the area box (BB), and the probability that the object in the area box (BB) is a human object (obj).

In step S230, the detection unit 200 selects, in the plurality of frames, the area boxes (BB) whose reliability is at least a predetermined value and whose probability of being a human object is at least a predetermined value, thereby detecting one or more human objects (obj) included in the plurality of frames through the area boxes (BB).
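The selection in step S230 amounts to a two-threshold filter over the output values of the learning model. The sketch below is illustrative only: the `AreaBox` type, the function name `select_person_boxes`, and the threshold value 0.5 are assumptions, since the text specifies only "a predetermined value".

```python
from dataclasses import dataclass


@dataclass
class AreaBox:
    x: float            # center x coordinate
    y: float            # center y coordinate
    w: float            # width
    h: float            # height
    reliability: float  # reliability of the area box
    p_person: float     # probability that the object is a human object


def select_person_boxes(boxes, min_reliability=0.5, min_p_person=0.5):
    """Keep boxes whose reliability and person probability both meet the thresholds."""
    return [
        b for b in boxes
        if b.reliability >= min_reliability and b.p_person >= min_p_person
    ]


boxes = [
    AreaBox(120, 200, 40, 90, reliability=0.92, p_person=0.97),
    AreaBox(300, 210, 45, 95, reliability=0.30, p_person=0.99),  # low reliability
    AreaBox(410, 190, 60, 60, reliability=0.88, p_person=0.10),  # unlikely to be a person
]
detected = select_person_boxes(boxes)  # only the first box passes both thresholds
```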

In step S240, the count unit 300 sets, in the image including the plurality of frames, a counter line (CL) that divides an entrance area and an exit area. FIGS. 8 and 9 show example frame screens according to an embodiment of the present invention. The frame of FIG. 8 is a preceding frame that comes before the frame of FIG. 9 in time; in other words, the frame of FIG. 9 is a subsequent frame that follows the frame of FIG. 8 in time. As shown, the counter line (CL) is set at the same position in the plurality of frames included in the image and divides the image into two areas. The area to the left of the counter line (CL) may be set as the entrance area and the area to the right as the exit area, or vice versa. The counter line (CL) dividing the entrance area and the exit area may be set according to user input. For example, a user of the management server 20 may transmit an input for setting the counter line (CL) to the edge device 10. Alternatively, the user may transmit such an input to the edge device 10 through a remote device (not shown). Upon receiving this input, the count unit 300 may set the counter line (CL) that divides the entrance area and the exit area in the image including the plurality of frames.

Once the counter line (CL) is set as described above, in step S250 the count unit 300 counts the number of visitors according to the position, relative to the counter line (CL), of the human objects (obj) detected through the area boxes (BB) in the plurality of frames. Step S250 is described in more detail below.

First, in a preceding frame that is relatively earlier in time, the count unit 300 assigns an identifier and an initial flag value to each first-detected human object according to which of the entrance area and the exit area, as divided by the counter line (CL), the human object is located in. For example, assume that in the image the left side of the counter line (CL) is the entrance of the exhibition hall and the right side is the exit. Accordingly, in FIGS. 8 and 9, the left side of the counter line (CL) is set as the entrance area (A) and the right side as the exit area (B). Also assume that, as shown, four human objects are detected through four area boxes (BB1, BB2, BB3, BB4) in the preceding frame of FIG. 8. In this case, the count unit 300 assigns identifiers (ID1, ID2, ID3, ID4) to the four human objects. The count unit 300 also assigns the initial flag value '-1' to the first and second human objects (ID1, ID2), first detected in the entrance area (A) through the first and second area boxes (BB1, BB2), and the initial flag value '1' to the third and fourth human objects (ID3, ID4), first detected in the exit area (B) through the third and fourth area boxes (BB3, BB4). Here, the change frames of the first to fourth human objects (ID1, ID2, ID3, ID4) are all set to '0'.

Next, in a subsequent frame that is relatively later in time, the count unit 300 assigns a change flag value to each human object that has been assigned an identifier, according to which of the entrance area and the exit area, as divided by the counter line (CL), the human object is located in. As before, in FIG. 9 the left side of the counter line (CL) is set as the entrance area (A) and the right side as the exit area (B). Assume that, as shown in the subsequent frame of FIG. 9, the human objects (ID1, ID2, ID3, ID4) that were assigned identifiers in the preceding frame are detected through the four area boxes (BB1, BB2, BB3, BB4). In this case, the count unit 300 assigns the change flag value '-1' to the first, third, and fourth human objects (ID1, ID3, ID4), detected in the entrance area (A) through the first, third, and fourth area boxes (BB1, BB3, BB4), and the change flag value '1' to the second human object (ID2), detected in the exit area (B) through the second area box (BB2).

Accordingly, the count unit 300 compares the initial flag value with the change flag value and counts, as visitors, the human objects that have passed the counter line (CL) from the exit area (B) to the entrance area (A). For example, comparing the initial flag value and the change flag value of the first human object (ID1) shows that it has stayed in the entrance area (A) without passing the counter line (CL), so it is not counted as a visitor. The second human object (ID2) has passed the counter line (CL) from the entrance area (A) to the exit area (B), so it is not counted as a visitor either. The third and fourth human objects (ID3, ID4) have passed the counter line (CL) from the exit area (B) to the entrance area (A), so they can be counted as visitors.
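The flag comparison can be illustrated with the values of the example above. The following is a minimal sketch, not the implementation of the count unit 300: the dictionaries keyed by identifier and the function name `count_visitors` are assumptions, while the flag values '-1' (entrance area A) and '1' (exit area B) follow the example of FIGS. 8 and 9.

```python
ENTRANCE_FLAG = -1  # flag value for the entrance area (A), as in the example
EXIT_FLAG = 1       # flag value for the exit area (B), as in the example


def count_visitors(initial_flags, change_flags):
    """Count identifiers whose flag changed from exit (1) to entrance (-1)."""
    visitors = 0
    for obj_id, initial in initial_flags.items():
        change = change_flags.get(obj_id)  # None if not seen in the later frame
        if initial == EXIT_FLAG and change == ENTRANCE_FLAG:
            visitors += 1
    return visitors


# Flag values from the example: FIG. 8 (preceding frame) and FIG. 9 (subsequent frame).
initial_flags = {"ID1": -1, "ID2": -1, "ID3": 1, "ID4": 1}
change_flags = {"ID1": -1, "ID2": 1, "ID3": -1, "ID4": -1}
print(count_visitors(initial_flags, change_flags))  # prints 2 (ID3 and ID4)
```

ID1 (flags -1, -1) stays in the entrance area and ID2 (flags -1, 1) crosses in the opposite direction, so neither contributes to the count.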

Meanwhile, when the area in which a human object is located is derived in step S250, it may be derived by comparing the center coordinates of the area box (BB) that specifies the human object, for example (x1, y1), (x2, y2), (x3, y3), or (x4, y4), with the coordinates of two or more points (a, b, c) on the counter line (CL).
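The text states only that the center coordinates are compared with points on the counter line. One common way to make that comparison, shown here as an assumption rather than as the method of the disclosure, is the sign of the 2D cross product between the line direction and the vector from a line point to the box center. The sketch uses two points `a` and `b`; the function names and the example coordinates are hypothetical.

```python
def side_of_line(center, a, b):
    """Sign of the 2D cross product (b - a) x (center - a): 1, -1, or 0 on the line."""
    (cx, cy), (ax, ay), (bx, by) = center, a, b
    cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    return (cross > 0) - (cross < 0)


def region_of(center, a, b, positive="A", negative="B"):
    """Map the side of the counter line to a region label."""
    s = side_of_line(center, a, b)
    if s == 0:
        return None  # exactly on the counter line; handling is left open here
    return positive if s > 0 else negative


# Vertical counter line in image coordinates (x to the right, y downward),
# given from its top point a to its bottom point b.
a, b = (320, 0), (320, 480)
left_region = region_of((100, 240), a, b)   # center left of the line: "A"
right_region = region_of((500, 240), a, b)  # center right of the line: "B"
```

Which sign corresponds to the entrance area depends on the order of the two points and on the image coordinate convention, so the `positive` and `negative` labels are left configurable.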

After counting the number of visitors, in step S260 the count unit 300 may transmit the counted number of visitors to the management server 20 through the communication unit 11, together with data such as the identifiers, initial flag values, and change flag values described above. The management server 20 can then present the paths along which visitors moved and the number of visitors as quantitative data. This makes it possible to identify exhibits or spaces that attract little visitor interest and, when exhibits need to be replaced, to provide data on which exhibits should be changed.
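The data sent in step S260 could be bundled as in the sketch below. The disclosure does not specify a message format, so the JSON layout, the field names, and the function name `build_report` are all hypothetical; only the kinds of data (identifier, initial flag value, change flag value, visitor count) come from the text.

```python
import json


def build_report(initial_flags, change_flags, visitor_count):
    """Bundle per-identifier flag values and the visitor count into one JSON string."""
    objects = [
        {
            "id": obj_id,
            "initial_flag": initial_flags[obj_id],
            "change_flag": change_flags.get(obj_id),  # None if not yet assigned
        }
        for obj_id in sorted(initial_flags)
    ]
    return json.dumps({"visitor_count": visitor_count, "objects": objects})


# Hypothetical report for one human object that crossed from exit to entrance.
report = build_report({"ID3": 1}, {"ID3": -1}, 1)
```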

FIG. 10 is a diagram showing a computing device according to an embodiment of the present invention. The computing device TN100 of FIG. 10 may be any of the devices described in this specification, for example the edge device 10 or the management server 20.

In the embodiment of FIG. 10, the computing device TN100 may include at least one processor TN110, a transceiver TN120, and a memory TN130. The computing device TN100 may further include a storage device TN140, an input interface device TN150, an output interface device TN160, and the like. The components included in the computing device TN100 may be connected by a bus TN170 and communicate with each other.

The processor TN110 may execute program commands stored in at least one of the memory TN130 and the storage device TN140. The processor TN110 may be a central processing unit (CPU), a graphics processing unit (GPU), or a dedicated processor on which the methods according to embodiments of the present invention are performed. The processor TN110 may be configured to implement the procedures, functions, and methods described in connection with embodiments of the present invention. The processor TN110 may control each component of the computing device TN100.

Each of the memory TN130 and the storage device TN140 may store various information related to the operation of the processor TN110. Each of the memory TN130 and the storage device TN140 may consist of at least one of a volatile storage medium and a non-volatile storage medium. For example, the memory TN130 may consist of at least one of read only memory (ROM) and random access memory (RAM).

The transceiver TN120 may transmit or receive wired or wireless signals. The transceiver TN120 may be connected to a network to perform communication.

Meanwhile, the method according to the embodiments of the present invention described above may be implemented as a program readable by various computer means and recorded on a computer-readable recording medium. The recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the recording medium may be specially designed and constructed for the present invention, or may be known and available to those skilled in the art of computer software. Examples of the recording medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine language code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. Such a hardware device may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

The present invention has been described above using several preferred embodiments, but these embodiments are illustrative and not limiting. Those of ordinary skill in the art to which the present invention pertains will understand that various changes and modifications can be made under the doctrine of equivalents without departing from the spirit of the invention and the scope of rights set forth in the appended claims.

10: Edge device 11: Communication unit
12: Camera unit 13: Storage unit
14: Control unit 20: Management server
100: Learning unit 200: Detection unit
300: Count unit

Claims

A method for counting the number of visitors based on an artificial neural network, the method comprising:
sequentially inputting, by a detection unit, a plurality of frames of a captured image into a learning model;
calculating, by the learning model, through a plurality of operations in which weights learned between a plurality of layers are applied to the plurality of frames, an output value including the center coordinates, width, and height of an area box containing an object, the reliability of the area box, and the probability that the object in the area box is a human object;
selecting, by the detection unit, from the plurality of frames, an area box whose reliability is at least a predetermined value and whose probability of being the human object is at least a predetermined value, and detecting one or more human objects included in the plurality of frames through the area box;
setting, by a count unit, a counter line dividing an entrance area and an exit area in an image including the plurality of frames; and
counting, by the count unit, the number of visitors according to the position, relative to the counter line, of the human object detected through the area box in the plurality of frames,
wherein the counting of the number of visitors comprises:
assigning, by the count unit, in a preceding frame that is relatively earlier in time, an identifier and an initial flag value to a first-detected human object according to which of the entrance area and the exit area divided by the counter line (CL) the human object is located in;
assigning, by the count unit, in a subsequent frame that is relatively later in time, a change flag value to the human object to which the identifier is assigned according to which of the entrance area and the exit area divided by the counter line the human object is located in; and
comparing, by the count unit, the initial flag value with the change flag value and counting, as the number of visitors, the human objects that have passed the counter line from the exit area to the entrance area.

delete

The method of claim 1,
wherein, in the counting of the number of visitors,
the count unit derives the area in which the human object is located by comparing the center coordinates of the area box with the coordinates of two or more points of the counter line.

A device for counting the number of visitors based on an artificial neural network, the device comprising:
a detection unit that, when a learning model calculates, through a plurality of operations in which weights learned between a plurality of layers are applied to a plurality of frames of a captured image, an output value including the center coordinates, width, and height of an area box containing an object, the reliability of the area box, and the probability that the object in the area box is a human object, selects from the plurality of frames an area box whose reliability is at least a predetermined value and whose probability of being the human object is at least a predetermined value, and detects one or more human objects included in the plurality of frames through the area box; and
a count unit that sets, in an image including the plurality of frames, a counter line dividing an entrance area and an exit area, and counts the number of visitors according to the position, relative to the counter line, of the human object detected through the area box in the plurality of frames,
wherein the count unit:
assigns, in a preceding frame that is relatively earlier in time, an identifier and an initial flag value to a first-detected human object according to which of the entrance area and the exit area divided by the counter line (CL) the human object is located in;
assigns, in a subsequent frame that is relatively later in time, a change flag value to the human object to which the identifier is assigned according to which of the entrance area and the exit area divided by the counter line the human object is located in; and
compares the initial flag value with the change flag value and counts, as the number of visitors, the human objects that have passed the counter line from the exit area to the entrance area.

delete

The device of claim 4,
wherein the count unit
derives the area in which the human object is located by comparing the center coordinates of the area box with the coordinates of two or more points of the counter line.