KR102189262B1

KR102189262B1 - Apparatus and method for collecting traffic information using edge computing

Info

Publication number: KR102189262B1
Application number: KR1020200052901A
Authority: KR
Inventors: 홍윤국; 이기영; 류지훈; 김호영; 아바야 라나싱허 무디얀세라게 수다라 라나싱허; 타일러 표; 박종권
Original assignee: 주식회사 글로벌브릿지
Priority date: 2020-04-29
Filing date: 2020-04-29
Publication date: 2020-12-11

Abstract

According to one aspect of an embodiment of the present disclosure, provided is an edge computing device comprising: an input interface that obtains an input video captured by a camera; at least one first processor; and a communication unit that communicates with a traffic information collection server. The at least one first processor: inputs input frames, at a frame rate lower than the frame rate of the input video, into a first machine learning model, and obtains from the first machine learning model identification information and speed information of at least one object detected in each frame; detects, based on the identification information and speed information of each of the at least one object output from the first machine learning model, an event in which an object of interest whose speed exceeds a first reference value is detected; and transmits the event information and the input video to the traffic information collection server using the communication unit.

Description

Apparatus and method for collecting traffic information using edge computing

Embodiments of the present disclosure relate to a traffic information collection system. Embodiments of the present disclosure relate to an edge computing device of a traffic information collection system, a method for controlling the edge computing device, and a computer program that performs the method for controlling the edge computing device. Embodiments of the present disclosure also relate to a traffic information collection server, a method for controlling the traffic information collection server, and a computer program that performs the method for controlling the traffic information collection server.

Current technologies for measuring traffic conditions include measuring section speed using sensors installed in the road surface and using closed-circuit television (CCTV) video. Measuring section speed with road-surface sensors has the disadvantages that the sensors are expensive to install and maintain and that the method provides a section speed rather than the speed of an individual vehicle. The CCTV-based method is inconvenient in that a person at the control system must watch the video and determine whether a traffic law has been violated. Unmanned enforcement systems using drones have also been put into practice: drones have been flown over highways to enforce traffic laws. In the drone-based method, a police officer watched the video captured by the drone, judged whether a traffic law had been violated, and contacted a fellow officer's patrol car nearby to carry out enforcement. The drone-based method is likewise inconvenient in that a police officer must review the video personally.

Embodiments of the present disclosure are intended to provide an apparatus and method for easily collecting traffic information by detecting vehicles and their speeds from existing CCTV video or drone video using a machine learning model.

Embodiments of the present disclosure are also intended to provide an apparatus and method for efficiently analyzing a very large amount of CCTV video and drone video using edge computing.

Embodiments of the present disclosure are also intended to provide an apparatus and method for collecting traffic information in real time, and measuring it precisely, in real traffic flows on the scale of hundreds of vehicles.

According to one aspect of an embodiment of the present disclosure, there is provided an edge computing device including: an input interface that obtains an input video captured by a camera; at least one first processor; and a communication unit that communicates with a traffic information collection server, wherein the at least one first processor inputs input frames, at a frame rate lower than the frame rate of the input video, into a first machine learning model, obtains from the first machine learning model identification information and speed information of at least one object detected in each frame, detects, based on the identification information and speed information of each of the at least one object output from the first machine learning model, an event in which an object of interest whose speed exceeds a first reference value is detected, and transmits the event information and the input video to the traffic information collection server using the communication unit.

In addition, according to an embodiment of the present disclosure, the traffic information collection server may include a second machine learning model that receives the input video and generates identification information and speed information of each of at least one object, and the first machine learning model may be a machine learning model obtained by applying a bypass path to at least one layer of the second machine learning model.
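The relationship between the two models can be illustrated with a minimal sketch. It assumes, purely for illustration, that a model is an ordered list of layer functions and that a bypass path passes a layer's input through unchanged; the names `run_model` and `bypass` are hypothetical and do not come from the disclosure.

```python
from typing import AbstractSet, Callable, Sequence

# A "layer" here is any function from a value to a value (illustrative only).
Layer = Callable[[float], float]


def run_model(layers: Sequence[Layer], x: float,
              bypass: AbstractSet[int] = frozenset()) -> float:
    """Run the layers in order, skipping every layer whose index is bypassed."""
    for index, layer in enumerate(layers):
        if index in bypass:
            continue  # bypass path: the input is forwarded unchanged
        x = layer(x)
    return x


# Hypothetical "second" (full) model with three layers.
full_model = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]

full_output = run_model(full_model, 5)               # all layers applied
light_output = run_model(full_model, 5, bypass={1})  # layer 1 bypassed
```

In this sketch the lighter "first" model shares the layer definitions of the full model and simply performs less computation, which is one plausible reading of why such a model would suit an edge device.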

In addition, according to an embodiment of the present disclosure, the first machine learning model may receive input frames at a lower frame rate than the input frames input to the second machine learning model.

In addition, according to an embodiment of the present disclosure, the edge computing device may further include a video buffer that stores the input video, and the communication unit may transmit the input video stored in the video buffer to the traffic information collection server.

In addition, according to an embodiment of the present disclosure, the at least one first processor may obtain meta information including at least one of signal information, stop line information, and lane information, or a combination thereof, and, when detecting the event, may detect the event based on the identification information of each of the at least one object, the speed information, and the meta information.

In addition, according to an embodiment of the present disclosure, the at least one first processor may perform tracklet information generation processing that defines an object position and an object region from the input frame, input the tracklet information into the first machine learning model, obtain the identification information and the speed information of the at least one object output from the first machine learning model, generate, using the identification information and the speed information of the at least one object, a graph model including information on each of the at least one object and information on its relation to other objects, and generate cluster information of the at least one object using the graph model and the relation information between the at least one object.
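The graph model and clustering step can be sketched as follows. This passage does not specify how the relation between objects is computed, so the sketch assumes a simple rule (two objects are related when they are close together and travel at similar speeds) and takes clusters to be connected components of the resulting graph. The feature layout, thresholds, and function names are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

# Assumed per-object features: (position along the road in m, speed in km/h).
Obj = Tuple[float, float]


def build_graph(objects: Dict[int, Obj], max_gap_m: float,
                max_speed_diff: float) -> Dict[int, Set[int]]:
    """Connect two objects when they are both near each other and similar in speed."""
    graph: Dict[int, Set[int]] = {oid: set() for oid in objects}
    for a, b in combinations(objects, 2):
        (pos_a, speed_a), (pos_b, speed_b) = objects[a], objects[b]
        if abs(pos_a - pos_b) <= max_gap_m and abs(speed_a - speed_b) <= max_speed_diff:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clusters(graph: Dict[int, Set[int]]) -> List[Set[int]]:
    """Return the connected components of the graph as clusters."""
    seen: Set[int] = set()
    result: List[Set[int]] = []
    for start in graph:
        if start in seen:
            continue
        component: Set[int] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result


objects = {1: (0.0, 60.0), 2: (10.0, 62.0), 3: (200.0, 110.0), 4: (18.0, 61.0)}
graph = build_graph(objects, max_gap_m=15.0, max_speed_diff=5.0)
groups = clusters(graph)
```

Here objects 1, 2, and 4 form one cluster through pairwise links, while object 3 is isolated.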

In addition, according to an embodiment of the present disclosure, the at least one first processor may, based on the cluster information, perform event detection processing for another object in the same cluster using the result of the event detection processing for one object.

In addition, according to an embodiment of the present disclosure, when the event is not detected from the input frame, the at least one first processor may generate event non-detection information indicating that the event has not been detected, and may transmit the event non-detection information to the traffic information collection server at a predetermined period.
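The periodic transmission of event non-detection information resembles a heartbeat. The following is a minimal sketch under that reading; the class name, the use of seconds as the time unit, and the decision to leave the timer untouched while an event is detected are assumptions rather than details taken from the disclosure.

```python
from typing import Optional


class NonDetectionReporter:
    """Decides when event non-detection information should be sent."""

    def __init__(self, period_s: float) -> None:
        self.period_s = period_s                 # the "predetermined period" (assumed seconds)
        self._last_sent: Optional[float] = None  # time of the last report, if any

    def should_send(self, now_s: float, event_detected: bool) -> bool:
        # When an event is detected, event information is sent instead,
        # so no non-detection report is due.
        if event_detected:
            return False
        if self._last_sent is None or now_s - self._last_sent >= self.period_s:
            self._last_sent = now_s
            return True
        return False


reporter = NonDetectionReporter(period_s=10.0)
decisions = [
    reporter.should_send(0.0, event_detected=False),
    reporter.should_send(5.0, event_detected=False),
    reporter.should_send(10.0, event_detected=False),
    reporter.should_send(25.0, event_detected=True),
]
```

A periodic report of this kind would let the server distinguish "no violations observed" from "edge device offline", although the disclosure does not state that motivation in this passage.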

In addition, according to another aspect of an embodiment of the present disclosure, there is provided a method for controlling an edge computing device, the method including: obtaining an input video captured by a camera; inputting input frames, at a frame rate lower than the frame rate of the input video, into a first machine learning model, and obtaining from the first machine learning model identification information and speed information of at least one object detected in each frame; detecting, based on the identification information and speed information of each of the at least one object output from the first machine learning model, an event in which the speed of an object exceeds a first reference value; and transmitting the event information and the input video to the traffic information collection server.

In addition, according to still another aspect of an embodiment of the present disclosure, there is provided a traffic information collection server including: a communication unit that communicates with at least one edge computing device; at least one second processor; and an output interface, wherein the at least one second processor receives, through the communication unit and from the at least one edge computing device, an input video and event information including information on an event in which the speed of an object exceeds a first reference value, extracts from the input video a frame-of-interest section corresponding to the event information, inputs the input frames of the frame-of-interest section into a second machine learning model, obtains from the second machine learning model identification information and speed information of at least one object detected in each frame, detects, based on the identification information and speed information of each of the at least one object output from the second machine learning model, an event in which an object of interest whose speed exceeds a second reference value is detected, and outputs the event information and image information of the frame-of-interest section through the output interface.
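The server-side flow can be sketched as two small steps: cut out the frame-of-interest section around the reported event, then re-check it against the second reference value. The sketch assumes the section is a fixed margin of frames on either side of the reported frame, and it stands in for the second machine learning model with a caller-supplied function `estimate_speeds`; both are hypothetical simplifications.

```python
from typing import Callable, List, Sequence


def extract_interest_section(frames: Sequence, event_frame: int, margin: int) -> List:
    """Return the frames within `margin` frames of the reported event frame."""
    start = max(0, event_frame - margin)
    end = min(len(frames), event_frame + margin + 1)
    return list(frames[start:end])


def confirm_event(section: Sequence,
                  estimate_speeds: Callable[[object], List[float]],
                  second_reference: float) -> bool:
    """True if any object in any frame of the section exceeds the second reference value."""
    return any(
        speed > second_reference
        for frame in section
        for speed in estimate_speeds(frame)
    )


frames = list(range(100))                    # stand-in for decoded video frames
section = extract_interest_section(frames, event_frame=50, margin=2)
fake_model = lambda frame: [frame * 2.0]     # stand-in for the second model
confirmed = confirm_event(section, fake_model, second_reference=99.0)
```

Because only the extracted section is passed to the heavier model, the server avoids re-analyzing the whole video for each reported event.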

In addition, according to an embodiment of the present disclosure, the at least one edge computing device may include a first machine learning model that receives the input video and generates identification information and speed information of each of at least one object, and the first machine learning model may be a machine learning model obtained by applying a bypass path to at least one layer of the second machine learning model.

In addition, according to an embodiment of the present disclosure, the at least one second processor may obtain meta information including at least one of signal information, stop line information, and lane information, or a combination thereof, and, when detecting the event, may detect the event based on the identification information of each of the at least one object, the speed information, and the meta information.

In addition, according to an embodiment of the present disclosure, the at least one second processor may perform tracklet information generation processing that defines an object position and an object region from the input frame, input the tracklet information into the second machine learning model, obtain the identification information and the speed information of the at least one object output from the second machine learning model, generate, using the identification information and the speed information of the at least one object, a graph model including information on each of the at least one object and information on its relation to other objects, and generate cluster information of the at least one object using the graph model and the relation information between the at least one object.

In addition, according to an embodiment of the present disclosure, the at least one second processor may, based on the cluster information, perform event detection processing for another object in the same cluster using the result of the event detection processing for one object.

In addition, according to an embodiment of the present disclosure, the communication unit may receive, from the at least one edge computing device, event non-detection information indicating that no event has been detected, and the at least one second processor may output the event non-detection information through the output interface.

In addition, according to an embodiment of the present disclosure, when the event is not detected from the frame-of-interest section, the at least one second processor may generate event non-detection information indicating that the event has not occurred, and may output the event non-detection information through the output interface.

In addition, according to still another aspect of an embodiment of the present disclosure, there is provided a method for controlling a traffic information collection server, the method including: receiving, from at least one edge computing device, an input video and event information including information on an event in which the speed of an object exceeds a first reference value; extracting from the input video a frame-of-interest section corresponding to the event information; inputting the input frames of the frame-of-interest section into a second machine learning model, and obtaining from the second machine learning model identification information and speed information of at least one object detected in each frame; detecting, based on the identification information and speed information of each of the at least one object output from the second machine learning model, an event in which an object of interest whose speed exceeds a second reference value is detected; and outputting the event information and image information of the frame-of-interest section.

In addition, according to still another aspect of an embodiment of the present disclosure, there is provided a computer program stored in a recording medium, the computer program including at least one instruction that, when executed by a processor, performs the method for controlling the edge computing device.

In addition, according to still another aspect of an embodiment of the present disclosure, there is provided a computer program stored in a recording medium, the computer program including at least one instruction that, when executed by a processor, performs the method for controlling the traffic information collection server.

According to embodiments of the present disclosure, it is possible to provide an apparatus and method for easily collecting traffic information by detecting vehicles and their speeds from existing CCTV video or drone video using a machine learning model.

In addition, according to embodiments of the present disclosure, it is possible to provide an apparatus and method for efficiently analyzing a very large amount of CCTV video and drone video using edge computing.

In addition, according to embodiments of the present disclosure, it is possible to provide an apparatus and method for collecting traffic information in real time, and measuring it precisely, in real traffic flows on the scale of hundreds of vehicles.

FIG. 1 is a diagram showing the structure of a traffic information collection system according to an embodiment of the present disclosure.
FIG. 2 is a diagram showing the structure of a traffic information collection system according to an embodiment of the present disclosure.
FIG. 3 is a flowchart illustrating a method for controlling a traffic information collection system according to an embodiment of the present disclosure.
FIG. 4A is a diagram illustrating the structure of a first processor according to an embodiment of the present disclosure.
FIG. 4B is a diagram illustrating the structure of a second processor according to an embodiment of the present disclosure.
FIG. 5 is a diagram illustrating an image processing procedure according to an embodiment of the present disclosure.
FIG. 6 is a diagram illustrating camera calibration processing according to an embodiment of the present disclosure.
FIG. 7 is a diagram illustrating a camera calibration procedure according to an embodiment of the present disclosure.
FIG. 8A is a diagram for describing object detection processing according to an embodiment of the present disclosure.
FIG. 8B is a diagram illustrating object detection processing 514 according to an embodiment of the present disclosure.
FIG. 9 is a diagram for describing tracklet information generation processing according to an embodiment of the present disclosure.
FIG. 10 is a diagram illustrating the structure of a tracklet network according to an embodiment of the present disclosure.
FIG. 11 is a diagram illustrating a graph model according to an embodiment of the present disclosure.
FIG. 12 is a diagram illustrating processing for calculating the relation between objects according to an embodiment of the present disclosure.
FIG. 13 is a diagram illustrating graph model generation processing and clustering processing according to an embodiment of the present disclosure.
FIG. 14 is a diagram illustrating a result image according to an embodiment of the present disclosure.

This specification clarifies the scope of the claims of the present disclosure, and explains the principles of the embodiments and discloses the embodiments so that a person of ordinary skill in the art to which the embodiments pertain can practice them. The disclosed embodiments may be implemented in various forms.

The same reference numerals refer to the same elements throughout the specification. This specification does not describe every element of the embodiments, and omits content that is general knowledge in the technical field to which the embodiments pertain or that is duplicated between embodiments. The term "unit" (part, portion) used in the specification may be implemented in software or hardware, and, depending on the embodiment, a plurality of "units" may be implemented as a single element, or a single "unit" may include a plurality of elements. Hereinafter, embodiments of the present disclosure and their operating principles are described with reference to the accompanying drawings.

FIG. 1 is a diagram showing the structure of a traffic information collection system according to an embodiment of the present disclosure.

The traffic information collection system 100 according to an embodiment of the present disclosure is a system that analyzes video collected from a plurality of cameras 150 disposed around roads and detects vehicles that violate traffic laws. The traffic information collection system 100 includes a plurality of cameras 150, a plurality of edge computing devices 110, and a traffic information collection server 130. The cameras 150 and the edge computing devices 110 may correspond one-to-one, or a plurality of cameras 150 may correspond to one edge computing device 110.

The camera 150 may be a camera such as a CCTV camera or a drone camera. The camera 150 is arranged so that its field of view (FOV) faces the road for which traffic information is to be collected. The camera 150 may be fixed at a predetermined position or may be movable. According to an embodiment, the camera 150 may change its FOV through operations such as pan, tilt, and zoom. In addition, according to an embodiment, the camera 150 may be carried by a drone and moved to a predetermined position by a control signal from the edge computing device 110 or the traffic information collection server 130.

The edge computing device 110 receives video from the camera 150, detects event information such as a speed limit violation from the input video, and transmits the event information and the input video to the traffic information collection server 130. The edge computing device 110 processes the input video using the first machine learning model and obtains identification information and speed information of objects.

The edge computing device 110 includes an input interface 112, a first processor 114, and a communication unit 118. According to an embodiment, the edge computing device 110 may further include a video buffer 116.

The input interface 112 receives the input video captured by the camera 150. The input video may have a first frame rate determined by the camera 150. The input interface 112 may correspond to an input device of a predetermined standard, or to a communication unit, for receiving the input video from the camera 150. The video received through the input interface 112 is passed to the first processor 114 and the video buffer 116.

The first processor 114 controls the overall operation of the edge computing device 110. The first processor 114 may include one or more processors. The first processor 114 receives at least one instruction or command and performs a predetermined operation. The first processor 114 receives the input video, detects objects corresponding to vehicles in at least some frames of the input video, and generates speed information for the detected vehicles. The first processor 114 executes the first machine learning model to generate the identification information and speed information of the objects. The first processor 114 samples frames of the input video at a second frame rate lower than the first frame rate of the input video and inputs them into the first machine learning model. In addition, when the first processor 114 detects an event in which the speed of an object exceeds the first reference value, it generates event information.
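The sampling and thresholding performed by the first processor 114 can be sketched as follows. The sketch assumes the second frame rate divides the first evenly enough that keeping every n-th frame is acceptable, and it represents the output of the first machine learning model as a list of `Detection` records; the record fields, the km/h unit, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Detection:
    object_id: int     # identification information (hypothetical field name)
    speed_kmh: float   # speed information (the unit is an assumption)


def sample_frames(frames: Iterable, first_fps: int,
                  second_fps: int) -> Iterator[Tuple[int, object]]:
    """Yield (frame_index, frame) at roughly second_fps from a first_fps stream."""
    if not 0 < second_fps <= first_fps:
        raise ValueError("second_fps must be positive and not exceed first_fps")
    step = first_fps // second_fps  # keep every step-th frame
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield index, frame


def detect_events(detections: List[Detection],
                  first_reference: float) -> List[Detection]:
    """Return the objects of interest whose speed exceeds the first reference value."""
    return [d for d in detections if d.speed_kmh > first_reference]


# One second of 30 fps video sampled down to 10 fps.
sampled = list(sample_frames(range(30), first_fps=30, second_fps=10))
speeding = detect_events([Detection(1, 80.0), Detection(2, 120.0)],
                         first_reference=100.0)
```

Feeding the model fewer frames reduces the computation required on the edge device; the full-rate video remains available in the video buffer for the server.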

Machine learning is an algorithmic technology that classifies and learns the features of input data on its own, and its element technologies use machine learning algorithms such as deep learning to simulate functions of the human brain such as cognition and judgment. The machine learning models of the embodiments of the present disclosure may have, for example, a deep neural network structure. A machine learning model may be trained with training data based on one or more nodes and the operation rules between the nodes. The structure of the nodes, the structure of the layers, and the operation rules between nodes may be determined in various ways depending on the embodiment. A machine learning model includes hardware resources such as one or more processors, memories, registers, summation units, parallel processing units, or multiplication units, and operates the hardware resources based on a parameter set applied to each hardware resource. To this end, the processor that runs the machine learning model may perform task or resource management processing that allocates hardware resources to each operation of the machine learning model. A machine learning model may have a structure such as a convolutional neural network (CNN), a recurrent neural network (RNN), or a long short-term memory (LSTM) network. The first machine learning model and the second machine learning model according to an embodiment of the present disclosure may include a combination of CNN and RNN structures.

The event information indicates whether an object whose speed exceeds the first reference value or the second reference value has been detected. The event information may include information on the object, information on the frame in which the event occurred, and the detected speed information. The event information may include event detection information indicating that an event has been detected, and event non-detection information indicating that no event has been detected. The event information may be generated for each frame. The information on the object may include, for example, the coordinates, horizontal length, and vertical length of the region corresponding to the object within the frame. The information on the frame in which the event occurred may include a frame number, a time corresponding to the frame, or the like.
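The fields listed above can be pictured as a small per-frame record. The following is a minimal sketch only: the class names `BoundingBox` and `EventInfo`, the helper `make_event_info`, and the km/h unit are illustrative assumptions and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BoundingBox:
    """Region of an object within a frame (hypothetical layout)."""
    x: int       # coordinate of the region (left)
    y: int       # coordinate of the region (top)
    width: int   # horizontal length
    height: int  # vertical length


@dataclass(frozen=True)
class EventInfo:
    """Per-frame event information (hypothetical layout)."""
    frame_number: int                  # frame in which the check was made
    detected: bool                     # True: event detection, False: non-detection
    box: Optional[BoundingBox] = None  # object region, only when detected
    speed: Optional[float] = None      # detected speed, only when detected


def make_event_info(frame_number: int, speed: float,
                    box: BoundingBox, reference: float) -> EventInfo:
    """Build event information for one object in one frame."""
    if speed > reference:  # the event is "speed exceeds the reference value"
        return EventInfo(frame_number, True, box, speed)
    return EventInfo(frame_number, False)
```

A record with `detected=False` plays the role of the event non-detection information, so a record can be produced for every frame.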

The video buffer 116 stores the input video. The video buffer 116 may include a predetermined storage medium capable of storing video. The video buffer 116 may include, for example, at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), a magnetic memory, a magnetic disk, and an optical disk. The video buffer 116 may store the input video at the first frame rate.

The communication unit 118 may communicate with an external device by wire or wirelessly. The communication unit 118 transmits the event information and the input video to the traffic information collection server 130.

According to an embodiment of the present disclosure, the first processor 114 extracts only the frame-of-interest section, which is the section of the input video in which an event is detected, and transmits it to the traffic information collection server 130. By detecting the speed of objects using the first machine learning model and detecting events, the edge computing device 110 does not send all information directly to the traffic information collection server 130. Instead, it determines by itself whether the input video needs to be transmitted, extracts only the input video of the frame-of-interest section that needs to be transmitted, and transmits it to the traffic information collection server 130. According to this embodiment, system efficiency can be increased by minimizing the amount of data transmitted by the edge computing device 110 and by minimizing the data processing resources and storage resources that the traffic information collection server 130 uses to store the input video. That is, the embodiment of the present disclosure reduces the load on the second processor 134 of the traffic information collection server 130 and, because the edge computing device 110 determines and extracts the frame-of-interest section, can also reduce the network cost between the edge computing device 110 and the traffic information collection server 130.
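One way to picture the extraction of frame-of-interest sections is to merge consecutive frames in which an event was flagged into inclusive index ranges. The sketch below is an illustration under stated assumptions: the function name `frames_of_interest` and the optional `padding` of neighboring frames are hypothetical, and the disclosure does not specify how section boundaries are chosen.

```python
from typing import List, Tuple


def frames_of_interest(event_flags: List[bool],
                       padding: int = 0) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) frame-index sections that cover every
    frame whose event flag is True, widened by `padding` frames per side."""
    sections: List[Tuple[int, int]] = []
    last_index = len(event_flags) - 1
    for index, flagged in enumerate(event_flags):
        if not flagged:
            continue
        start = max(0, index - padding)
        end = min(last_index, index + padding)
        if sections and start <= sections[-1][1] + 1:
            # Overlaps or touches the previous section, so extend it.
            sections[-1] = (sections[-1][0], max(sections[-1][1], end))
        else:
            sections.append((start, end))
    return sections
```

Only the frames inside the returned sections would then be read from the video buffer and transmitted, which is where the reduction in transmitted data comes from.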

According to another embodiment of the present disclosure, the first processor 114 extracts the frame-of-interest section, which is the section of the input video in which an event is detected, and transmits it to the traffic information collection server 130, and transmits the input video of the entire frame section to a separate control system. According to this embodiment, system efficiency can be increased by minimizing the data processing resources and storage resources that the traffic information collection server 130 uses to store the input video. In addition, because the input video of the entire frame section is transmitted from the edge computing device 110 to the separate control system and stored there, the input video can be preserved and managed while system efficiency is maintained.

The communication unit 118 may perform short-range communication and may use, for example, Bluetooth, BLE (Bluetooth Low Energy), NFC (Near Field Communication), WLAN (Wi-Fi), Zigbee, infrared communication (IrDA, Infrared Data Association), WFD (Wi-Fi Direct), UWB (ultra wideband), or ANT+ communication. As another example, the communication unit 118 may use mobile communication and may transmit and receive wireless signals to and from at least one of a base station, an external terminal, and a server on a mobile communication network.

The traffic information collection server 130 communicates with a plurality of edge computing devices 110 and receives the input video and the event information from each edge computing device 110. The traffic information collection server 130 may correspond to a physical server located in a traffic control room or the like, or to a cloud server.

The traffic information collection server 130 includes a communication unit 132, a second processor 134, and an output interface 136.

The communication unit 132 may communicate with an external device by wire or wirelessly. The communication unit 132 receives the event information and the input video from the plurality of edge computing devices 110.

The communication unit 132 may perform short-range communication and may use, for example, Bluetooth, BLE (Bluetooth Low Energy), NFC (Near Field Communication), WLAN (Wi-Fi), Zigbee, infrared communication (IrDA, Infrared Data Association), WFD (Wi-Fi Direct), UWB (ultra wideband), or ANT+ communication. As another example, the communication unit 132 may use mobile communication and may transmit and receive wireless signals to and from at least one of a base station, an external terminal, and a server on a mobile communication network.

The second processor 134 controls the overall operation of the traffic information collection server 130. The second processor 134 receives at least one instruction or command and performs a predetermined operation. The second processor 134 may include one or more processors. The second processor 134 receives the event information and the input video, detects objects corresponding to vehicles in the frames of the frame section in which the event occurred, and generates speed information for the detected vehicles. The second processor 134 executes the second machine learning model to generate the identification information and the speed information of each object. The second machine learning model receives and processes frames of the input video at a frame rate that is at least the first frame rate and at most the second frame rate. In addition, when the second processor 134 detects an event in which the speed of an object exceeds the second reference value, it generates event information. Because the second machine learning model of the second processor 134 is more accurate than the first machine learning model of the first processor 114, even when an event is detected by the edge computing device 110, the determination of whether an event has occurred can be refined at the traffic information collection server 130.

The first machine learning model is a model obtained by applying a bypass path for at least some layers to the second machine learning model. The first machine learning model therefore requires less computation and runs faster than the second machine learning model. In addition, because the first machine learning model receives the input video at a lower frame rate than the second machine learning model, its processing load is smaller. However, because of the bypass path, the first machine learning model detects object identification information and speed information less accurately than the second machine learning model. According to embodiments of the present disclosure, the edge computing device 110 performs object identification and speed detection quickly and with lower accuracy across the input video as a whole, and the traffic information collection server 130 performs object identification and speed detection with higher accuracy for the frame section in which an event occurred. With this system structure, embodiments of the present disclosure can monitor the input video in real time while significantly reducing the load on the system.
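The idea of deriving a lighter model by bypassing some layers of a heavier one can be illustrated with a toy example. This is not the disclosed network: the scalar "layers", the `run_model` helper, and the choice of which layer to bypass are all assumptions made purely for illustration.

```python
from typing import Callable, FrozenSet, Sequence

Layer = Callable[[float], float]


def run_model(x: float, layers: Sequence[Layer],
              bypass: FrozenSet[int] = frozenset()) -> float:
    """Apply the layers in order, skipping any layer whose index is in
    `bypass` (the value passes through that layer unchanged)."""
    for index, layer in enumerate(layers):
        if index in bypass:
            continue  # bypass path: this layer's computation is not performed
        x = layer(x)
    return x


# Toy stand-ins for network layers (hypothetical).
full_layers = [lambda v: v + 1.0, lambda v: v * 2.0, lambda v: v - 3.0]

second_model_output = run_model(4.0, full_layers)                 # all layers
first_model_output = run_model(4.0, full_layers, frozenset({1}))  # layer 1 bypassed
```

The bypassed run performs fewer layer evaluations, which mirrors the trade-off described above: less computation in exchange for an output that differs from that of the full model.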

Based on the event detection result, the second processor 134 generates and outputs event information and information on the frame of interest in which the event occurred.

The output interface 136 outputs data and various information. The output interface 136 may correspond to a display, a touch screen, a communication unit, or the like. The output interface 136 outputs the frame-of-interest information and the event information. According to an embodiment, the output interface 136 may display the identification information and speed information of each object on the frame of interest, and may output information on the object for which an event was detected together with the frame of interest. For example, a plurality of objects may be displayed on the frame of interest, and an indicator may be displayed on the object for which an event was detected.

FIG. 2 is a diagram showing the structure of a traffic information collection system according to an embodiment of the present disclosure.

The traffic information collection system 100a according to an embodiment of the present disclosure includes an edge computing device 110a, a traffic information collection server 130a, and a control system 210. The control system 210 receives the input video collected by the edge computing device 110a and the event information and frame-of-interest information generated by the traffic information collection server 130a, and performs traffic control.

The description of FIG. 2 focuses on the first processor 114a and the video buffer 116 of the edge computing device 110a and on the second processor 134a of the traffic information collection server 130a, and omits the other components.

The blocks defined within the first processor 114a and the second processor 134a in the present disclosure are merely examples of hardware blocks or software processing units for carrying out the embodiments of the present disclosure. Processing units that carry out the embodiments may be defined in various ways other than those disclosed here.

The first processor 114a of the edge computing device 110a may include a codec 222, a sampler 224, a first machine learning model 226, and a first speed analysis unit 228. Each block of the first processor 114a may correspond to a hardware block or a software block. For example, the first processor 114a may include a dedicated processor corresponding to the codec 222 or a dedicated processor corresponding to the first machine learning model 226.

The codec 222 decodes the input video received from the camera 150 according to a predetermined standard. The codec 222 may support standards such as AVI, MPEG (Moving Picture Experts Group), MOV, and WMV (Windows Media Video). The codec 222 decodes the input video to generate a plurality of input frames.

The sampler 224 receives the plurality of input frames from the codec 222 and performs sampling on them. For example, the sampler 224 may receive input frames at 30 fps and sample them at 3 fps. The sampling rate of the sampler 224 may vary according to embodiments.
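The 30 fps to 3 fps example corresponds to keeping every tenth decoded frame. The sketch below shows that decimation. The function name `sample_frames` and the restriction to integer rate ratios are assumptions for illustration, since the disclosure leaves the sampling method open.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def sample_frames(frames: Iterable[T], input_fps: int = 30,
                  output_fps: int = 3) -> Iterator[T]:
    """Yield every (input_fps // output_fps)-th frame, starting with the first."""
    if output_fps <= 0 or input_fps % output_fps != 0:
        raise ValueError("output_fps must be a positive divisor of input_fps")
    step = input_fps // output_fps  # 30 fps -> 3 fps gives step 10
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield frame
```

With one second of 30 fps video numbered 0 to 29, `list(sample_frames(range(30)))` keeps frames 0, 10, and 20.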

The first machine learning model 226 generates and outputs, from the plurality of input frames received from the sampler 224, the identification information and speed information of the objects in each input frame. The plurality of input frames may be converted through predetermined pre-processing into the input vectors required by the first machine learning model 226, and the input vectors may be input to the first machine learning model 226. For example, in the pre-processing, at least one object corresponding to a vehicle may be detected in an input frame, and information on the at least one object may be input to the first machine learning model 226. The identification information and speed information of the objects generated by the first machine learning model 226 are inserted into the input frame and output. For example, a box indicating the region of each object, information indicating the identification information of the object (for example, the color of the box), and the speed of each object may be displayed together in the image data of the input frame. The first processor 114a may insert the identification information and speed information of the objects into the image data through post-processing of the output of the first machine learning model 226.

The first speed analysis unit 228 receives the identification information and speed information of the objects generated by the first machine learning model 226, determines whether the speed of each object exceeds the first reference value, and generates event information. The first reference value may be determined based on the speed limit that serves as the criterion for a speeding violation. For example, the first reference value may be set to 110% of the speed limit. The first speed analysis unit 228 detects an event in which the speed of an object in the input frame exceeds the first reference value. Depending on the processing result, the first speed analysis unit 228 may generate event detection information indicating that an event has been detected, or event non-detection information indicating that no event has been detected. The event detection information may include the identification information of the object for which the event was detected and frame information (or time information).
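The threshold check performed by the first speed analysis unit can be sketched as follows. The function names, the dictionary of per-object speeds, and the km/h unit are illustrative assumptions. Only the 110% ratio comes from the example in the text.

```python
from typing import Dict, List


def first_reference_value(speed_limit: float, ratio: float = 1.10) -> float:
    """Reference value derived from the speed limit (110% in the example)."""
    return speed_limit * ratio


def detect_events(speeds_by_object: Dict[str, float],
                  reference: float) -> List[str]:
    """Return the identifiers of objects whose speed exceeds the reference."""
    return [object_id for object_id, speed in speeds_by_object.items()
            if speed > reference]
```

For a 60 km/h limit the reference is about 66 km/h, so `detect_events({"car-1": 58.0, "car-2": 71.5}, first_reference_value(60.0))` flags only `"car-2"`. An empty result corresponds to event non-detection information for that frame.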

When an event is detected, event information corresponding to the event detection information is transmitted to the traffic information collection server 130a. When no event is detected, event information corresponding to the event non-detection information is transmitted to the traffic information collection server 130a or the control system 210. Even when no event is detected, the event non-detection information may be transmitted to the traffic information collection server 130a or the control system 210 at a predetermined period.

When an event is detected, the plurality of frames of the frame-of-interest section corresponding to the frame section in which the event occurred are transmitted to the traffic information collection server 130a together with the event information.

The video buffer 116 of the edge computing device 110a stores the input video, which includes the plurality of input frames decoded and generated by the codec 222. The video buffer 116 transmits the input video to the traffic information collection server 130a or the control system 210.

The second processor 134a of the traffic information collection server 130a includes a second machine learning model 242 and a second speed analysis unit 244.

The traffic information collection server 130a receives, from the plurality of edge computing devices 110a, the frames of interest in which an event occurred or information on the frames of interest (for example, the frame-of-interest section), and processes only the frames of interest in which an event was detected through the second machine learning model 242. Compared with processing the entire input video of the plurality of edge computing devices 110a with the second machine learning model 242, this can significantly reduce the processing load. In particular, because the second machine learning model 242 is more precise than the first machine learning model 226, it requires more computation per input frame than the first machine learning model 226. By selectively processing only the frames of interest, the second machine learning model 242 concentrates processing resources on the frames in which an event is likely to be detected, which can significantly increase the processing efficiency of the system.

The second machine learning model 242 generates and outputs, from the plurality of input frames it receives, the identification information and speed information of the objects in each input frame. The plurality of input frames input to the second machine learning model 242 correspond to the frames of interest. The second machine learning model 242 receives and processes input frames at a higher frame rate than the first machine learning model 226. For example, the second machine learning model 242 may receive input frames at a frame rate 10 times that of the first machine learning model 226. The second machine learning model 242 may receive input frames at the same frame rate as the input video generated by the camera 150, or may receive input frames obtained by applying predetermined sampling to the input video. The plurality of input frames may be converted through predetermined pre-processing into the input vectors required by the second machine learning model 242, and the input vectors may be input to the second machine learning model 242. For example, in the pre-processing, at least one object corresponding to a vehicle may be detected in an input frame, and information on the at least one object may be input to the second machine learning model 242. The identification information and speed information of the objects generated by the second machine learning model 242 are inserted into the input frame and output. For example, a box indicating the region of each object, information indicating the identification information of the object (for example, the color of the box), and the speed of each object may be displayed together in the image data of the input frame. The second processor 134a may insert the identification information and speed information of the objects into the image data through post-processing of the output of the second machine learning model 242.

The first machine learning model 226 and the second machine learning model 242 may be trained based on a large number of object detection results, input frames, object identification information, and speed information. According to an embodiment, the first machine learning model 226 and the second machine learning model 242 may be trained using training data obtained from an intersection. When training data obtained from an intersection is used, a second machine learning model 242 with the target performance may be produced from as little as one day's worth of training data.

The second speed analysis unit 244 receives the identification information and speed information of the objects generated by the second machine learning model 242, determines whether the speed of each object exceeds the second reference value, and generates event information. The second reference value may be determined based on the speed limit that serves as the criterion for a speeding violation. The second reference value may be equal to or greater than the first reference value. For example, the second reference value may be set to 110% of the speed limit. The second speed analysis unit 244 detects an event in which the speed of an object in the input frame exceeds the second reference value. Depending on the processing result, the second speed analysis unit 244 may generate event detection information indicating that an event has been detected, or event non-detection information indicating that no event has been detected. The event detection information may include the identification information of the object for which the event was detected and frame information (or time information).
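Taken together, the two reference values form a two-stage check: the edge device screens with the first reference value, and the server re-evaluates forwarded frames with the second. The sketch below illustrates that flow for a single object. The function `classify` and its three result labels are hypothetical and are not terms used in the disclosure.

```python
def classify(edge_speed: float, server_speed: float,
             first_reference: float, second_reference: float) -> str:
    """Two-stage decision for one object (labels are illustrative only)."""
    if second_reference < first_reference:
        # The text states the second reference is equal to or greater than the first.
        raise ValueError("second_reference must be >= first_reference")
    if edge_speed <= first_reference:
        return "not_forwarded"   # no event at the edge, so frames are not sent
    if server_speed > second_reference:
        return "confirmed"       # the more accurate server model agrees
    return "rejected"            # edge event not upheld by the server
```

Here `edge_speed` stands for the estimate from the first machine learning model and `server_speed` for the estimate from the second, which may differ because the second model is more accurate.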

According to an embodiment, at least one of the first speed analysis unit 228 and the second speed analysis unit 244, or a combination thereof, may receive meta information and determine whether an event is detected based on the processing result of the machine learning model and the meta information. The meta information may include signal information, lane information, stop line information, crosswalk information, and the like. The meta information may be input from an external source or detected from the input frames by image processing. According to an embodiment, the first speed analysis unit 228 or the second speed analysis unit 244 may skip event detection processing during a stop signal and perform event detection processing during a go signal.
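The signal-based gating in the last sentence can be sketched as a guard in front of the speed comparison. The `Signal` enum and the `detect_with_meta` function are illustrative assumptions, and how the signal state is obtained (external input or image processing) is outside the sketch.

```python
from enum import Enum
from typing import Optional


class Signal(Enum):
    STOP = "stop"
    GO = "go"


def detect_with_meta(speed: float, reference: float,
                     signal: Signal) -> Optional[bool]:
    """Return None when detection is skipped (stop signal), otherwise
    whether the speed exceeds the reference value."""
    if signal is Signal.STOP:
        return None  # event detection processing is not performed
    return speed > reference
```

Returning `None` rather than `False` keeps "detection skipped" distinguishable from "no event detected", a distinction that this sketch adds for clarity.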

The traffic information collection server 130a transmits event information, including the event detection information or the event non-detection information, to the control system 210.

In the control system 210, whether a traffic law has been violated is determined by a predetermined computing device or by a person. The control room 212 includes at least one processor and a plurality of displays, and displays the input video and the event information on the plurality of displays. The control room 212 may display the input video as it is, or may display the frames of interest with the event information inserted. The control room 212 displays the input video and the event information in various modes on the plurality of displays.

The control system 210 includes a storage unit 214 and may store the input video, the frames of interest, and the event information in the storage unit 214.

FIG. 3 is a flowchart illustrating a control method of a traffic information collection system according to an embodiment of the present disclosure. In FIG. 3, the operation of the edge computing device 110 corresponds to a flowchart of the edge computing device control method, and the operation of the traffic information collection server 130 corresponds to a flowchart of the traffic information collection server control method. FIG. 3 shows the processing for one frame, and the edge computing device 110 and the traffic information collection server 130 may repeat the steps of FIG. 3 for a plurality of frames.

According to an embodiment, the edge computing device 110 may perform steps S312, S314, S316, and S318 in parallel. The traffic information collection server 130 may perform steps S320, S322, S324, and S326 in parallel.

Each step of the edge computing device control method and the traffic information collection server control method of the present disclosure may be performed by various types of electronic devices that include a processor and a communication unit and that use a machine learning model. The present disclosure focuses on embodiments in which the edge computing devices 110 and 110a according to embodiments of the present disclosure perform the edge computing device control method. Accordingly, embodiments described for the edge computing devices 110 and 110a are applicable to embodiments of the edge computing device control method, and conversely, embodiments described for the edge computing device control method are applicable to embodiments of the edge computing devices 110 and 110a. The edge computing device control method according to the disclosed embodiments is not limited to being performed by the edge computing devices 110 and 110a disclosed herein, and may be performed by various types of electronic devices.

In addition, the present disclosure focuses on embodiments in which the traffic information collection servers 130 and 130a according to embodiments of the present disclosure perform the traffic information collection server control method. Accordingly, embodiments described for the traffic information collection servers 130 and 130a are applicable to embodiments of the traffic information collection server control method, and conversely, embodiments described for the traffic information collection server control method are applicable to embodiments of the traffic information collection servers 130 and 130a. The traffic information collection server control method according to the disclosed embodiments is not limited to being performed by the traffic information collection servers 130 and 130a disclosed herein, and may be performed by various types of electronic devices.

In addition, the embodiments of the present disclosure described below with reference to FIGS. 4 to 15 may be applied to the edge computing device control method or the traffic information collection server control method. Likewise, the operations of the edge computing device and the traffic information collection server described with reference to FIGS. 4 to 15 may be added as steps of the edge computing device control method or the traffic information collection server control method.

First, the edge computing device 110 obtains an input video from the camera 150 (S312). The edge computing device 110 may decode the input video using a codec to obtain a plurality of input frames, and may sample the plurality of input frames at a predetermined frame rate.
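The sampling in step S312 can be illustrated with a minimal sketch. The function name `sample_frame_indices` and the uniform index-stride strategy are assumptions made for illustration; the disclosure only states that decoded frames are sampled at a predetermined frame rate.

```python
def sample_frame_indices(num_frames, source_fps, target_fps):
    """Return indices of decoded frames to keep when downsampling.

    Hypothetical helper: keeps roughly target_fps frames per second
    out of a stream decoded at source_fps.
    """
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    if target_fps >= source_fps:
        # Nothing to drop: every decoded frame is passed on.
        return list(range(num_frames))
    step = source_fps / target_fps
    indices = []
    i = 0
    while True:
        index = int(i * step)  # nearest earlier frame for the i-th sample
        if index >= num_frames:
            break
        indices.append(index)
        i += 1
    return indices


# One second of 30 fps video sampled down to 10 fps keeps every third frame.
kept = sample_frame_indices(num_frames=30, source_fps=30, target_fps=10)
```

Computing each index from `i * step` rather than accumulating the step avoids drift when the ratio of the two frame rates is not an integer.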

Next, the edge computing device 110 may input the sampled input frames into the first machine learning model and obtain, from the first machine learning model, identification information of each object and speed information of each object (S314). The first machine learning model receives an input vector that contains object detection information generated by pre-processing the input frame, and outputs the identification information and the speed information of each object.

Next, the edge computing device 110 detects, based on the identification information and the speed information of the objects in each frame, an event in which the speed of an object exceeds a first reference value (S316). Based on the event detection result, the edge computing device 110 generates event information that includes either event detection information or event non-detection information.
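A minimal sketch of the threshold check in step S316 follows. The `ObjectSpeed` record, the dictionary layout of the event information, and the use of km/h are assumptions for illustration only; the disclosure does not specify a data format or unit.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ObjectSpeed:
    """Hypothetical per-object model output: an identifier and a speed."""
    object_id: int
    speed_kmh: float


def detect_speed_event(objects: List[ObjectSpeed], reference_kmh: float) -> Dict:
    """Build event information for one frame.

    The event is 'detected' when at least one object is strictly faster
    than the reference value; otherwise non-detection is reported.
    """
    offenders = [o.object_id for o in objects if o.speed_kmh > reference_kmh]
    return {"event_detected": bool(offenders), "object_ids": offenders}


frame_objects = [ObjectSpeed(1, 58.0), ObjectSpeed(2, 83.5), ObjectSpeed(3, 60.0)]
event_info = detect_speed_event(frame_objects, reference_kmh=60.0)
```

The same function could be reused on the server side in step S324 with the second reference value passed as `reference_kmh`.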

Next, the edge computing device 110 transmits the input video and the event information to the traffic information collection server 130 (S318). When an event is detected, the edge computing device 110 may transmit to the traffic information collection server 130 information on the frame of interest, that is, the frame section in which the event was detected.

The traffic information collection server 130 receives the input video and the event information from the edge computing device 110 (S320).

The traffic information collection server 130 extracts, from the input video, a plurality of input frames corresponding to the frame of interest, and inputs the frame of interest into the second machine learning model. The second machine learning model receives the input frames and outputs identification information and speed information of the objects detected in the input frames (S322). The traffic information collection server 130 may pre-process the plurality of frames to be input to the second machine learning model, generate an input vector containing object information, and input the input vector into the second machine learning model.

Next, the traffic information collection server 130 detects, based on the identification information and the speed information of the objects output from the second machine learning model, an event in which the speed of an object exceeds a second reference value (S324). Based on the event detection result, the traffic information collection server 130 generates event information that includes either event detection information or event non-detection information.

Next, the traffic information collection server 130 outputs the event information and the frame of interest through an output interface (S326). The event information and the frame of interest may be shown on a display or transmitted to an external device through a communication unit.

FIG. 4A is a diagram illustrating a structure of a first processor according to an embodiment of the present disclosure.

The first processor 114b according to an embodiment of the present disclosure may include a sampler 224, a preprocessor 410, a first machine learning model 226, a postprocessor 420, and a first speed analysis unit 228. Since the sampler 224, the first machine learning model 226, and the first speed analysis unit 228 of FIG. 4A are the same as those described with reference to FIG. 2, the description of FIG. 4A focuses on the preprocessor 410 and the postprocessor 420.

The preprocessor 410 receives the input frame output from the sampler 224, detects objects, and generates an input vector containing object information. The preprocessor 410 may generate, from the input frame, an object corresponding to a vehicle. The preprocessor 410 may extract from the input frame the location information of the object, the width and height of the object area, and appearance information, and may generate an input vector containing the location information, width, height, and appearance information. When a plurality of objects are detected in the input frame, location information, width, height, and appearance information are generated for each object.

According to an embodiment, the preprocessor 410 may be implemented as a machine learning model. According to another embodiment, the preprocessor 410 may be implemented as a software module that executes a predetermined object detection algorithm.

The first machine learning model 226 receives the input vector generated by the preprocessor 410 together with the input frame, and generates identification information and speed information of each object.

Based on the identification information and the speed information of the objects output from the first machine learning model 226, the postprocessor 420 generates a plurality of nodes in which image data of the object area corresponding to each object is accumulated, and generates an output data structure that includes relevance information indicating the path similarity between the nodes. Each node corresponds to one object, and image data cropped from the object area may be accumulated in each node. The output data structure may correspond to a graph model in which each node is defined and the attributes of the connection lines between nodes are changed according to the relevance information between the nodes.

The postprocessor 420 may additionally generate path information for each object based on the graph model. The path information of an object shows the trajectory along which the object has moved during a predetermined past period. The postprocessor 420 generates the relevance information based on the similarity of the path information.
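As a rough illustration of how relevance information might be derived from path similarity, the sketch below scores pairs of trajectories and keeps the pairs that are similar enough as graph edges. The similarity formula (the reciprocal of one plus the mean point-to-point distance), the `min_similarity` cutoff, and all names are assumptions; the disclosure does not specify how path similarity is computed.

```python
import math
from itertools import combinations


def path_similarity(path_a, path_b):
    """Similarity in (0, 1] of two trajectories given as lists of (x, y).

    Assumed measure: 1 / (1 + mean distance between the most recent
    positions that both paths have in common).
    """
    n = min(len(path_a), len(path_b))
    if n == 0:
        return 0.0
    pairs = zip(path_a[-n:], path_b[-n:])
    mean_distance = sum(math.dist(p, q) for p, q in pairs) / n
    return 1.0 / (1.0 + mean_distance)


def build_relevance_edges(paths, min_similarity=0.5):
    """Map each sufficiently similar pair of object IDs to its similarity."""
    edges = {}
    for a, b in combinations(sorted(paths), 2):
        similarity = path_similarity(paths[a], paths[b])
        if similarity >= min_similarity:
            edges[(a, b)] = similarity
    return edges


# Objects 1 and 2 drive side by side; object 3 is far away.
paths = {
    1: [(0.0, 0.0), (0.0, 10.0)],
    2: [(1.0, 0.0), (1.0, 10.0)],
    3: [(50.0, 0.0), (50.0, 10.0)],
}
edges = build_relevance_edges(paths)
```

In this sketch the dictionary keys play the role of the connection lines between nodes, and the stored similarity is the attribute attached to each line.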

The first speed analysis unit 228 detects an event based on the identification information and the speed information of the objects generated by the first machine learning model 226. The first speed analysis unit 228 may detect, based on the graph model generated by the postprocessor 420, an event in which the speed of an object exceeds the first reference value. The first speed analysis unit 228 may detect the event based on the speed information of each object accumulated in the corresponding node of the graph model.

FIG. 4B is a diagram illustrating a structure of a second processor according to an embodiment of the present disclosure.

The second processor 134b according to an embodiment of the present disclosure may include a sampler 430, a preprocessor 440, a second machine learning model 242, a postprocessor 450, and a second speed analysis unit 244. Since the second machine learning model 242 and the second speed analysis unit 244 of FIG. 4B are the same as those described with reference to FIG. 2, the description of FIG. 4B focuses on the sampler 430, the preprocessor 440, and the postprocessor 450.

The sampler 430 may sample the input video at a predetermined frame rate. When the second processor 134b processes the input video at the frame rate of the input video itself, the sampler 430 may be omitted.

The preprocessor 440 receives the input frame output from the sampler 430, detects objects, and generates an input vector containing object information. The preprocessor 440 may generate, from the input frame, an object corresponding to a vehicle. The preprocessor 440 may extract from the input frame the location information of the object, the width and height of the object area, and appearance information, and may generate an input vector containing the location information, width, height, and appearance information. When a plurality of objects are detected in the input frame, location information, width, height, and appearance information are generated for each object.

According to an embodiment, the preprocessor 440 may be implemented as a machine learning model. According to another embodiment, the preprocessor 440 may be implemented as a software module that executes a predetermined object detection algorithm.

The preprocessor 440 of the second processor 134b may have higher accuracy than the preprocessor 410 of the first processor 114b, and may involve a larger amount of processing. When the preprocessor 440 of the second processor 134b is implemented as a machine learning model including a plurality of layers, the preprocessor 410 of the first processor 114b may include a machine learning model to which a bypass path is applied that bypasses at least some of the layers of the preprocessor 440 of the second processor 134b.

The second machine learning model 242 receives the input vector generated by the preprocessor 440 and generates identification information and speed information of each object.

Based on the identification information and the speed information of the objects output from the second machine learning model 242, the postprocessor 450 generates a plurality of nodes in which image data of the object area corresponding to each object is accumulated, and generates an output data structure that includes relevance information indicating the path similarity between the nodes. Each node corresponds to one object, and image data cropped from the object area may be accumulated in each node. The output data structure may correspond to a graph model in which each node is defined and the attributes of the connection lines between nodes are changed according to the relevance information between the nodes.

The postprocessor 450 may additionally generate path information for each object based on the graph model. The path information of an object shows the trajectory along which the object has moved during a predetermined past period. The postprocessor 450 generates the relevance information based on the similarity of the path information.

The second speed analysis unit 244 detects an event based on the identification information and the speed information of the objects generated by the second machine learning model 242. The second speed analysis unit 244 may detect, based on the graph model generated by the postprocessor 450, an event in which the speed of an object exceeds the second reference value. The second speed analysis unit 244 may detect the event based on the speed information of each object accumulated in the corresponding node of the graph model.

FIG. 5 is a diagram illustrating an image processing procedure according to an embodiment of the present disclosure.

The first processor 114b and the second processor 134b process the input frame through pre-processing, machine learning model processing, and post-processing. FIG. 5 describes the pre-processing 510, the machine learning model processing 520, and the post-processing 530 performed by the first processor 114b and the second processor 134b.

According to an embodiment of the present disclosure, the pre-processing 510 includes camera calibration 512, object detection 514, and embedding processing 516. The machine learning model processing 520 includes processing by the first machine learning model or processing by the second machine learning model. The post-processing 530 includes graph model generation processing 532 and clustering processing 534.

FIG. 6 is a diagram illustrating a camera calibration process according to an embodiment of the present disclosure.

The camera calibration process 512 converts distances on the actual road, as they appear in the input frame, into a rectangular form. The camera calibration process 512 may generate, in the input frame 610, reference lines 612a and 612c that represent equal distances on the road and a reference line 612b that represents the direction of travel of the lane. According to an embodiment, the reference lines 612a, 612b, and 612c may be determined based on lane information detected in the input frame 610.

Next, the camera calibration process 512 may generate distance reference points 622 that mark distances on the road along the direction of travel of the lane. Using the adjusted input frame 620 on which the camera calibration process 512 has been performed, the inter-frame travel distance of an object can be measured accurately from the input frames.
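To make the role of the distance reference points concrete, the sketch below converts an object's pixel position into a road distance and then into a speed. It assumes, purely for illustration, that each reference point pairs a pixel row with a known road distance in meters and that positions between reference points can be linearly interpolated; the disclosure does not describe the mapping at this level of detail, and the function names are hypothetical.

```python
from bisect import bisect_right


def pixel_to_road_distance(pixel_y, reference_points):
    """Interpolate a road distance (m) for a pixel row.

    reference_points: list of (pixel_y, distance_m) pairs sorted by
    ascending pixel_y, standing in for the distance reference points.
    """
    rows = [row for row, _ in reference_points]
    if not rows[0] <= pixel_y <= rows[-1]:
        raise ValueError("pixel row lies outside the calibrated range")
    # Index of the reference point just after pixel_y, clamped to the end.
    i = min(bisect_right(rows, pixel_y), len(rows) - 1)
    (y0, d0), (y1, d1) = reference_points[i - 1], reference_points[i]
    fraction = (pixel_y - y0) / (y1 - y0)
    return d0 + fraction * (d1 - d0)


def speed_kmh(prev_pixel_y, curr_pixel_y, elapsed_s, reference_points):
    """Speed from the road distance travelled between two sampled frames."""
    d_prev = pixel_to_road_distance(prev_pixel_y, reference_points)
    d_curr = pixel_to_road_distance(curr_pixel_y, reference_points)
    return abs(d_curr - d_prev) / elapsed_s * 3.6  # m/s to km/h


# Rows 100, 300, 500 correspond to 40 m, 20 m, and 0 m from the camera.
refs = [(100, 40.0), (300, 20.0), (500, 0.0)]
v = speed_kmh(prev_pixel_y=200, curr_pixel_y=400, elapsed_s=1.0, reference_points=refs)
```

Here the object covers 20 m of road in one second, which corresponds to 72 km/h. A real system would more likely use a full perspective transform; piecewise interpolation is used only to keep the example short.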

FIG. 7 is a diagram illustrating a camera calibration procedure according to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the camera calibration process 512 may be performed based on the image type of the input video or on road marking lines. To this end, the camera calibration process 512 first identifies the image type or the road marking lines from the input video 710 (712). The image type may include information on the type of camera equipment and the shooting composition. The road marking lines may include lanes, crosswalks, or stop lines. Next, the camera calibration process 512 performs camera calibration based on the image type or the road marking lines (714).

The input video 710 may be captured by various types of camera equipment. The shooting composition of the camera equipment may also vary when the input video is captured. The camera calibration process 512 is performed based on image type information that indicates at least one of the type of camera equipment that captured the input video and the shooting composition, or a combination thereof.

The type of camera equipment may include, for example, a fixed CCTV camera, a rotating CCTV camera, or a drone camera. The shooting composition may include a diagonal composition, a frontal composition, or an aerial composition. The camera calibration process 512 may differ depending on whether the camera equipment is fixed, rotating, or mobile (for example, a drone). When the camera equipment is fixed, either no further camera calibration process 512 is performed after the initial one, or the camera calibration process 512 is performed at a predetermined period (for example, every hour). When the camera equipment is rotating, the camera calibration process 512 may be performed whenever the camera equipment rotates (for example, a pan or tilt operation). For rotating camera equipment, the camera calibration process 512 may also be performed at a predetermined period after the last camera calibration process 512, in addition to when a rotation occurs. For mobile camera equipment such as a drone camera, the camera calibration process 512 may be performed in real time, for example for every input frame after sampling.
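The recalibration rules described above can be summarized as a small decision function. This is a sketch under assumptions: the `CameraType` enum, the function name, and the one-hour default period are illustrative (the period echoes the example in the text), and passing `period_s=None` models the variant in which a fixed camera is calibrated only once.

```python
from enum import Enum


class CameraType(Enum):
    FIXED = "fixed"        # e.g. fixed CCTV camera
    ROTATING = "rotating"  # e.g. pan/tilt CCTV camera
    MOBILE = "mobile"      # e.g. drone camera


def needs_calibration(camera_type, seconds_since_last, rotated=False, period_s=3600.0):
    """Decide whether to calibrate before processing the next sampled frame.

    seconds_since_last is None when no calibration has been done yet.
    """
    if seconds_since_last is None:
        return True  # initial calibration is always required
    if camera_type is CameraType.MOBILE:
        return True  # recalibrate on every sampled frame
    if camera_type is CameraType.ROTATING and rotated:
        return True  # a pan or tilt invalidates the previous calibration
    # Fixed cameras, and rotating cameras that did not move, fall back to
    # the optional periodic schedule.
    return period_s is not None and seconds_since_last >= period_s


decisions = [
    needs_calibration(CameraType.FIXED, 10.0),
    needs_calibration(CameraType.ROTATING, 10.0, rotated=True),
    needs_calibration(CameraType.MOBILE, 0.0),
]
```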

The camera calibration process 512 also differs depending on the shooting composition. Depending on whether the composition is diagonal, frontal, or aerial, the camera calibration process 512 defines a different shape for the quadrilateral formed by the reference lines 612a, 612b, and 612c.

In addition, the camera calibration process 512 may determine the direction of the reference lines 612a, 612b, and 612c and the arrangement of the distance reference points 622 based on road marking lines such as lanes.

Referring back to FIG. 5, the pre-processing 510 includes an object detection process 514. The object detection process 514 detects, from the input frame, an object corresponding to a vehicle.

FIG. 8A is a diagram for describing an object detection process according to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the object detection process 514 detects, from the input frame 810, at least one object 822 corresponding to a vehicle. For each object detected in the input frame 810, the object detection process 514 defines a block 820 corresponding to the object area, and defines the coordinates (x, y) of a reference point 824 of the block, the width (w) of the block 820, and the height (h) of the block. The reference point 824 may be defined, for example, as the upper-left corner of the block 820. The coordinates (x, y) are expressed with respect to the horizontal and vertical axes of the input frame 810.

The object detection process 514 may also generate appearance information from the detected object. The appearance information may include, for example, the vehicle type of the object (for example, passenger car, bus, truck, or sport utility vehicle (SUV)) and its color.

FIG. 8B is a diagram illustrating the object detection process 514 according to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the object detection process 514 performs inter-frame overlapping area processing 830 and appearance similarity calculation processing 840 for detected objects on a plurality of sequentially input frames.

The inter-frame overlapping area processing 830 detects the area that overlaps between temporally adjacent frames and reuses the processing result of the previous frame for that area. The inter-frame overlapping area processing 830 detects an overlapping area 836 shared by a first frame 832 and a second frame 834 that follows the first frame 832, and, for the overlapping area 836, performs object detection on the second frame 834 based on the object detection result for the first frame 832. The overlapping area 836 may be defined based on the detected objects or based on image similarity.
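One common way to quantify how much an object block in one frame overlaps a block in the next frame is intersection over union (IoU). The sketch below uses IoU as an assumed overlap measure for blocks given as upper-left corner plus width and height; the disclosure does not name a specific measure, and the `Block` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Object block: upper-left reference point (x, y), width w, height h."""
    x: float
    y: float
    w: float
    h: float


def intersection_over_union(a: Block, b: Block) -> float:
    """Overlap ratio in [0, 1] between two blocks."""
    overlap_w = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    overlap_h = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    intersection = overlap_w * overlap_h
    union = a.w * a.h + b.w * b.h - intersection
    return intersection / union if union > 0 else 0.0


# The same vehicle in two adjacent frames, shifted 5 pixels to the right.
first_frame_block = Block(0, 0, 10, 10)
second_frame_block = Block(5, 0, 10, 10)
iou = intersection_over_union(first_frame_block, second_frame_block)
```

A high IoU between a block in the first frame and a candidate block in the second frame suggests that the earlier detection result can be carried over for that region.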

The appearance similarity calculation processing 840 detects objects having similar appearance information across a plurality of input frames. For example, the appearance similarity calculation processing 840 calculates the appearance similarity between an object detected in the first frame and an object detected in the second frame. According to an embodiment, the appearance similarity may be generated based on a feature map produced from the input frame using a CNN. According to another embodiment, the appearance similarity may be calculated based on the appearance information of the object in the first frame and the appearance information of the object in the second frame.
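As an illustration of comparing appearance across frames, the sketch below computes the cosine similarity of two appearance feature vectors. Cosine similarity and the example vectors are assumptions; the disclosure says only that similarity may be derived from CNN feature maps or from appearance information.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector carries no appearance signal
    return dot / (norm_u * norm_v)


# Hypothetical appearance features of objects seen in two frames.
car_in_first_frame = [1.0, 0.0, 2.0]
car_in_second_frame = [2.0, 0.0, 4.0]   # same direction, different scale
truck_in_second_frame = [0.0, 3.0, 0.0]

same_vehicle_score = cosine_similarity(car_in_first_frame, car_in_second_frame)
other_vehicle_score = cosine_similarity(car_in_first_frame, truck_in_second_frame)
```

A score near 1 indicates closely matching appearance, while a score near 0 indicates unrelated appearance, so a simple cutoff on this score could be used to decide whether two detections belong to the same vehicle.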

Tracklet information (V) is generated from the object detection result produced by the inter-frame overlapping area processing 830 and the appearance similarity calculation processing 840.

Referring back to FIG. 5, the pre-processing 510 performs a tracklet information generation process 516 based on the result of the object detection process 514 and the input frame on which the camera calibration process 512 has been performed. From the result of the object detection process 514 and the input frame, the tracklet information generation process 516 generates tracklet information 518, which corresponds to the input vector fed to the first machine learning model or the second machine learning model.

FIG. 9 is a diagram for describing a tracklet information generation process according to an embodiment of the present disclosure.

The tracklet information generation process 516 receives the object detection result and the input frame and performs a feature embedding process 910. The feature embedding process 910 generates the tracklet information 518 from the object detection result and the input frame. The tracklet information 518 corresponds to the input vector fed to the tracklet network 930, which corresponds to the first machine learning model or the second machine learning model. A tracklet includes, in the format required by the tracklet network 930, area information 924 describing the area of the detected object and appearance information 926. The area information 924 includes the coordinates (x, y), width (w), and height (h) of the block corresponding to the object area. The appearance information 926 includes the appearance information generated earlier by the object detection process 514. The tracklet information 518 may include information 922 for each object in a frame, and may hold the information for a plurality of objects in a predetermined format.
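The per-object layout described above can be sketched as a flat vector of area information followed by appearance information. The `Detection` type, the field order, and the numeric encoding of appearance are assumptions for illustration; the actual format required by the tracklet network is not given in the disclosure.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Detection:
    """Hypothetical object detection result for one object in one frame."""
    x: float  # block reference point, horizontal coordinate
    y: float  # block reference point, vertical coordinate
    w: float  # block width
    h: float  # block height
    appearance: Sequence[float]  # e.g. encoded vehicle type and color


def build_tracklet_info(detections: List[Detection]) -> List[List[float]]:
    """One row per object: [x, y, w, h, appearance...]."""
    return [[d.x, d.y, d.w, d.h, *d.appearance] for d in detections]


frame_detections = [
    Detection(10.0, 20.0, 50.0, 30.0, appearance=[0.1, 0.9]),
    Detection(200.0, 40.0, 80.0, 60.0, appearance=[0.7, 0.3]),
]
tracklet_info = build_tracklet_info(frame_detections)
```

Keeping the area fields in a fixed leading position means every row has the same layout regardless of how the appearance portion is encoded, as long as all objects use the same appearance length.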

Referring back to FIG. 5, the tracklet information 518 generated by the tracklet information generation process 516 is input to the machine learning model processing 520 together with the input frame. The machine learning model processing 520 processes the tracklet information and the input frame using the first machine learning model or the second machine learning model described above.

FIG. 10 is a diagram illustrating a structure of a tracklet network according to an embodiment of the present disclosure.

The tracklet network 1000 has a DNN structure including a plurality of layers. The tracklet network 1000 may include a combination of a CNN structure and an RNN structure. The tracklet network 1000 includes an input layer 1010, a plurality of hidden layers 1020, and an output layer 1030.

The input layer 1010 includes at least one layer 1014. The input layer 1010 receives an input vector 1012 and an input frame, and generates at least one input feature map 1022.

The at least one input feature map 1022 is input to the hidden layer 1020 and processed. The hidden layer 1020 is generated by training in advance with a predetermined machine learning algorithm. The hidden layer 1020 receives the at least one input feature map 1022 and performs predetermined activation processing, pooling processing, linear processing, convolution processing, and the like to generate at least one output feature map 1026. According to an embodiment, the hidden layer 1020 may include a processing layer corresponding to each input feature map 1022. The output feature map 1026 is converted into the form of an output vector 1028 through additional processing.

The output vector 1028 is output from the tracklet network through the output layer 1030.

The second machine learning model included in the traffic information collection server 130 includes the tracklet network 1000. The first machine learning model includes a structure in which at least one bypass path 1040 is added to the structure of the tracklet network 1000. The start point and end point of the bypass path 1040 and the number of bypass paths 1040 may vary according to embodiments. When the bypass path 1040 is arranged, the output value of the layer corresponding to the start point of the bypass path is passed to the layer corresponding to the end point of the bypass path, and the processing by the layers in the middle of the bypass path 1040 is not performed but bypassed. By applying the bypass path 1040 to the tracklet network 1000 of the second machine learning model, the first machine learning model can reduce the amount of processing and significantly improve processing speed.
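The effect of a bypass path can be sketched with a toy layer stack. This is an illustrative sketch rather than the disclosed network: the layers here are plain scalar functions, and the function name `run_with_bypass` and the `bypass` mapping (start layer index to end layer index) are hypothetical. In a real network the output shape of the start layer would also have to match the input expected by the end layer.

```python
from typing import Callable, Dict, List, Optional, Tuple

Layer = Callable[[float], float]


def run_with_bypass(
    layers: List[Layer],
    x: float,
    bypass: Optional[Dict[int, int]] = None,
) -> Tuple[float, List[int]]:
    """Run layers in order; bypass maps a start layer index to an end layer index.

    The output of the start layer is handed directly to the end layer, and the
    layers between them are not executed.
    """
    bypass = bypass or {}
    for start, end in bypass.items():
        if not 0 <= start < end < len(layers):
            raise ValueError(f"invalid bypass path {start} -> {end}")
    executed: List[int] = []
    i = 0
    while i < len(layers):
        x = layers[i](x)
        executed.append(i)
        i = bypass.get(i, i + 1)  # jump to the end point if a bypass starts here
    return x, executed


# Five toy layers; each simply adds 1 so the skipped work is easy to see.
layers = [lambda v: v + 1.0 for _ in range(5)]

# Full network (analogous to the second model) versus the same stack with
# one bypass path from layer 1 to layer 4 (analogous to the first model).
full_out, full_layers = run_with_bypass(layers, 0.0)
fast_out, fast_layers = run_with_bypass(layers, 0.0, bypass={1: 4})
```

With the bypass in place only layers 0, 1, and 4 run, which is the source of the reduced processing on the edge device.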

Referring back to FIG. 5, the identification information and speed information of the objects in each input frame generated by the machine learning model processing 520, together with the tracklet information 518, undergo post-processing 530. The post-processing 530 includes a graph model generation process 532 and a clustering process 536.

FIG. 11 is a diagram illustrating a graph model according to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the graph model generation process 532 generates a graph model 1100 from the identification information, speed information, and tracklet information of the objects. The graph model 1100 includes at least one node 1110 and at least one connection line 1120 connecting the nodes 1110. Each node 1110 stores the information for the same object in each frame. The graph model generation process 532 may identify the information belonging to the same object based on the identification information generated by the machine learning model, and store it in the corresponding node. For example, each node 1110 may include, for the same object, at least one of object area information, speed information, or image data 1112 in each frame, or a combination thereof.

The connection line 1120 disposed between nodes 1110 indicates the relevance between the corresponding objects. The relevance is information indicating the similarity of the paths of the objects. The graph model 1100 may be represented by g(V,E), a function of the tracklet information V and the relevance information E.
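One possible in-memory form of the graph model g(V,E) is sketched below. The class and method names are hypothetical, and the choice of dictionaries keyed by object identifier is an assumption for illustration; the disclosure does not prescribe a storage layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h) of the object area


@dataclass
class Node:
    """Node (1110): per-frame records collected for one and the same object."""
    object_id: int
    records: List[Tuple[int, Box, float]] = field(default_factory=list)  # (frame, box, speed)


@dataclass
class GraphModel:
    """Graph model (1100): nodes hold V, connection lines hold E."""
    nodes: Dict[int, Node] = field(default_factory=dict)
    edges: Dict[Tuple[int, int], float] = field(default_factory=dict)

    def add_observation(self, object_id: int, frame: int, box: Box, speed: float) -> None:
        # The identification information decides which node receives the record.
        node = self.nodes.setdefault(object_id, Node(object_id))
        node.records.append((frame, box, speed))

    def set_relevance(self, a: int, b: int, relevance: float) -> None:
        # Connection lines are undirected, so the key is stored in sorted order.
        self.edges[(min(a, b), max(a, b))] = relevance


graph = GraphModel()
graph.add_observation(7, 0, (10.0, 20.0, 40.0, 30.0), 62.5)
graph.add_observation(7, 1, (14.0, 20.0, 40.0, 30.0), 63.0)
graph.add_observation(9, 0, (90.0, 22.0, 38.0, 28.0), 61.0)
graph.set_relevance(9, 7, 0.8)
```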

FIG. 12 is a diagram illustrating a process of calculating the relevance between objects according to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the post-processing 530 includes a relevance calculation process 1200. The relevance calculation process 1200 selects the tracklet information of two different objects from the tracklet information and inputs it to the tracklet network 1220. The selected tracklet information of the two objects may include tracklet information extracted from a plurality of frames for each object.

The tracklet network 1220 calculates the similarity of the paths of the two objects and generates relevance information E. A high value of the relevance information E indicates that the paths of the two objects are highly similar, and a low value indicates that their similarity is low. The relevance information E may be defined as a value of 0 or more and 1 or less. If tracklet information for the same object is input to the tracklet network 1220, the relevance information E is 1, and if tracklet information for different objects is input to the tracklet network 1220, the relevance information E may be a value less than 1.
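The properties stated for the relevance information E (range 0 to 1, exactly 1 for identical input) can be illustrated with a hand-written stand-in. Note that this is not the trained tracklet network of the disclosure: the function `path_relevance` and its formula, 1 / (1 + mean distance between corresponding path points), are assumptions chosen only because they satisfy those properties.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def path_relevance(path_a: List[Point], path_b: List[Point]) -> float:
    """Stand-in similarity score in (0, 1]; identical paths give exactly 1.0.

    Paths are compared point by point over their common length.
    """
    n = min(len(path_a), len(path_b))
    if n == 0:
        raise ValueError("both paths need at least one point")
    mean_dist = sum(math.dist(a, b) for a, b in zip(path_a, path_b)) / n
    return 1.0 / (1.0 + mean_dist)


car_a = [(0.0, 0.0), (1.0, 0.0)]
car_b = [(0.0, 3.0), (1.0, 3.0)]   # same direction, three units to the side

same = path_relevance(car_a, car_a)
different = path_relevance(car_a, car_b)
```

Here `same` is 1.0 and `different` is 0.25, since the mean point distance is 3. A learned network would instead take the appearance and area information of both tracklets into account.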

FIG. 13 is a diagram illustrating a graph model generation process and a clustering process according to an embodiment of the present disclosure.

As described above, the graph model generation process 532 generates the graph model based on the tracklet information and the relevance information E. The graph model generation process 532 may change an attribute of the connection line between nodes based on the relevance information E. For example, the graph model generation process 532 may change the thickness or the color of a connection line to represent the relevance information E on that line. In addition, the graph model generation process 532 may place related nodes adjacent to each other and connect them according to the relevance information E, and may place unrelated nodes far apart without a connection line between them.

The clustering process 534 may cluster objects based on the relevance information E. For example, the clustering process 534 may define objects with high relevance as one cluster.
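A simple way to realize such clustering is to keep only the connection lines whose relevance reaches a threshold and to treat each connected group as a cluster. The disclosure does not name a clustering algorithm, so the union-find approach, the function name, and the threshold value of 0.5 below are all assumptions for illustration.

```python
from typing import Dict, List, Tuple


def cluster_by_relevance(
    num_objects: int,
    relevance: Dict[Tuple[int, int], float],
    threshold: float = 0.5,  # hypothetical cut-off for "high relevance"
) -> List[List[int]]:
    """Group objects 0..num_objects-1 into clusters of mutually connected objects."""
    parent = list(range(num_objects))

    def find(i: int) -> int:
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), e in relevance.items():
        if e >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters: Dict[int, List[int]] = {}
    for i in range(num_objects):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


relevance = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.1, (3, 4): 0.7}
clusters = cluster_by_relevance(5, relevance)
```

In this example objects 0, 1, and 2 form one cluster and objects 3 and 4 another, because the low relevance between objects 2 and 3 does not link the two groups.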

FIG. 14 is a diagram illustrating a result image according to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the identification information and speed information of an object may be inserted into an input frame. The result image 1400 may include a block 1410 indicating an object area, speed information 1420, and path information 1430. The identification information of the object may be represented by the color of the block 1410. The result image 1400 may show a block 1410, speed information 1420, and path information 1430 for each of the at least one detected object. The path information 1430 may indicate the path tracked during a preset time period.

The result image 1400 may be generated for every frame of the input video, or only for the frames of interest in which an event is detected by the traffic information collection server. The second processor 134 may generate the result image 1400 and output it through the output interface 136.

Meanwhile, the disclosed embodiments may be implemented in the form of a computer-readable recording medium that stores instructions and data executable by a computer. The instructions may be stored in the form of program code and, when executed by a processor, may generate a predetermined program module to perform a predetermined operation. Further, the instructions, when executed by a processor, may perform predetermined operations of the disclosed embodiments.

The disclosed embodiments have been described above with reference to the accompanying drawings. Those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be practiced in forms different from the disclosed embodiments without changing its technical spirit or essential features. The disclosed embodiments are exemplary and should not be construed as limiting.

100 traffic information collection system
110, 110a edge computing device
114 first processor
130 traffic information collection server
134 second processor
150 camera
226 first machine learning model
228 first speed analysis unit
242 second machine learning model
244 second speed analysis unit

Claims

An edge computing device comprising:
an input interface for obtaining an input video captured by a camera;
at least one first processor; and
a communication unit that communicates with a traffic information collection server,
wherein the at least one first processor:
inputs an input frame having a frame rate lower than the frame rate of the input video into a first machine learning model, and obtains, from the first machine learning model, identification information and speed information of at least one object detected in each frame,
detects, based on the identification information and speed information of each of the at least one object output from the first machine learning model, an event in which an object of interest whose speed exceeds a first reference value is detected, and
transmits event information, which is information on the detected event, and the input video to the traffic information collection server using the communication unit,
wherein the traffic information collection server includes a second machine learning model that receives the input video and generates identification information and speed information of each of at least one object, and
wherein the first machine learning model is a machine learning model in which a bypass path is applied to at least one layer of the second machine learning model.

(Deleted)

The edge computing device of claim 1,
wherein the first machine learning model receives an input frame having a lower frame rate than the input frame input to the second machine learning model.

The edge computing device of claim 1,
further comprising a video buffer for storing the input video,
wherein the communication unit transmits the input video stored in the video buffer to the traffic information collection server.

The edge computing device of claim 1,
wherein the at least one first processor obtains meta information including at least one of signal information, stop line information, or lane information, or a combination thereof, and
when detecting the event, detects the event based on the identification information and the speed information of each of the at least one object and the meta information.

The edge computing device of claim 1,
wherein the at least one first processor:
performs a process of generating, from the input frame, tracklet information defining an object location and an object area,
inputs the tracklet information into the first machine learning model,
obtains the identification information and the speed information of the at least one object output from the first machine learning model,
generates, using the identification information and the speed information of the at least one object, a graph model including information on each of the at least one object and relation information with other objects, and
generates cluster information of the at least one object by using the graph model and the relation information among the at least one object.

The edge computing device of claim 6,
wherein the at least one first processor, based on the cluster information, performs event detection processing for another object in the same cluster by using the result of event detection processing for one object.

The edge computing device of claim 1,
wherein the at least one first processor:
generates, when the event is not detected from the input frame, event non-detection information indicating that the event has not been detected, and
transmits, when the event is not detected, the event non-detection information to the traffic information collection server at a predetermined period.

A method of controlling an edge computing device, the method comprising:
acquiring an input video captured by a camera;
inputting an input frame having a frame rate lower than the frame rate of the input video into a first machine learning model, and obtaining, from the first machine learning model, identification information and speed information of at least one object detected in each frame;
detecting, based on the identification information and speed information of each of the at least one object output from the first machine learning model, an event in which the speed of an object exceeds a first reference value; and
transmitting event information, which is information on the detected event, and the input video to a traffic information collection server,
wherein the traffic information collection server includes a second machine learning model that receives the input video and generates identification information and speed information of each of at least one object, and
wherein the first machine learning model is a machine learning model in which a bypass path is applied to at least one layer of the second machine learning model.

A traffic information collection server comprising:
a communication unit communicating with at least one edge computing device;
at least one second processor; and
an output interface,
wherein the at least one second processor:
receives, from the at least one edge computing device through the communication unit, an input video and event information including information on an event in which the speed of an object exceeds a first reference value,
extracts a frame-of-interest section corresponding to the event information from the input video,
inputs the input frames of the frame-of-interest section into a second machine learning model, and obtains, from the second machine learning model, identification information and speed information of at least one object detected in each frame,
detects, based on the identification information and speed information of each of the at least one object output from the second machine learning model, an event in which an object of interest whose speed exceeds a second reference value is detected, and
outputs, through the output interface, event information, which is information on the event in which the object of interest whose speed exceeds the second reference value is detected, and image information of the frame-of-interest section,
wherein the at least one edge computing device includes a first machine learning model that receives the input video and generates identification information and speed information of each of at least one object, and
wherein the first machine learning model is a machine learning model in which a bypass path is applied to at least one layer of the second machine learning model.

(Deleted)

The traffic information collection server of claim 10,
wherein the first machine learning model receives an input frame having a lower frame rate than the input frame input to the second machine learning model.

The traffic information collection server of claim 10,
wherein the at least one second processor acquires meta information including at least one of signal information, stop line information, or lane information, or a combination thereof, and
when detecting the event, detects the event based on the identification information and the speed information of each of the at least one object and the meta information.

The traffic information collection server of claim 10,
wherein the at least one second processor:
performs a process of generating, from the input frame, tracklet information defining an object location and an object area,
inputs the tracklet information into the second machine learning model,
obtains the identification information and the speed information of the at least one object output from the second machine learning model,
generates, using the identification information and the speed information of the at least one object, a graph model including information on each of the at least one object and relation information with other objects, and
generates cluster information of the at least one object by using the graph model and the relation information among the at least one object.

The traffic information collection server of claim 14,
wherein the at least one second processor, based on the cluster information, performs event detection processing for another object in the same cluster by using the result of event detection processing for one object.

The traffic information collection server of claim 10,
wherein the communication unit receives, from the at least one edge computing device, event non-detection information indicating that no event has been detected, and
the at least one second processor outputs the event non-detection information through the output interface.

The traffic information collection server of claim 10,
wherein the at least one second processor:
generates, when the event is not detected from the frame of interest, event non-detection information indicating that the event has not occurred, and
outputs the event non-detection information through the output interface.

A method of controlling a traffic information collection server, the method comprising:
receiving, from at least one edge computing device, an input video and event information including information on an event in which the speed of an object exceeds a first reference value;
extracting a frame-of-interest section corresponding to the event information from the input video;
inputting the input frames of the frame-of-interest section into a second machine learning model, and obtaining, from the second machine learning model, identification information and speed information of at least one object detected in each frame;
detecting, based on the identification information and speed information of each of the at least one object output from the second machine learning model, an event in which an object of interest whose speed exceeds a second reference value is detected; and
outputting event information, which is information on the event in which the object of interest whose speed exceeds the second reference value is detected, and image information of the frame-of-interest section,
wherein the at least one edge computing device includes a first machine learning model that receives the input video and generates identification information and speed information of each of at least one object, and
wherein the first machine learning model is a machine learning model in which a bypass path is applied to at least one layer of the second machine learning model.

A computer program stored in a recording medium, wherein the computer program includes at least one instruction for performing the method of controlling the edge computing device of claim 9 when executed by a processor.

A computer program stored in a recording medium, wherein the computer program comprises at least one instruction for performing the traffic information collection server control method of claim 18 when executed by a processor.