KR20230138385A

KR20230138385A - Object counting system utilizing multi-object tracking

Info

Publication number: KR20230138385A
Application number: KR1020220144027A
Authority: KR
Inventors: 원형필; 전현일; 오재현; 정성훈; 김덕용; 이승수
Original assignee: 주식회사 일루베이션
Priority date: 2022-03-23
Filing date: 2022-11-01
Publication date: 2023-10-05
Also published as: KR102604809B9; KR102604809B1

Abstract

The present invention relates to an object counting system utilizing multi-object tracking. The system includes a photographing unit, installed in a movable or fixed manner, that photographs the objects to be counted, and an object counting device that receives the captured video from the photographing unit, extracts a plurality of frames from the video at regular time intervals, detects one or more objects in each extracted frame, and tracks the detected objects to count them. When tracking objects, the object counting device tracks the objects of the previous frame with reference to the objects of the current frame.

Description

Object counting system utilizing multi-object tracking

The present invention relates to an object counting system utilizing multi-object tracking. More specifically, it relates to an object counting system that works from video of the objects to be counted and uses a multi-object tracking technique that tracks objects at the previous time point with reference to the current time point, which enables accurate counting and simplifies the tracking process for better efficiency.

Recently, with the Fourth Industrial Revolution, efforts have been made to apply data-driven ICT to livestock farms in order to address labor shortages and an aging workforce and to improve productivity.

Pig farms classify pigs by growth stage into nursing pigs, piglets, growing pigs, and finishing pigs, and manage each stage in a separate pig house to improve growth efficiency.

Here, 4-week-old piglets (about 30 kg) are sold to other pig farms for 120,000 to 130,000 won per head. More than 200 piglets are sold at a time, and the price is calculated by counting the piglets at the time of sale.

Counting 200 or more piglets takes one to two hours or longer. Because piglets are highly active, the count has to be repeated at least twice for accuracy, and the head count comes out different each time, so discrepancies between the supplier and the buyer occur frequently.

Moreover, because accurate counting is difficult, head-count errors lead to price differences of several hundred thousand won or more, causing losses for pig farms.

Therefore, a counting technology that counts automatically and with high accuracy needs to be developed.

Korean Patent No. 10-2129042, Machine learning-based device and method for counting targets in video (registered 2020.06.25)

To solve the above problems, an object of the present invention is to provide an object counting system utilizing multi-object tracking that works from video of the objects to be counted and tracks objects at the previous time point with reference to the current time point, thereby enabling accurate counting and simplifying the tracking process for better efficiency.

To achieve the above object, an object counting system utilizing multi-object tracking according to an embodiment of the present invention includes a photographing unit, installed in a movable or fixed manner, that photographs the objects to be counted, and an object counting device that receives the captured video from the photographing unit, extracts a plurality of frames from the video at regular time intervals, detects one or more objects in each extracted frame, and tracks the detected objects to count them. When tracking objects, the object counting device tracks the objects of the previous frame with reference to the objects of the current frame.

The system may further include an output unit that outputs object counting information.

Here, the object counting device may include: a frame extraction unit that extracts a plurality of frames from the video at regular time intervals; an object detection unit that detects objects in each frame by means of bounding boxes and assigns identification information to each detected object; an object tracking unit that, for each object of the current frame, estimates and matches the same object among the objects of the previous frame to generate object tracking information; and an object counting unit that counts the objects using the object tracking information.

The object tracking unit may include: an object embedding unit that extracts the object regions of the objects detected in the current frame and in the previous frame to generate a current object feature map and a previous object feature map; a matching score extraction unit that uses the current and previous object feature maps to extract matching scores between the objects of the current frame and those of the previous frame; and an object matching unit that, based on the matching scores, matches each object of the current frame with the object of the previous frame estimated to be the same object, thereby generating the object tracking information.

The object embedding unit may include: a patch division unit that extracts the object regions from the current frame and the previous frame using the bounding boxes and divides each extracted object region into a plurality of patches; a feature extraction unit that extracts features from each patch to generate a per-patch feature map; and a dimension reduction unit that compresses the per-patch feature maps into a single feature map, producing the current object feature map for the current frame and the previous object feature map for the previous frame.

The matching score extraction unit extracts, for each object of the current frame, the matching scores against the objects of the previous frame through Equation 1 below, and thereby generates matching score data (eⁱ) for each object of the current frame.

[Equation 1]

[Equation 2]

(Here, Q_i is the i-th object of the current frame, K_j is the j-th object of the previous frame, n is the total number of objects, and T denotes the transpose.)

The object matching unit matches each object of the current frame with the object of the previous frame that has a high matching score, through Equations 3 and 4 below.

[Equation 3]

(Here, wⁱ is the probability value of the i-th object of the current frame, and eⁱ is the matching score data of the i-th object of the current frame.)

[Equation 4]

(Here, O_i is the object of the previous frame matched to the i-th object of the current frame, w_i is the probability value of the i-th object of the current frame, and V_j is the j-th object of the previous frame.)
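The symbol definitions above (Q_i, K_j, the transpose T, a probability value w, and V_j) are consistent with an attention-style matching step. The following is a minimal sketch under that reading, not the equations of the specification: it assumes the score is the dot product of Q_i and K_j, that the probability value is a softmax over those scores, and that O_i is the probability-weighted sum of the V_j. The function names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def match_objects(queries, keys, values):
    """For each current-frame object, score it against every previous-frame object.

    queries: feature vectors Q_i of the current frame
    keys, values: feature vectors K_j and V_j of the previous frame
    Returns one (best_j, weights, blended) tuple per current-frame object.
    """
    results = []
    for q in queries:
        scores = [dot(q, k) for k in keys]        # assumed form: e_ij = Q_i . K_j
        weights = softmax(scores)                 # assumed form: w_i = softmax(e_i)
        blended = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]                                         # assumed form: O_i = sum_j w_ij V_j
        best_j = max(range(len(weights)), key=weights.__getitem__)
        results.append((best_j, weights, blended))
    return results
```

In this sketch `best_j` is the index of the previous-frame object with the highest probability, which is the object a tracker would treat as the same individual.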

The object counting unit distinguishes an observation area from a counting area and, using the object tracking information, counts the objects that pass from the observation area into the counting area.
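The counting rule can be sketched as follows. This is an illustration only: it assumes the two areas are separated by a horizontal line at `boundary_y` in image coordinates and that each tracked object is reported as `(track_id, center_x, center_y)`. Both the class name `CrossingCounter` and this input format are hypothetical.

```python
def region_of(center_y, boundary_y):
    """Classify a box center as lying in the observation or the counting area."""
    return "counting" if center_y >= boundary_y else "observation"


class CrossingCounter:
    """Counts each tracked object once, when it moves from observation to counting."""

    def __init__(self, boundary_y):
        self.boundary_y = boundary_y
        self.last_region = {}   # track_id -> region seen in the previous frame
        self.counted = set()    # track_ids that have already been counted

    def update(self, tracked_objects):
        """Process one frame of (track_id, center_x, center_y) and return the total."""
        for track_id, _center_x, center_y in tracked_objects:
            region = region_of(center_y, self.boundary_y)
            previous = self.last_region.get(track_id)
            if previous == "observation" and region == "counting":
                self.counted.add(track_id)  # a set prevents double counting
            self.last_region[track_id] = region
        return len(self.counted)
```

Because the count is keyed on the tracking identifier, an object that wanders back into the observation area and crosses again is not counted twice.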

The object counting system utilizing multi-object tracking according to the embodiment of the present invention described above works from video of the objects to be counted and tracks objects at the previous time point with reference to the current time point, which prevents the same individual from being counted twice and thus enables accurate counting.

In addition, because objects at the previous time point are tracked with reference to the current time point, there are no newly discovered objects to handle, unlike tracking objects at the current time point with reference to a previous (past) time point. This simplifies the tracking process and improves efficiency.

Furthermore, the system is applicable to both moving and non-moving objects, so it is expected to be usable in various fields beyond livestock such as pigs.

FIG. 1 is a block diagram showing an object counting system utilizing multi-object tracking according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the object counting device of FIG. 1.
FIG. 3 is an example diagram showing how objects of the previous frame are matched with reference to the objects of the current frame in the object tracking unit of FIG. 2.
FIG. 4 is a block diagram showing the object tracking unit of FIG. 2.
FIG. 5 is a block diagram showing the object embedding unit of FIG. 4.
FIG. 6 is an example diagram showing the objects of the current frame and the previous frame used in the object tracking unit of FIG. 2.
FIG. 7 is an example diagram showing the process of deriving matching scores between the objects of the current frame and the previous frame in the matching score extraction unit of FIG. 4.
FIG. 8 is an example diagram showing the observation area and the counting area used when counting objects in the object counting unit of FIG. 2.

The following description of the present invention with reference to the drawings is not limited to specific embodiments; various modifications may be made and various embodiments are possible. The content described below should be understood to include all modifications, equivalents, and substitutes that fall within the spirit and technical scope of the present invention.

In the following description, terms such as first and second are used to describe various components. They carry no limiting meaning in themselves and are used only to distinguish one component from another.

Like reference numerals used throughout this specification refer to like components.

Singular expressions used in the present invention include plural expressions unless the context clearly indicates otherwise. Terms such as "include," "comprise," or "have" used below are to be interpreted as specifying the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood as not excluding in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the technical field to which the present invention pertains. Terms defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless explicitly so defined in the present application.

In addition, terms such as "...unit," "...device," and "...module" used in the specification refer to a unit that processes at least one function or operation, and such a unit may be implemented in hardware, in software, or in a combination of hardware and software.

Hereinafter, an object counting system utilizing multi-object tracking according to an embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram showing an object counting system utilizing multi-object tracking according to an embodiment of the present invention; FIG. 2 is a block diagram showing the object counting device of FIG. 1; FIG. 3 is an example diagram showing how objects of the previous frame are matched with reference to the objects of the current frame in the object tracking unit of FIG. 2; FIG. 4 is a block diagram showing the object tracking unit of FIG. 2; FIG. 5 is a block diagram showing the object embedding unit of FIG. 4; FIG. 6 is an example diagram showing the objects of the current frame and the previous frame used in the object tracking unit of FIG. 2; FIG. 7 is an example diagram showing the process of deriving matching scores between the objects of the current frame and the previous frame in the matching score extraction unit of FIG. 4; and FIG. 8 is an example diagram showing the observation area and the counting area used when counting objects in the object counting unit of FIG. 2.

Referring to FIG. 1, an object counting system utilizing multi-object tracking according to an embodiment of the present invention (hereinafter, the "object counting system") may include a photographing unit (1), an object counting device (2), and an output unit (3).

First, the photographing unit (1) photographs the objects to be counted and may be installed in a movable or fixed manner. Because the present invention aims to count mobile objects, namely livestock, it is preferable, though not required, to install the unit in a fixed manner so that it photographs the livestock as they move.

More specifically, it may be preferable to install the photographing unit (1) in a fixed manner when the objects are mobile, and in a movable manner when the objects are stationary.

Because the photographing unit (1) is installed in a movable or fixed manner and photographs the objects through movement, there is no need to photograph the entire area so that all objects appear at once. Counting is possible by photographing only a specific area, which benefits applicability, usability, and ease of installation.

That is, objects that enter the specific area photographed by the photographing unit (1) as a result of movement can be counted.

In the present invention, the photographing unit (1) may be installed in a passage through which the objects (livestock) move, and may be mounted above the passage so as to photograph the objects in top view.

A real-time camera such as a CCTV (closed-circuit television) camera may be used as the photographing unit (1), but the unit is not limited thereto.

The object counting device (2) receives the video in which the objects are captured from the photographing unit (1), extracts a plurality of frames from the video at regular time intervals, detects one or more objects in each extracted frame, and tracks the detected objects to count them.

When tracking objects, the object counting device (2) tracks the objects of the previous frame with reference to the objects of the current frame among the plurality of frames. In other words, it tracks objects at the previous time point with reference to the current time point.

Unlike tracking objects at the current time point with reference to a previous (past) time point, this approach encounters no newly discovered objects, which simplifies the tracking process and improves efficiency.

Specifically, referring to FIG. 2, the object counting device (2) may include a database (20), a frame extraction unit (21), an object detection unit (22), an object tracking unit (23), and an object counting unit (24).

The database (20) may store the video received from the photographing unit (1), the extracted frames, the object counting information, and so on. It is not limited to these and may store all information used and generated in the system of the present invention.

The frame extraction unit (21) extracts a plurality of frames from the video received from the photographing unit (1), preferably at regular time intervals.

The frame extraction unit (21) may extract the frames from the video by capture using the Python OpenCV library, but is not limited thereto.
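As an illustration of fixed-interval extraction, the sketch below samples one frame every `interval_s` seconds with OpenCV's `cv2.VideoCapture`. The function names, the 0.5-second default, and the 30 fps fallback are assumptions made for the example; the specification does not state an interval.

```python
def sample_indices(total_frames, fps, interval_s):
    """Indices of the frames kept when sampling every `interval_s` seconds."""
    if fps <= 0 or interval_s <= 0:
        raise ValueError("fps and interval_s must be positive")
    step = max(1, round(fps * interval_s))  # frames between two kept frames
    return list(range(0, total_frames, step))


def extract_frames(video_path, interval_s=0.5):
    """Yield (frame_index, image) pairs sampled at a fixed time interval."""
    import cv2  # opencv-python; imported here so sample_indices works without it

    cap = cv2.VideoCapture(video_path)
    try:
        fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back when FPS is unknown
        step = max(1, round(fps * interval_s))
        index = 0
        while True:
            ok, frame = cap.read()
            if not ok:          # end of stream or read error
                break
            if index % step == 0:
                yield index, frame
            index += 1
    finally:
        cap.release()
```

For a 30 fps stream and a 0.1-second interval, `sample_indices(10, 30.0, 0.1)` keeps frames 0, 3, 6, and 9.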

The object detection unit (22) detects objects in each frame by means of bounding boxes (BB) and assigns identification information to each detected object.

For this purpose, a YOLO model, an SSD model, an R-CNN model, or the like may be applied to the object detection unit (22), but it is not limited thereto. The object detection unit (22) is described in more detail below, taking the case where a YOLO model is applied as an example.

With the YOLO model applied, the object detection unit (22) detects objects by extracting from the frame the bounding boxes (BB) of the objects (livestock) to be counted.

The YOLO model detects the objects in an image and marks their positions with bounding boxes. It applies a single neural network to the whole image in a grid fashion, so it can take surrounding context into account. Compared with existing classifier-based object detection techniques such as R-CNN and Fast R-CNN, it offers superior detection accuracy and detects objects very efficiently and quickly, which makes real-time object detection possible.

More specifically, the YOLO model may consist of a single convolutional neural network (CNN), and the CNN may be composed of convolution layers (Conv) and fully connected layers (FC).

One or more convolution layers (Conv) extract the features of the frame: the frame is divided into an S x S grid, weights are applied to the frame, and a feature map is generated through the convolution operation. A single convolution layer (Conv) may be applied repeatedly, shifting its position across the pixels or grid cells of the frame, to extract features from the frame.

The group of weights used here may be called a weight kernel. The weight kernel may be an n x m x d three-dimensional matrix that traverses the frame at a specified stride and generates a feature map through the convolution operation.

If the frame is an image with multiple channels (for example, the three channels of HSV), the weight kernel traverses each channel of the frame, performs the convolution, and generates a feature map for each channel.

Here, n is the number of rows and m the number of columns of a region of a specific size within the frame, and d is the number of channels of the frame.
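The traversal described above can be sketched in plain Python as follows. This is an illustration with nested lists rather than a tensor library, and the function name `convolve_per_channel` is hypothetical. As in most CNN code, the kernel is not flipped, so the operation is strictly a cross-correlation. Standard convolution layers additionally sum the per-channel maps into one output map; that step is omitted here to mirror the per-channel description.

```python
def convolve_per_channel(image, kernel, stride=1):
    """Slide an n x m x d weight kernel over an H x W x d image without padding.

    image[r][c][ch] and kernel[i][j][ch] are nested lists of numbers.
    Returns one 2-D feature map per channel.
    """
    height, width, depth = len(image), len(image[0]), len(image[0][0])
    n, m, d = len(kernel), len(kernel[0]), len(kernel[0][0])
    if d != depth:
        raise ValueError("kernel depth must equal the number of image channels")

    maps = []
    for ch in range(depth):
        feature_map = []
        for top in range(0, height - n + 1, stride):
            row = []
            for left in range(0, width - m + 1, stride):
                # Weighted sum of the n x m window under the kernel.
                acc = sum(
                    image[top + i][left + j][ch] * kernel[i][j][ch]
                    for i in range(n)
                    for j in range(m)
                )
                row.append(acc)
            feature_map.append(row)
        maps.append(feature_map)
    return maps
```

With a 3 x 3 single-channel image holding the values 1 to 9 and a 2 x 2 kernel of ones, each output cell is the sum of one 2 x 2 window of the image.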

The fully connected layer (FC) uses the generated feature map to predict one or more bounding boxes and the class probabilities for an object.

Here, a bounding box is described by the values (x, y, w, h, c), where x and y are the coordinates of the center of the box, w and h are its width and height, and c is its confidence score.

The class probability is the conditional probability that the object belongs to a given class, given that an object is present in the grid cell.

The fully connected layer (FC) also uses the bounding box coordinates and class probabilities to select the bounding boxes that correspond to actual objects, so that only the bounding boxes (BB) of the objects to be counted (livestock) are extracted. To do so, it uses the class-specific confidence score and the IoU (Intersection over Union).

The class-specific confidence score is obtained by multiplying the confidence score of the bounding box by the class probability, and the IoU (Intersection over Union) is obtained as the area of the intersection divided by the area of the union.
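Both quantities are simple to compute for boxes in the center format (x, y, w, h). The sketch below also shows one common way to combine them, greedy non-maximum suppression; that selection rule and the two thresholds are assumptions for the example, since the text does not spell out the exact procedure.

```python
def to_corners(box):
    """Convert a center-format box (x, y, w, h) to (x1, y1, x2, y2)."""
    x, y, w, h = box
    return x - w / 2, y - h / 2, x + w / 2, y + h / 2


def iou(box_a, box_b):
    """Intersection area divided by union area of two center-format boxes."""
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def class_specific_confidence(confidence, class_prob):
    """Box confidence score multiplied by the class probability."""
    return confidence * class_prob


def select_boxes(detections, score_thr=0.5, iou_thr=0.5):
    """Greedy non-maximum suppression (assumed selection rule).

    detections: list of (box, confidence, class_prob) for the target class.
    Returns the kept (score, box) pairs, highest score first.
    """
    scored = [(class_specific_confidence(c, p), box) for box, c, p in detections]
    scored = [item for item in scored if item[0] >= score_thr]
    scored.sort(key=lambda item: item[0], reverse=True)
    kept = []
    for score, box in scored:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(box, kept_box) < iou_thr for _, kept_box in kept):
            kept.append((score, box))
    return kept
```

Two boxes that cover the same animal overlap heavily, so only the higher-scoring one survives, which is what keeps a single animal from producing several detections.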

In this way, the object detection unit (22) uses the YOLO model to extract the bounding boxes (BB) of the objects corresponding to the livestock to be counted.

Having detected the objects through their bounding boxes (BB), the object detection unit (22) labels each bounding box (BB) with identification information, thereby assigning identification information to each detected object.

To identify the same object across frames and prevent duplicate counting, the object tracking unit (23) tracks, as shown in FIG. 3, the objects of the previous frame at time t-1 with reference to the objects of the current frame at time t among the frames in which objects have been detected. For each object of the current frame, it estimates and matches the same object among the objects of the previous frame to generate object tracking information.

Referring to FIG. 4, the object tracking unit (23) may include an object embedding unit (230), a matching score extraction unit (231), and an object matching unit (232).

The object embedding unit (230) extracts the object regions of the objects detected in the current frame and in the previous frame to generate a current object feature map and a previous object feature map. In other words, it extracts features of the detected objects so that the same objects can be matched between the current frame and the previous frame.

Referring to FIG. 5, the object embedding unit (230) may include a patch division unit (230a), a feature extraction unit (230b), and a dimension reduction unit (230c).

The patch division unit (230a) extracts the object regions from the current frame and the previous frame using the bounding boxes (BB) and divides each extracted object region into a plurality of patches.

First, the patch division unit (230a) extracts the object region by filling with '0' every area of the current frame and the previous frame other than the objects detected by bounding boxes (BB).
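A minimal sketch of this masking step, assuming a single-channel frame stored as a nested list and boxes given in the center format (x, y, w, h) in pixel units. The function name `mask_outside_boxes` is hypothetical.

```python
def mask_outside_boxes(frame, boxes):
    """Return a copy of `frame` with every pixel outside all boxes set to 0."""
    height, width = len(frame), len(frame[0])
    keep = [[False] * width for _ in range(height)]
    for x, y, w, h in boxes:
        # Clip each box to the frame so partially visible objects are handled.
        left = max(0, int(round(x - w / 2)))
        top = max(0, int(round(y - h / 2)))
        right = min(width, int(round(x + w / 2)))
        bottom = min(height, int(round(y + h / 2)))
        for r in range(top, bottom):
            for c in range(left, right):
                keep[r][c] = True
    return [
        [frame[r][c] if keep[r][c] else 0 for c in range(width)]
        for r in range(height)
    ]
```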

Next, the patch division unit (230a) divides the current frame and the previous frame, with their object regions extracted, into pieces of equal size horizontally and vertically to form a plurality of patches.

For example, dividing both the width and the height into 8 pieces yields 8 x 8 = 64 patches. The division is not limited to this, and the size and number of patches may vary depending on the situation.
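The 8 x 8 split can be sketched as follows for a frame stored as a nested list. The function name and the requirement that the frame size be divisible by the grid size are assumptions made to keep the example short.

```python
def split_into_patches(frame, grid=8):
    """Split a 2-D frame into grid x grid equally sized patches, row by row."""
    height, width = len(frame), len(frame[0])
    if height % grid or width % grid:
        raise ValueError("frame size must be divisible by the grid size")
    patch_h, patch_w = height // grid, width // grid
    patches = []
    for grid_row in range(grid):
        for grid_col in range(grid):
            patch = [
                frame[r][grid_col * patch_w:(grid_col + 1) * patch_w]
                for r in range(grid_row * patch_h, (grid_row + 1) * patch_h)
            ]
            patches.append(patch)
    return patches
```

A 16 x 16 frame split with `grid=8` yields 64 patches of 2 x 2 pixels each.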

The feature extraction unit (230b) extracts features from each of the patches generated from the current frame and from the previous frame to produce per-patch feature maps. Because the current frame and the previous frame are processed identically, the description below refers to the current frame.

특징 추출부(230b)는 복수의 패치로 조각난 현재 프레임을 밀집 레이어(Dense layer, Dense)에 통과시켜 각 패치별로 특징을 추출하여 특징맵을 생성할 수 있다. 이때 특징맵은 256 사이즈로 출력될 수 있으나, 이에 한정되지는 않는다.The feature extraction unit 230b may generate a feature map by passing the current frame fragmented into a plurality of patches through a dense layer (Dense layer) and extracting features for each patch. At this time, the feature map may be output in size 256, but is not limited to this.

여기서 특징 추출부(230b)는 멀티 레이어 퍼셉트론 블록(Multi-Layer Perceptron Block)이 적용될 수 있으며, 멀티 레이어 퍼셉트론 블록(Multi-Layer Perceptron Block)은 밀집 레이어(Dense layer, Dense), 활성화 함수(Activation function), 밀집 레이어(Dense layer, Dense)로 구성될 수 있으나, 이에 한정되지는 않는다.Here, a multi-layer perceptron block may be applied to the feature extraction unit 230b, and the multi-layer perceptron block may be a dense layer (Dense layer) and an activation function. ), may be composed of a dense layer (Dense layer, Dense), but is not limited to this.

밀집 레이어(Dense layer, Dense)는 입력값(input)이 입력되면 특징을 추출하고 다중 곱과 합산과 같은 복잡한 연산을 통해 학습과 추론 과정을 실행하여 결과값(output)이 출력되도록 할 수 있다. 입력값(input)은 패치, 결과값(output)은 특징맵일 수 있다.Dense layer (Dense) extracts features when input is input, executes learning and inference processes through complex operations such as multiple multiplication and summation, and outputs the result (output). The input may be a patch, and the result may be a feature map.
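A multi-layer perceptron block of the form Dense, activation, Dense can be sketched in plain Python as follows. The source does not name the activation function, the hidden width, or the weight initialisation, so the ReLU activation, the hidden size of 32, and the random weights here are assumptions chosen only to make the example run; in practice the weights would be learned.

```python
import random


def dense(x, weights, bias):
    """Fully connected layer: y[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
            for j in range(len(bias))]


def relu(x):
    """Element-wise ReLU (the activation is an assumption of this sketch)."""
    return [max(0.0, v) for v in x]


def init_dense(n_in, n_out, rng):
    """Random placeholder weights and zero bias for an n_in -> n_out layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)]
               for _ in range(n_in)]
    return weights, [0.0] * n_out


def mlp_block(patch_vector, params):
    """Dense -> activation -> Dense, applied to one flattened patch."""
    (w1, b1), (w2, b2) = params
    return dense(relu(dense(patch_vector, w1, b1)), w2, b2)


rng = random.Random(0)
patch_dim, hidden_dim, feature_dim = 4, 32, 256
params = (init_dense(patch_dim, hidden_dim, rng),
          init_dense(hidden_dim, feature_dim, rng))

patch_vector = [0.0, 0.5, 1.0, 0.25]       # a flattened 2 x 2 patch
features = mlp_block(patch_vector, params)  # one feature vector of size 256
```

Applying `mlp_block` to each of the 64 patches would give 64 feature vectors of size 256, one per patch.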

The dimension reduction unit 230c may compress the per-patch feature maps of the current frame and of the previous frame into one feature map each. The result is the current object feature map for the current frame and the previous object feature map for the previous frame.

That is, the dimension reduction unit 230c may process 64 feature maps of size 256 element by element and reduce them to a single feature map of size 256. This single feature map of size 256 is the current object feature map or the previous object feature map, respectively.

To this end, a pooling layer may be applied to the dimension reduction unit 230c. The pooling layer reduces the spatial resolution of the generated feature maps, that is, it reduces the dimension of the feature maps, and can thereby reduce the complexity of the analysis problem.

Pooling layers include a max pooling layer (MaxPooling), to which a max pooling operator that takes the maximum of the values in a portion of the feature map is applied, and an average pooling layer (AveragePooling), to which an average pooling operator that takes the average is applied. In this embodiment the average pooling layer (AveragePooling) may be preferable, but the embodiment is not limited thereto.
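The element-wise reduction from 64 feature maps of size 256 to one feature map of size 256 can be sketched as follows. The function names and the synthetic input values are illustrative assumptions; both pooling variants mentioned above are shown for comparison.

```python
def average_pool(patch_features):
    """Element-wise mean over all patch feature vectors."""
    n = len(patch_features)
    return [sum(column) / n for column in zip(*patch_features)]


def max_pool(patch_features):
    """Element-wise maximum over all patch feature vectors."""
    return [max(column) for column in zip(*patch_features)]


# 64 synthetic patch feature vectors of size 256 (placeholder values).
patch_features = [[float(p + k) for k in range(256)] for p in range(64)]

object_feature = average_pool(patch_features)   # one vector of size 256
object_feature_max = max_pool(patch_features)   # alternative reduction
```

Average pooling lets every patch contribute to the object feature map, whereas max pooling keeps only the strongest response per element.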

The matching score extraction unit 231 may extract matching scores between the objects of the current frame and the objects of the previous frame using the current object feature map and the previous object feature map obtained from the object embedding unit 230.

Referring to FIG. 6, assume that four (n) objects (a, b, c, d) are detected in the current frame at time t and four objects (a', b', c', d') are detected in the previous frame at time t-1. Then, as shown in FIG. 7, a matching score can be obtained between each of the four objects (a, b, c, d) of the current frame and every one of the four objects (a', b', c', d') of the previous frame. Among the objects (a, b, c, d) of the current frame and the objects (a', b', c', d') of the previous frame, the pairs with high matching scores are estimated to be the same object, so that the estimated objects can be matched with each other.

More specifically, the matching score extraction unit 231 may use the current object feature map and the previous object feature map to extract, through Equation 1 below, the matching scores between each object (a, b, c, d) of the current frame and the objects (a', b', c', d') of the previous frame, and generate matching score data (eⁱ) for each object (a, b, c, d) of the current frame.

[Equation 1]

[Equation 2]

Here, Q_i is the i-th object of the current frame, K_j is the j-th object of the previous frame, n is the total number of objects, and T denotes the transpose.
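Equations 1 and 2 themselves are not reproduced in this text. Given that the definitions mention Q_i, K_j, and a transpose, one plausible reading is that the score between the i-th current object and the j-th previous object is the inner product of their feature maps, and that eⁱ collects the n scores for object i. The sketch below implements that reading only; it is an assumption, not the published formula.

```python
def matching_scores(current_features, previous_features):
    """Return a score matrix: row i is the assumed matching score data e^i.

    scores[i][j] is the inner product of current feature Q_i and previous
    feature K_j (assumed form of the missing Equations 1 and 2).
    """
    return [[sum(q * k for q, k in zip(q_i, k_j)) for k_j in previous_features]
            for q_i in current_features]


# Two toy objects per frame with 2-dimensional features.
current = [[1.0, 0.0], [0.0, 1.0]]     # Q_1, Q_2
previous = [[0.0, 2.0], [3.0, 0.0]]    # K_1, K_2
scores = matching_scores(current, previous)
```

In this toy case the first current object scores highest against the second previous object, and vice versa.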

The object matching unit 232 may generate object tracking information by matching, based on the matching scores, each object of the current frame with the object of the previous frame that is estimated to be the same object.

At this time, the object matching unit 232 may match objects between the current frame and the previous frame using a classification layer such as softmax or argmax, where softmax is a function that extracts probabilities and argmax is a function that extracts the maximum value (more precisely, the position of the maximum).

More specifically, the object matching unit 232 may match each object of the current frame with the object of the previous frame that has a high matching score, using Equations 3 and 4 below.

[Equation 3]

Here, w_i is the probability value of the i-th object of the current frame, and eⁱ is the matching score data of the i-th object of the current frame.

[Equation 4]

Here, O_i is the object of the previous frame that is matched to the i-th object of the current frame, w_i is the probability value of the i-th object of the current frame, and V_j is the j-th object of the previous frame.
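Equations 3 and 4 are likewise not reproduced in this text. A reading consistent with the definitions is that w_i is the softmax of the matching score data eⁱ and that O_i is the previous-frame object V_j with the highest probability in w_i. The sketch below follows that assumed reading; the function names and the identifier strings are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one row of matching scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def match_objects(score_rows, previous_ids):
    """For each current object, pick the most probable previous object.

    Returns a list of (matched previous id, probability) pairs, one per
    current object (assumed form of the missing Equations 3 and 4).
    """
    matches = []
    for e_i in score_rows:
        w_i = softmax(e_i)
        best_j = max(range(len(w_i)), key=w_i.__getitem__)  # argmax
        matches.append((previous_ids[best_j], w_i[best_j]))
    return matches


score_rows = [[0.0, 3.0], [2.0, 0.0]]   # one row per current object
matches = match_objects(score_rows, ["a'", "b'"])
```

Note that a row-wise argmax can assign two current objects to the same previous object; the source does not state whether a one-to-one constraint is enforced, so none is added here.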

The object counting unit 24 may count the number of objects using the object tracking information generated by the object tracking unit 23, and thereby generate object counting information.

For example, assume that, as shown in FIG. 8, a specific area photographed by the photographing unit 1 is divided by a dividing line (DL) into a counting area (CA) and an observation area (OA). The object counting unit 24 may then count objects by counting each object that is determined, according to the object tracking information, to have moved from the observation area (OA) to the counting area (CA). Objects determined to have moved from the counting area (CA) to the observation area (OA) are not counted, which prevents duplicate counting and can therefore improve counting accuracy.
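The one-directional counting rule can be sketched as follows. The geometry is an assumption made for illustration: the dividing line is taken to be horizontal at `line_y`, the counting area is taken to lie at or beyond it, and each matched object is reduced to the vertical position of its centre in the previous and current frames.

```python
def area_of(center_y, line_y):
    """Classify a position as counting area (CA) or observation area (OA)."""
    return "CA" if center_y >= line_y else "OA"


def count_entries(matched_pairs, line_y):
    """Count objects that moved from OA (previous frame) to CA (current frame).

    matched_pairs: list of (previous_y, current_y) for matched objects.
    Moves from CA back to OA, and moves within one area, are not counted.
    """
    return sum(
        1 for prev_y, curr_y in matched_pairs
        if area_of(prev_y, line_y) == "OA" and area_of(curr_y, line_y) == "CA"
    )


pairs = [(40, 60),   # OA -> CA: counted
         (70, 30),   # CA -> OA: not counted
         (10, 20),   # stays in OA
         (55, 80)]   # stays in CA
count = count_entries(pairs, line_y=50)
```

Because only OA to CA transitions increment the count, an object that later returns to the observation area does not reduce or repeat the tally in this sketch.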

The counting method performed by the object counting unit 24 is not limited to the above example and may be implemented in various ways.

The output unit 3 may be configured as a display, a terminal, or the like, and may receive the object counting information from the object counting device 2 and output it on a screen, so that the user can check the object counting information.

The output unit 3 is not limited to the above function. It may also output and display various information, such as the images captured by the photographing unit 1, and may perform various other functions, such as controlling the photographing unit 1 and the object counting device 2 by receiving control values as input.

As described above, the object counting system utilizing multi-object tracking according to an embodiment of the present invention uses a multi-object tracking technique that, based on images in which the objects to be counted are captured, tracks the objects of the previous time point with reference to the current time point. This prevents duplicate counting of the same object and thereby enables accurate counting.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, those of ordinary skill in the art to which the present invention pertains will understand that the present invention can be implemented in other specific forms without changing its technical idea or essential features. The embodiments described above are therefore illustrative in all respects and not restrictive.

1: Photographing unit
2: Object counting device
20: Database
21: Frame extraction unit
22: Object detection unit
23: Object tracking unit
230: Object embedding unit
230a: Patch division unit
230b: Feature extraction unit
230c: Dimension reduction unit
231: Matching score extraction unit
232: Object matching unit
24: Object counting unit

Claims

An object counting system utilizing multi-object tracking, comprising:
a photographing unit that is installed in a mobile or stationary manner to photograph objects to be counted; and
an object counting device that receives, from the photographing unit, an image in which the objects are captured, extracts a plurality of frames from the image at regular time intervals, detects one or more objects in each of the extracted frames, and tracks the detected objects to count the number of objects,
wherein the object counting device,
when tracking the objects, tracks the object of the previous frame with reference to the object of the current frame among the plurality of frames, and comprises:
a frame extraction unit that extracts the plurality of frames from the image at regular time intervals;
an object detection unit that detects an object through a bounding box in each frame and assigns identification information to the detected object;
an object tracking unit that generates object tracking information by estimating, for the object of the current frame, the same object among the objects of the previous frame and matching them; and
an object counting unit that counts the number of objects using the object tracking information,
wherein the object counting unit
distinguishes between an observation area and a counting area, counts, through the object tracking information, objects moving from the observation area to the counting area, and does not count objects determined to have moved from the counting area to the observation area,
wherein the object tracking unit comprises:
an object embedding unit that extracts object regions corresponding to the objects detected in the current frame and the previous frame, respectively, and generates a current object feature map and a previous object feature map;
a matching score extraction unit that extracts matching scores between the objects of the current frame and the previous frame using the current object feature map and the previous object feature map; and
an object matching unit that generates the object tracking information by matching, based on the matching scores, the object of the previous frame that is estimated to be the same object with the object of the current frame,
wherein the object embedding unit comprises:
a patch division unit that extracts an object region from each of the current frame and the previous frame using the bounding box, and divides the extracted object region into a plurality of patches;
a feature extraction unit that extracts features for each patch and generates a feature map; and
a dimension reduction unit that compresses the per-patch feature maps into one feature map, generating the current object feature map for the current frame and the previous object feature map for the previous frame,
wherein the matching score extraction unit
extracts, through Equation 1 below, the matching scores between each object of the current frame and the objects of the previous frame, and generates matching score data (eⁱ) for each object of the current frame,
wherein the patch division unit
extracts the object region by filling with '0' every part of the current frame and the previous frame other than the object detected by the bounding box, and divides each of the current frame and the previous frame, from which the object region has been extracted, into pieces of equal size horizontally and vertically to form the plurality of patches, and
wherein the object matching unit
matches the object of the previous frame having a high matching score with the object of the current frame using Equations 3 and 4 below.
[Equation 1]

[Equation 2]

(Here, Q_i is the i-th object of the current frame, K_j is the j-th object of the previous frame, n is the total number of objects, and T denotes the transpose.)
[Equation 3]

(Here, w_i is the probability value of the i-th object of the current frame, and eⁱ is the matching score data of the i-th object of the current frame.)
[Equation 4]

(Here, O_i is the object of the previous frame that is matched to the i-th object of the current frame, w_i is the probability value of the i-th object of the current frame, and V_j is the j-th object of the previous frame.)