KR102376479B1

KR102376479B1 - Method, device and system for controlling for automatic recognition of object based on artificial intelligence

Info

Publication number: KR102376479B1
Application number: KR1020210165486A
Authority: KR
Inventors: 노진식
Original assignee: (주)혜성에스앤피; 노진식
Priority date: 2021-11-26
Filing date: 2021-11-26
Publication date: 2022-03-18

Abstract

Provided is a CCTV control method for artificial intelligence (AI)-based automatic object recognition to enable a manager to perform video monitoring effectively. According to one embodiment of the present invention, a method of controlling a CCTV for AI-based automatic object recognition is executed by a device. The method comprises the following steps: acquiring first video information generated by capturing a first area from a first CCTV when the first area is being photographed through the first CCTV; extracting an image of the video most recently acquired from the first video information as a first image; generating a first input signal by encoding the first image; inputting the first input signal to an artificial neural network and acquiring a first output signal on the basis of a result of the input of the artificial neural network; detecting whether an object exists in the first area on the basis of the first output signal; detecting a type of object located in the first area on the basis of the first output signal when the object exists in the first area; recognizing a state in which a first object is located in the first area when the object located in the first area is detected as the first object; and transmitting the first video information acquired after extraction of the first image to a manager terminal when it is recognized that the first object is located in the first area, so that the first video information is displayed on a screen of the manager terminal.

Description

CCTV control method, device and system for automatic object recognition based on artificial intelligence {METHOD, DEVICE AND SYSTEM FOR CONTROLLING FOR AUTOMATIC RECOGNITION OF OBJECT BASED ON ARTIFICIAL INTELLIGENCE}

The following embodiments relate to technology for controlling CCTVs for automatic object recognition based on artificial intelligence.

CCTV (Closed-Circuit Television) is being installed in ever greater numbers to prevent theft and intrusion. Residents are submitting a flood of requests for CCTV installation in parks and alleys, and these CCTVs also play a major role in crime prevention and in detecting natural disasters.

However, the biggest drawback of existing CCTV is that it is used mainly as an after-the-fact measure once a crime, theft, or intrusion has occurred. That is, at the moment a theft or intrusion takes place, recording through CCTV alone is not sufficient for an immediate response.

Monitoring staff can of course watch CCTV footage around the clock, but monitoring the footage from a large number of CCTVs 24 hours a day is close to impossible. Moreover, most CCTVs installed in houses or apartment complexes serve only as an after-the-fact measure, and no monitoring staff are assigned to them.

Therefore, research is needed on a way to perform video monitoring efficiently: when no object is recognized in a specific area, the video captured through the CCTV is not provided to the manager, and only when an object is recognized in the specific area is the video captured through the CCTV provided to the manager, starting from the point in time at which the object was recognized.

Korean Patent No. 10-1470314; Korean Patent No. 10-1470316; Korean Patent No. 10-1765722; Korean Patent No. 10-1942808

According to an embodiment, the object is to provide a CCTV control method, device, and system for artificial intelligence-based automatic object recognition that: when a first area is being captured through a first CCTV, acquires from the first CCTV first video information generated by capturing the first area; extracts the most recently acquired image in the first video information as a first image; encodes the first image to generate a first input signal; inputs the first input signal to an artificial neural network and acquires a first output signal based on the result of that input; detects, based on the first output signal, whether an object exists in the first area; when an object is detected in the first area, detects the type of the object located in the first area based on the first output signal; when the object located in the first area is detected as a first object, recognizes a state in which the first object is located in the first area; and, when that state is recognized, transmits the first video information acquired after extraction of the first image to a manager terminal and controls the first video information to be displayed on the screen of the manager terminal.

The objects of the present invention are not limited to the object mentioned above, and other objects not mentioned will be clearly understood from the description below.

According to an embodiment, there is provided a CCTV control method for artificial intelligence-based automatic object recognition, performed by a device, the method comprising: when a first area is being captured through a first CCTV, acquiring from the first CCTV first video information generated by capturing the first area; extracting the most recently acquired image in the first video information as a first image; generating a first input signal by encoding the first image; inputting the first input signal to an artificial neural network and acquiring a first output signal based on the result of that input; detecting, based on the first output signal, whether an object exists in the first area; when an object is detected in the first area, detecting the type of the object located in the first area based on the first output signal; when the object located in the first area is detected as a first object, recognizing a state in which the first object is located in the first area; and, when it is recognized that the first object is located in the first area, transmitting the first video information acquired after extraction of the first image to a manager terminal and controlling the first video information to be displayed on the screen of the manager terminal.

The CCTV control method for artificial intelligence-based automatic object recognition may further comprise: when it is recognized that the first object is located in the first area, tracking and analyzing the movement of the first object based on the first video information acquired after extraction of the first image; when the tracking and analysis confirms that the first object has left the first area and moved toward a second area, identifying a second CCTV that is capturing the second area; when the second area is being captured through the second CCTV, acquiring from the second CCTV second video information generated by capturing the second area; and transmitting the second video information to the manager terminal and controlling the second video information to be displayed on the screen of the manager terminal following the first video information.

The acquiring of the first video information may comprise: when, in the first CCTV, a first camera is installed in a first direction of the first area, a second camera is installed in a second direction of the first area, and a third camera is installed in a third direction of the first area, controlling only the first camera to operate and capture video; when the first area is being captured from the first direction through the first camera, acquiring from the first camera 1-1 video information generated by capturing the first area from the first direction; extracting the most recently acquired image in the 1-1 video information as a 1-1 image; generating a 1-1 input signal by encoding the 1-1 image; inputting the 1-1 input signal to the artificial neural network and acquiring a 1-1 output signal based on the result of that input; detecting, based on the 1-1 output signal, whether an object exists in the first area; when no object is detected in the first area, controlling only the first camera to operate and capture video; when an object is detected in the first area, detecting the number of objects located in the first area based on the 1-1 output signal; when the number of objects located in the first area is determined to be smaller than a preset reference value, controlling the first camera and the second camera to operate and capture video; and when the number of objects located in the first area is determined to be greater than the reference value, controlling the first camera, the second camera, and the third camera to operate and capture video.
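The camera-activation rule described above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the patented implementation: the function name `cameras_to_activate` and the camera labels are hypothetical, and the case where the object count equals the reference value is not addressed in the text, so the sketch treats it like the "smaller than" case as an explicit assumption.

```python
from typing import List


def cameras_to_activate(object_count: int, reference_value: int) -> List[str]:
    """Return the cameras that should record for a given detected object count."""
    if object_count <= 0:
        # No object detected: only the first camera keeps recording.
        return ["camera_1"]
    if object_count > reference_value:
        # More objects than the reference value: all three cameras record.
        return ["camera_1", "camera_2", "camera_3"]
    # Fewer objects than the reference value: first and second cameras record.
    # Assumption: a count equal to the reference value is handled the same way,
    # because the source text does not specify that case.
    return ["camera_1", "camera_2"]
```

With a reference value of 5, for example, an empty scene keeps one camera active, two detected objects activate two cameras, and eight detected objects activate all three.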

According to an embodiment, when no object is recognized in a specific area, the video captured through the CCTV is not provided to the manager; only when an object is recognized in the specific area is the video captured through the CCTV provided to the manager, starting from the point in time at which the object was recognized. This has the effect of making video monitoring efficient.

Meanwhile, the effects of the embodiments are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those of ordinary skill in the art from the description below.

FIG. 1 is a diagram schematically showing the configuration of a system according to an embodiment.
FIG. 2 is a flowchart illustrating a process of controlling a CCTV for artificial intelligence-based automatic object recognition according to an embodiment.
FIG. 3 is a flowchart illustrating a process of tracking a moving object across a plurality of CCTVs according to an embodiment.
FIG. 4 is a diagram showing a first CCTV according to an embodiment.
FIG. 5 is a flowchart illustrating a process of controlling the operation of CCTV cameras according to the number of objects according to an embodiment.
FIG. 6 is a flowchart illustrating a process of obtaining result video information corresponding to request information according to an embodiment.
FIG. 7 is a flowchart illustrating a process of extracting and generating second result video information and third result video information according to an embodiment.
FIG. 8 is a diagram for explaining an artificial neural network according to an embodiment.
FIG. 9 is an exemplary diagram of the configuration of a device according to an embodiment.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings. However, since various changes may be made to the embodiments, the scope of rights of the patent application is not restricted or limited by these embodiments. It should be understood that all modifications, equivalents, and substitutes for the embodiments are included in the scope of rights.

Specific structural or functional descriptions of the embodiments are disclosed for purposes of illustration only and may be changed and implemented in various forms. Accordingly, the embodiments are not limited to a specific disclosed form, and the scope of the present specification includes changes, equivalents, or substitutes that fall within its technical spirit.

Terms such as first or second may be used to describe various components, but these terms should be interpreted only as distinguishing one component from another. For example, a first component may be termed a second component, and similarly, a second component may be termed a first component.

When a component is referred to as being "connected" to another component, it may be directly connected or coupled to the other component, but it should be understood that another component may exist in between.

The terms used in the embodiments are used for the purpose of description only and should not be construed as limiting. A singular expression includes the plural expression unless the context clearly dictates otherwise. In this specification, terms such as "comprise" or "have" are intended to designate that a feature, number, step, operation, component, part, or combination thereof described in the specification exists, and should be understood not to preclude in advance the existence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which the embodiments belong. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless explicitly so defined in the present application.

In addition, in the description with reference to the accompanying drawings, the same components are given the same reference numerals regardless of the drawing in which they appear, and redundant descriptions of them are omitted. In describing the embodiments, detailed descriptions of related known technology are omitted where they could unnecessarily obscure the gist of the embodiments.

The embodiments may be implemented in various types of products, such as personal computers, laptop computers, tablet computers, smartphones, televisions, smart home appliances, intelligent vehicles, kiosks, and wearable devices.

FIG. 1 is a diagram schematically showing the configuration of a system according to an embodiment.

Referring to FIG. 1, a system according to an embodiment may include a plurality of CCTVs 100, a manager terminal 200, and a device 300.

First, the plurality of CCTVs 100 are devices installed zone by zone at a specific site, each capturing the zone in which it is installed to generate video information. They may include a first CCTV 110 installed in a first area, a second CCTV 120 installed in a second area, and so on.

Each of the plurality of CCTVs 100 may consist of a plurality of cameras, and may be a combination of a motion detector camera that detects moving objects within the monitored area, an infrared camera with heat sensing, a high-definition RGB camera that captures license plate images, a radar-equipped camera for greater reliability, a pan-tilt camera for automatic tracking, a network camera capable of IP communication, and the like.

The plurality of CCTVs 100 may be connected to the device 300 through a network and may operate according to control signals received from the device 300.

The manager terminal 200 is a terminal used by a manager who checks the video captured by the plurality of CCTVs 100, and may be configured to communicate with the device 300 over a wired or wireless connection.

The device 300 may be a server owned by the person or organization that provides services using the device 300, a cloud server, or a peer-to-peer (p2p) set of distributed nodes. The device 300 may be configured to perform all or some of the computation, storage/reference, input/output, and control functions of an ordinary computer.

The device 300 may be configured to communicate with the plurality of CCTVs 100 over a wired or wireless connection, and may control the operation of each of the CCTVs 100, including whether to capture, the capture angle, the capture direction, and whether to store video.

The device 300 may be configured to communicate with the manager terminal 200 over a wired or wireless connection, and may control the operation of the manager terminal 200 to determine which information is displayed on its screen.

FIG. 2 is a flowchart illustrating a process of controlling a CCTV for artificial intelligence-based automatic object recognition according to an embodiment.

Referring to FIG. 2, first, in step S201, when the first area is being captured through the first CCTV 110, the device 300 may acquire from the first CCTV 110 first video information generated by capturing the first area. Here, the first CCTV 110 is capturing the first area, and the device 300 may acquire the resulting first video information in real time.

In step S202, the device 300 may extract the most recently acquired image in the first video information as a first image. Here, the device 300 may perform this extraction at every preset period on the first video information that is being acquired in real time.

For example, while acquiring the first video information in real time, the device 300 may, at every preset period, extract the image captured at the current point in time from the first video information as the first image.

In step S203, the device 300 may generate a first input signal by encoding the first image.

Specifically, the device 300 may generate the first input signal by encoding the pixels of the first image as color information. The color information may include, but is not limited to, RGB color information, brightness information, and saturation information. The device 300 may convert the color information into numerical values and encode the image in the form of a data sheet containing those values.
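As a rough illustration of this encoding step, the following Python sketch converts each pixel into numerical color values. The layout of the resulting "data sheet" is not specified in the text, so the five-value record `[r, g, b, brightness, saturation]` and the function name `encode_image` are assumptions made for this example; brightness and saturation are taken from the HSV model via the standard library's `colorsys`, which is one possible choice among several.

```python
import colorsys
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B) values


def encode_image(pixels: List[List[Pixel]]) -> List[List[List[float]]]:
    """Encode every pixel as [r, g, b, brightness, saturation], each in [0, 1]."""
    encoded = []
    for row in pixels:
        encoded_row = []
        for r, g, b in row:
            # Normalize the 8-bit channels to the range [0, 1].
            rn, gn, bn = r / 255.0, g / 255.0, b / 255.0
            # HSV value is used as brightness (an assumption of this sketch).
            _hue, saturation, value = colorsys.rgb_to_hsv(rn, gn, bn)
            encoded_row.append([rn, gn, bn, value, saturation])
        encoded.append(encoded_row)
    return encoded
```

A pure red pixel `(255, 0, 0)` is encoded with full brightness and full saturation, while black and white pixels both have zero saturation and differ only in their RGB and brightness values.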

In step S204, the device 300 may input the first input signal to the artificial neural network.

According to an embodiment, the artificial neural network is implemented as a convolutional neural network consisting of a feature extraction network and a classification network. The feature extraction network passes the input signal through a stack of alternating convolution layers and pooling layers. A convolution layer comprises a convolution operation, a convolution filter, and an activation function. The size of the convolution filter is adjusted according to the matrix size of the target input, but a 9X9 matrix is generally used. The activation function is generally a ReLU, sigmoid, or tanh function, but is not limited to these. A pooling layer reduces the matrix size of its input by grouping the pixels of a region and extracting a representative value. The pooling operation generally uses the average or the maximum value, but is not limited to these. The operation is performed over a square matrix, generally a 9X9 matrix. The convolution and pooling layers are applied alternately and repeatedly until the input becomes sufficiently small while still preserving its distinguishing differences.
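To make the convolution, activation, and pooling operations concrete, here is a minimal pure-Python sketch of one convolution layer (with ReLU) followed by one max-pooling layer. It is not the network of the embodiment: the helper names are hypothetical, the convolution is the unpadded "valid" form computed as a cross-correlation (as most CNN libraries do), and the usage below uses tiny 2X2 windows instead of the 9X9 matrices mentioned in the text purely to keep the example readable.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image without padding and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(matrix: Matrix) -> Matrix:
    """Replace negative values with zero."""
    return [[max(0.0, value) for value in row] for row in matrix]


def max_pool(matrix: Matrix, size: int) -> Matrix:
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(matrix[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(matrix[0]) - size + 1, size)
        ]
        for i in range(0, len(matrix) - size + 1, size)
    ]
```

Applying `max_pool(relu(conv2d_valid(image, kernel)), 2)` to a 5X5 image with a 2X2 kernel first yields a 4X4 feature map and then a 2X2 pooled map, which shows how each convolution and pooling pair shrinks the input.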

According to an embodiment, the classification network has hidden layers and an output layer. In a convolutional neural network for CCTV video processing, there are generally three or more hidden layers, each with 100 nodes, although more nodes may be used in some cases. The activation function of the hidden layers may be a ReLU, sigmoid, or tanh function, but is not limited to these. The output layer of the convolutional neural network may have a total of 50 nodes. The convolutional neural network is described in more detail below with reference to FIG. 8.

In step S205, the device 300 may acquire a first output signal, which is the output value of the artificial neural network (the convolutional neural network), based on the result of the input to the network.

According to an embodiment, the 50 output layer nodes of the convolutional neural network may comprise 25 upper output nodes and 25 lower output nodes. The 25 upper nodes may indicate object recognition probabilities, and the 25 lower nodes may indicate the object types corresponding to the 25 upper nodes. The output of the convolutional neural network is described in more detail below with reference to FIG. 8.

In step S206, the device 300 may detect, based on the first output signal, whether an object exists in the first area. Here, the device 300 may check the output values for object recognition probability in the first output signal, analyze the object recognition probability within the first area, and, if the object recognition probability at a particular point is found to be greater than a preset reference value, detect that an object exists in the first area.
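The presence check can be sketched as follows, assuming the first output signal is a flat sequence of 50 numbers in which the first 25 values are the object recognition probabilities. The function names and the default reference value of 0.5 are hypothetical; the text only says that the reference value is preset.

```python
from typing import List, Sequence, Tuple


def split_output(output: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split a 50-value output into 25 probabilities and 25 type values."""
    if len(output) != 50:
        raise ValueError("expected exactly 50 output values")
    return list(output[:25]), list(output[25:])


def object_present(output: Sequence[float], reference_value: float = 0.5) -> bool:
    """Report an object if any recognition probability exceeds the reference value."""
    probabilities, _type_values = split_output(output)
    return any(p > reference_value for p in probabilities)
```

An output whose probabilities are all below the reference value leads back to frame extraction (step S202), while a single probability above it is enough to report an object.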

In step S207, the device 300 may determine, from the detection result, whether an object exists in the first area.

If it is detected in step S207 that no object exists in the first area, the process returns to step S202, and the device 300 may again extract the most recently acquired image in the first video information as the first image. Meanwhile, the device 300 may continue to acquire the video information generated by the first CCTV 110 capturing the first area.

If it is detected in step S207 that an object exists in the first area, then in step S208 the device 300 may detect the type of the object located in the first area based on the first output signal. Here, the object type may include various kinds of objects such as people, animals, and vehicles.

Here, the device 300 may detect the type of the object located in the first area by checking the output value for object type in the first output signal. For example, if the output value for object type is "1", the device 300 may detect the type of the object located in the first area as "person", and if the output value for object type is "2", it may detect the type as "vehicle".
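The mapping from type output values to object types can be illustrated with a small lookup. Only the codes "1" (person) and "2" (vehicle) come from the text; the pairing of each probability with the type value at the same position, the rounding of type values to integers, the `"unknown"` fallback, and the 0.5 reference value are all assumptions of this sketch.

```python
from typing import List, Sequence

# Codes 1 and 2 are the examples given in the text; other codes are not specified.
OBJECT_TYPES = {1: "person", 2: "vehicle"}


def detect_types(output: Sequence[float], reference_value: float = 0.5) -> List[str]:
    """Return the type of every position whose probability exceeds the reference value."""
    probabilities, type_values = output[:25], output[25:50]
    detected = []
    for probability, type_value in zip(probabilities, type_values):
        if probability > reference_value:
            # Round the raw output to the nearest integer type code (assumption).
            detected.append(OBJECT_TYPES.get(int(round(type_value)), "unknown"))
    return detected
```

If the first detected type is the type of interest, for example `"person"`, the device would then treat the first object as being located in the first area (step S209).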

In step S209, when the object located in the first area is detected as the first object, the device 300 may recognize a state in which the first object is located in the first area. For example, when the object located in the first area is detected as a "person", the device 300 may recognize a state in which a "person" is located in the first area.

S210 단계에서, 장치(300)는 제1 구역 내에 제1 객체가 위치하고 있는 상태로 인식되면, 제1 이미지의 추출 이후 획득되는 제1 영상 정보를 관리자 단말(200)로 전송하여, 제1 영상 정보가 관리자 단말(200)의 화면에 표시되도록 제어할 수 있다. 이때, 장치(300)는 제1 구역에서 제1 객체가 인식된 것을 알려주는 알림 메시지를 관리자 단말(200)로 더 전송하여, 알림 메시지가 먼저 팝업으로 관리자 단말(200)의 화면에 표시되도록 제어할 수 있다.In step S210, when the device 300 is recognized as a state in which the first object is located in the first area, the first image information obtained after extraction of the first image is transmitted to the manager terminal 200, and the first image information can be controlled to be displayed on the screen of the manager terminal 200 . At this time, the device 300 further transmits a notification message indicating that the first object is recognized in the first zone to the manager terminal 200, and controls the notification message to be displayed on the screen of the manager terminal 200 as a pop-up first. can do.

즉, 장치(300)는 제1 구역 내에 제1 객체가 위치하고 있는 상태로 인식되면, 제1 객체가 인식된 시점부터 획득되는 제1 영상 정보가 관리자 단말(200)의 화면에 표시되도록 제어할 수 있다.That is, when it is recognized that the first object is located in the first area, the device 300 may control so that the first image information obtained from the point in time when the first object is recognized is displayed on the screen of the manager terminal 200 . there is.
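The loop formed by steps S207 to S210 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: frames are represented as strings, the artificial neural network is replaced by a caller-supplied `infer` function that returns a hypothetical `(object_present, object_type)` pair, and the name `frames_to_forward` is invented for this example.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Frame = str  # stand-in for one image of the first video information
# Hypothetical shape of the first output signal: (object present?, object type)
Output = Tuple[bool, Optional[str]]


def frames_to_forward(
    frames: Iterable[Frame],
    infer: Callable[[Frame], Output],
    first_object: str,
) -> Iterator[Frame]:
    """Yield the frames acquired after the first object has been recognized."""
    recognized = False
    for frame in frames:
        if recognized:
            # S210: video acquired after extraction of the first image is forwarded.
            yield frame
            continue
        present, object_type = infer(frame)
        if not present:
            # S207: no object detected, so return to extracting the latest image.
            continue
        if object_type == first_object:
            # S208 and S209: the type matches, so the first object is recognized.
            recognized = True
```

With frames `["f0", "f1", "f2", "f3"]` and an `infer` function that reports a person only on `"f1"`, the generator yields `"f2"` and `"f3"`, that is, the video acquired after the triggering image.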

FIG. 3 is a flowchart illustrating a process of tracking a moving object through a plurality of CCTVs according to an embodiment.

Referring to FIG. 3, first, in step S301, when the object located in the first area is detected as the first object, the device 300 may recognize a state in which the first object is located in the first area.

In step S302, the device 300 may track and analyze the movement of the first object on the basis of the first video information acquired after extraction of the first image.

Specifically, while the device 300 continues to acquire the first video information from the first CCTV 110 in real time, when it is recognized at a first time point that the first object is in the first area, the device 300 may identify and analyze the position of the first object within the first area on the basis of the first video information acquired at the first time point, and may track and analyze the movement of the first object on the basis of the first video information acquired after the first time point.

In step S303, the device 300 may check, as a result of tracking and analyzing the movement of the first object, whether the first object has left the first area.

If it is confirmed in step S303 that the first object has not left the first area, the process returns to step S302, and the device 300 may continue to track and analyze the movement of the first object on the basis of the first video information.

If it is confirmed in step S303 that the first object has left the first area, then in step S304 the device 300 may confirm that the first object has left the first area and moved in the direction of the second area.

For example, in a case where the area to the right of the first area is set as the second area, when tracking the movement of the first object located in the first area shows that the first object moved to the right and then left the first area, the device 300 may confirm that the first object moved from the first area in the direction of the second area.
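One way to picture steps S303 to S305 is to test each tracked position against the bounds of the first area and look up which area, and which CCTV, lies on the side where the object left. The sketch below assumes a rectangular area in image coordinates; the constants `ZONE_BOUNDS`, `NEIGHBOURS`, and `ZONE_CCTV` and both function names are hypothetical, chosen only to mirror the example of a second area to the right of the first.

```python
from typing import Iterable, Optional, Tuple

Position = Tuple[float, float]

# Hypothetical layout: the first area spans x in [0, 100) and y in [0, 100).
ZONE_BOUNDS = (0.0, 0.0, 100.0, 100.0)  # x_min, y_min, x_max, y_max
NEIGHBOURS = {"right": "second_area"}    # area adjacent to each exit side
ZONE_CCTV = {"second_area": "CCTV 120"}  # CCTV capturing each area


def exit_side(position: Position, bounds=ZONE_BOUNDS) -> Optional[str]:
    """Return the side on which a position lies outside the area, or None if inside."""
    x, y = position
    x_min, y_min, x_max, y_max = bounds
    if x >= x_max:
        return "right"
    if x < x_min:
        return "left"
    if y >= y_max:
        return "bottom"
    if y < y_min:
        return "top"
    return None


def next_cctv(track: Iterable[Position]) -> Optional[str]:
    """Follow tracked positions; return the CCTV for the area the object moved toward."""
    for position in track:
        side = exit_side(position)
        if side is not None:                      # S303: the object left the first area
            area = NEIGHBOURS.get(side)           # S304: area in the exit direction
            return ZONE_CCTV.get(area) if area else None  # S305: CCTV for that area
    return None                                   # still inside: keep tracking (S302)
```

For a track such as `[(50, 50), (90, 50), (105, 50)]`, the object leaves on the right and `next_cctv` returns `"CCTV 120"`.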

In step S305, the device 300 may identify the second CCTV 120 that is capturing the second area.

In step S306, when the second area is being captured through the second CCTV 120, the device 300 may acquire, from the second CCTV 120, second video information generated by capturing the second area. Here, the second CCTV 120 is capturing the second area, and the device 300 may acquire the second video information generated by that capture in real time.

In step S307, the device 300 may transmit the second video information to the manager terminal 200, so that the second video information is displayed on the screen of the manager terminal 200 following the first video information.

The device 300 may control the second video information to be displayed on the screen of the manager terminal 200 following the first video information on the basis of the positions of the first area and the second area.

For example, when the first area is located to the left of the second area, the device 300 may process the display so that the first video information is pushed out to the left while the second video information simultaneously enters from the right, so that the second video information is displayed on the screen of the manager terminal 200 following the first video information.

FIG. 4 is a diagram illustrating the first CCTV according to an embodiment.

Referring to FIG. 4, the first CCTV 110 may include a first camera 111, a second camera 112, a third camera 113, and a fourth camera 114.

The first camera 111 is installed in a first direction of the first area and may capture the first area from the first direction. For example, the first camera 111 may be installed to the south of the first area and capture the first area from the south.

The second camera 112 is installed in a second direction of the first area and may capture the first area from the second direction. For example, the second camera 112 may be installed to the west of the first area and capture the first area from the west.

The third camera 113 is installed in a third direction of the first area and may capture the first area from the third direction. For example, the third camera 113 may be installed to the east of the first area and capture the first area from the east.

The fourth camera 114 is installed in a fourth direction of the first area and may capture the first area from the fourth direction. For example, the fourth camera 114 may be installed to the north of the first area and capture the first area from the north.

The first CCTV 110 may be implemented with a different number of cameras depending on the embodiment. For example, the first CCTV 110 may be implemented with the first camera 111 alone; with only the first camera 111 and the second camera 112; with the first camera 111, the second camera 112, and the third camera 113; or with all of the first camera 111, the second camera 112, the third camera 113, and the fourth camera 114.

In addition to the first camera 111, the second camera 112, the third camera 113, and the fourth camera 114, the first CCTV 110 may further include cameras installed in other directions.

FIG. 5 is a flowchart illustrating a process of controlling the operation of CCTV cameras according to the number of objects, according to an embodiment.

Referring to FIG. 5, first, in step S501, the device 300 may control the first CCTV 110 so that only the first camera 111 operates to perform capturing. For example, when the first CCTV 110 includes the first camera 111, the second camera 112, the third camera 113, and the fourth camera 114, the device 300 may control only the first camera 111 to operate and perform capturing, and may control the second camera 112, the third camera 113, and the fourth camera 114 not to operate.

In step S502, when the first area is being captured from the first direction through the first camera 111, the device 300 may acquire, from the first camera 111, 1-1 video information generated by capturing the first area from the first direction. Here, the first camera 111 is capturing the first area from the first direction, and the device 300 may acquire the 1-1 video information generated by that capture in real time.

Thereafter, the device 300 may extract the image of the most recently acquired video from the 1-1 video information as a 1-1 image, generate a 1-1 input signal by encoding the 1-1 image, input the 1-1 input signal to the artificial neural network, and acquire a 1-1 output signal on the basis of the result of the input to the artificial neural network.

In step S503, the device 300 may detect whether an object exists in the first area on the basis of the 1-1 output signal.

If no object is detected in the first area in step S503, the process returns to step S501, and the device 300 may control only the first camera 111 to operate and perform capturing.

If an object is detected in the first area in step S503, then in step S504 the device 300 may detect the number of objects located in the first area.

For example, when it is detected on the basis of the 1-1 output signal that the first object exists in the first area, the device 300 may detect the number of first objects by checking, on the basis of the 1-1 output signal, how many first objects exist in the first area. To this end, the artificial neural network may output, as the result of the input, information that includes not only the object recognition probability and the object type but also the number of objects.

In step S505, the device 300 may check whether the number of objects located in the first area is smaller than a reference value. Here, the reference value may be set differently depending on the embodiment.

If it is confirmed in step S505 that the number of objects located in the first area is smaller than the reference value, then in step S506 the device 300 may control the first camera 111 and the second camera 112 of the first CCTV 110 to operate and perform capturing.

For example, when the first CCTV 110 includes the first camera 111, the second camera 112, the third camera 113, and the fourth camera 114 and only the first camera 111 has been operating, the device 300 may, upon confirming that the number of objects located in the first area is smaller than the reference value, control the second camera 112 as well as the first camera 111 to operate and perform capturing, and control the third camera 113 and the fourth camera 114 not to operate.

If it is confirmed in step S505 that the number of objects located in the first area is greater than the reference value, then in step S507 the device 300 may control the first camera 111, the second camera 112, and the third camera 113 of the first CCTV 110 to operate and perform capturing.

For example, when the first CCTV 110 includes the first camera 111, the second camera 112, the third camera 113, and the fourth camera 114 and only the first camera 111 has been operating, the device 300 may, upon confirming that the number of objects located in the first area is greater than the reference value, control the second camera 112 and the third camera 113 as well as the first camera 111 to operate and perform capturing, and control the fourth camera 114 not to operate.
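The camera selection of steps S501 and S505 to S507 can be summarized in a short function. This is a sketch under stated assumptions: the function name `cameras_to_operate` is hypothetical, cameras are identified by their reference numerals, and the case in which the number of objects equals the reference value, which the description does not address, is arbitrarily treated here like the “greater than” case.

```python
from typing import List

FIRST_CAMERA, SECOND_CAMERA, THIRD_CAMERA = 111, 112, 113


def cameras_to_operate(object_count: int, reference: int) -> List[int]:
    """Return the cameras of the first CCTV that should be capturing."""
    if object_count <= 0:
        # No object detected (S503): only the first camera operates (S501).
        return [FIRST_CAMERA]
    if object_count < reference:
        # Fewer objects than the reference value (S506).
        return [FIRST_CAMERA, SECOND_CAMERA]
    # More objects than the reference value (S507).
    # ASSUMPTION: a count equal to the reference value also falls here;
    # the description does not specify that case.
    return [FIRST_CAMERA, SECOND_CAMERA, THIRD_CAMERA]
```

For a reference value of 3, a count of 2 activates cameras 111 and 112, and a count of 5 activates cameras 111, 112, and 113.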

According to an embodiment, when the first area is being captured from the second direction through the second camera 112, the device 300 may acquire, from the second camera 112, 1-2 video information generated by capturing the first area from the second direction. Here, the second camera 112 is capturing the first area from the second direction, and the device 300 may acquire the 1-2 video information generated by that capture in real time.

Likewise, when the first area is being captured from the third direction through the third camera 113, the device 300 may acquire, from the third camera 113, 1-3 video information generated by capturing the first area from the third direction. Here, the third camera 113 is capturing the first area from the third direction, and the device 300 may acquire the 1-3 video information generated by that capture in real time.

Meanwhile, if no object is detected in the first area in step S503, only the first camera 111 of the first CCTV 110 is operating, so the device 300 may acquire the 1-1 video information from the first camera 111. When only the 1-1 video information is acquired, the device 300 may set the 1-1 video information as the first video information.

After step S506, in step S508, since the first camera 111 and the second camera 112 of the first CCTV 110 are operating, the device 300 may acquire the 1-1 video information from the first camera 111 and the 1-2 video information from the second camera 112. When the 1-1 video information and the 1-2 video information are acquired, the device 300 may generate the first video information by combining them. Because the 1-1 video information is generated by capturing the first area from the first direction and the 1-2 video information is generated by capturing the first area from the second direction, the device 300 may combine the 1-1 video information and the 1-2 video information in consideration of the first direction and the second direction.

After step S507, in step S509, since the first camera 111, the second camera 112, and the third camera 113 of the first CCTV 110 are operating, the device 300 may acquire the 1-1 video information from the first camera 111, the 1-2 video information from the second camera 112, and the 1-3 video information from the third camera 113. When the 1-1 video information, the 1-2 video information, and the 1-3 video information are acquired, the device 300 may generate the first video information by combining them. Because the 1-1 video information, the 1-2 video information, and the 1-3 video information are generated by capturing the first area from the first direction, the second direction, and the third direction, respectively, the device 300 may combine them in consideration of the first direction, the second direction, and the third direction.

After step S508 or step S509, the process returns to step S503, and the device 300 may detect whether an object exists in the first area.

Specifically, after generating the first video information by combining the 1-1 video information and the 1-2 video information, or by combining the 1-1 video information, the 1-2 video information, and the 1-3 video information, the device 300 may extract the image of the most recently acquired video from the first video information as the first image, generate the first input signal by encoding the first image, input the first input signal to the artificial neural network, acquire the first output signal on the basis of the result of the input to the artificial neural network, and then detect whether an object exists in the first area on the basis of the first output signal.

FIG. 6 is a flowchart illustrating a process of obtaining result video information corresponding to request information according to an embodiment.

Referring to FIG. 6, first, in step S601, the device 300 may collect video information from the plurality of CCTVs 100. Here, the video information may include the video captured by each of the plurality of CCTVs 100, the time of capture, the location information of each of the plurality of CCTVs 100, and unique identification information for each of the plurality of CCTVs 100.

In step S602, the device 300 may generate tag information corresponding to the video information through an artificial intelligence module. Here, the tag information may include object information appearing in the video information (for example, two men, one car, two buildings), state information of the objects (for example, moving, stationary, an unexpected incident occurring), and environment information (for example, weather information).
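The video information of step S601 and the tag information of step S602 can be pictured as a pair of records. The sketch below is illustrative only: the class names `TagInfo` and `VideoRecord`, all field names, and the sample values (including the file name and timestamp) are assumptions made for this example and are not defined in the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TagInfo:
    """Tag information generated for one piece of video information (S602)."""
    objects: Dict[str, int] = field(default_factory=dict)      # object -> count
    states: List[str] = field(default_factory=list)            # e.g. "moving"
    environment: Dict[str, str] = field(default_factory=dict)  # e.g. weather

    def keywords(self) -> List[str]:
        """Flatten the tags into a list of searchable words."""
        return list(self.objects) + list(self.states) + list(self.environment.values())


@dataclass
class VideoRecord:
    """Video information collected from one CCTV (S601), with its tags."""
    cctv_id: str        # unique identification information of the CCTV
    captured_at: str    # time of capture
    location: str       # location information of the CCTV
    video_path: str     # hypothetical reference to the stored video
    tags: TagInfo = field(default_factory=TagInfo)


# Sample record using the tag examples from the description.
example = VideoRecord(
    cctv_id="cctv-110",
    captured_at="2021-11-26T09:00:00",
    location="A",
    video_path="clip-0001.mp4",
    tags=TagInfo(
        objects={"man": 2, "car": 1, "building": 2},
        states=["moving"],
        environment={"weather": "clear"},
    ),
)
```

Keeping the tags in a separate record from the video metadata is a design choice of this sketch; it lets a search step compare request words against `keywords()` without touching the video itself.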

The artificial intelligence module may be trained, using deep learning, which is a branch of machine learning, so that information about objects can be derived from the video information.

The artificial intelligence module may also calculate, through deep learning, the weights of the plurality of inputs to a function.

Various models, such as an RNN (Recurrent Neural Network), a DNN (Deep Neural Network), and a DRNN (Dynamic Recurrent Neural Network), may be used as the artificial neural network model for this training. Here, an RNN is a deep learning technique that considers current data and past data together; a recurrent neural network (RNN) is a neural network in which the connections between the units constituting the network form a directed cycle. Various structures may be used to construct an RNN; representative examples include a fully recurrent network, a Hopfield network, an Elman network, an ESN (echo state network), an LSTM (long short-term memory) network, a bi-directional RNN, a CTRNN (continuous-time RNN), a hierarchical RNN, and a second-order RNN. Methods such as gradient descent, Hessian-free optimization, and global optimization methods may be used to train an RNN.

A machine learning model may also be generated using an SVM (Support Vector Machine) algorithm, which is widely used for pattern recognition and similar tasks. Here, an SVM is a supervised learning model for pattern recognition and data analysis in the field of machine learning and is used mainly for classification and regression analysis. Given a set of data in which each item belongs to one of two categories, the SVM algorithm can build a non-probabilistic binary linear classification model that determines, on the basis of the given data set, which category new data belongs to.

In this way, the artificial intelligence module can generate tag information from the video information more accurately.

In step S603, the device 300 may store the video information and the tag information in a database. Here, the database is a component included in the device 300 and may store various kinds of information separately.

In step S604, the device 300 may receive, from a user terminal, first request information for searching for a video. Here, the first request information may include first object information, first time information, and first place information. That is, the device 300 may receive the first request information from the user terminal used by the user in order to search for the video desired by the user among the videos captured by the plurality of CCTVs 100. To this end, the device 300 may be configured to communicate with the user terminal by wire or wirelessly, and may receive information from the user terminal and control the operation of the user terminal according to the received information, thereby controlling which information is displayed on the screen of the user terminal.

The first request information may include first object information about the object desired by the user and the state of that object, first time information selecting a specific time, and first place information selecting a specific place. For example, when the user wants to check a man passing through a place A, the user may transmit to the device 300, through the user terminal, “man” and “moving” as the first object information and “A” as the first place information. The first time information may be left unspecified, in which case a search over all times may be requested.

In step S605, the device 300 may extract, from the video information stored in the database, first result video information corresponding to the first request information. The tag information and the first request information may not match perfectly. Accordingly, the device 300 may calculate the similarity between the words included in the tag information and the words included in the first request information on the basis of word embedding, and may extract the first result video information on the basis of that similarity. Here, word embedding is a natural language processing technique that maps words to low-dimensional real-valued vectors so that semantically similar words are placed close together, for the purpose of determining the similarity and importance between words; it may be built on deep learning.

The first result video information may be all of the video information corresponding to the first request information. In that case, however, a large amount of network resources may be used to transmit it to the user terminal, and the user may be inconvenienced by having to search through the results manually once again.

Accordingly, the device 300 may sort all of the video information corresponding to the first request information in descending order of the matching score between the first request information and the tag information, and extract only about the top 10 items as the first result video information.
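The similarity-based extraction of step S605 and the top-10 cut can be sketched with cosine similarity over word vectors. Everything concrete in this sketch is an assumption: the toy two-dimensional vectors in `EMBEDDINGS` are invented for illustration (a real system would load vectors from a trained embedding model), and the scoring rule in `matching_score`, which averages each request word's best similarity to any tag word, is one plausible choice that the description does not prescribe.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]

# Toy 2-D embeddings invented for illustration only.
EMBEDDINGS: Dict[str, Vector] = {
    "man": (1.0, 0.0),
    "male": (0.9, 0.1),
    "car": (0.0, 1.0),
}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matching_score(request_words: Sequence[str], tag_words: Sequence[str]) -> float:
    """Average, over the request words, of the best similarity to any tag word."""
    if not request_words:
        return 0.0
    total = 0.0
    for word in request_words:
        word_vec = EMBEDDINGS.get(word)
        best = 0.0
        if word_vec is not None:  # words without a vector contribute 0
            for tag in tag_words:
                tag_vec = EMBEDDINGS.get(tag)
                if tag_vec is not None:
                    best = max(best, cosine(word_vec, tag_vec))
        total += best
    return total / len(request_words)


def top_results(
    request_words: Sequence[str],
    tagged_videos: Dict[str, Sequence[str]],
    limit: int = 10,
) -> List[str]:
    """Return video ids sorted by descending matching score, cut to `limit`."""
    ranked = sorted(
        tagged_videos,
        key=lambda video_id: matching_score(request_words, tagged_videos[video_id]),
        reverse=True,
    )
    return ranked[:limit]
```

With the toy vectors, a request for “man” ranks a video tagged “man” first, one tagged “male” second, and one tagged “car” last, so a near-synonym still surfaces even though the tag does not match the request exactly.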

S606 단계에서, 장치(300)는 제1 결과 영상 정보를 사용자 단말로 전송할 수 있다.In step S606 , the device 300 may transmit the first result image information to the user terminal.

도 7은 일실시예에 따른 제2 결과 영상 정보 및 제3 결과 영상 정보를 추출하고 생성하는 과정을 설명하기 위한 순서도이다.7 is a flowchart illustrating a process of extracting and generating second result image information and third result image information according to an exemplary embodiment.

도 7을 참조하면, 먼저, S701 단계에서, 장치(300)는 제1 결과 영상 정보에 포함되는 태그 정보를 기반으로 제2 요청 정보를 생성할 수 있다. 이때, 제2 요청 정보는 제2 객체 정보, 제2 시간 정보 및 제2 장소 정보를 포함할 수 있다.Referring to FIG. 7 , first, in step S701 , the device 300 may generate second request information based on tag information included in the first result image information. In this case, the second request information may include second object information, second time information, and second place information.

According to an embodiment, the first result image information is extracted based on the user's first request information. If a keyword needed to find the image the user wants is not included in the first request information, the desired image may not be retrieved.

Accordingly, in order to search for the image desired by the user more accurately and thoroughly, the device 300 may generate second request information, which includes second object information, second time information, and second place information, based on the tag information of the first result image information that was extracted using the user's first request information.

For example, a user may want to find a woman in blue clothes passing through place A but mistakenly believe that the person was a man in black clothes, and may therefore transmit "A, black clothes, man, moving" to the device 300 as the first request information. Some images may match the first request information 100%, but images that match only partially may also be extracted as first result image information. Such partially matching first result image information may have "A, blue clothes, woman, moving" stored as its tag information. In that case, the second request information may be generated as "A, blue clothes, woman, moving". In this way, the user can obtain the image actually wanted even though the first request information was sent by mistake.
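The example above can be expressed as a short sketch. The names `match_ratio` and `build_second_request`, the dictionary keys, and the sample time value are hypothetical; the text only states that a partially matching video's tags are reused as the second request information.

```python
def match_ratio(request_terms, tag_terms):
    """Fraction of the request terms that also appear among the tag terms."""
    request = set(request_terms)
    if not request:
        return 0.0
    return len(request & set(tag_terms)) / len(request)


def build_second_request(tag_info):
    """Copy the object, time, and place tags of a first-result video
    into a new request (second object, time, and place information)."""
    return {
        "object": list(tag_info["object"]),
        "time": tag_info["time"],
        "place": tag_info["place"],
    }


# Hypothetical data mirroring the example in the text.
first_request = ["A", "black clothes", "man", "moving"]
tags = {
    "place": "A",
    "time": "2021-11-26T10:00",  # illustrative value only
    "object": ["blue clothes", "woman", "moving"],
}
partial = match_ratio(first_request, [tags["place"]] + tags["object"])
second_request = build_second_request(tags)
```

Here only "A" and "moving" overlap, so the video matches half of the mistaken request, yet its tags yield the request the user actually intended.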

In step S702, the device 300 may select a second CCTV 120 located within a preset threshold distance range, based on the location information of the first CCTV 110 that captured the first result image information. As described later, this is done in order to extract image information showing what happened before and after, following the movement of the moving object, and the threshold distance range may be set based on the moving speed of the object. For example, the threshold distance range when the object is a person may be narrower than the threshold distance range when the object is a car.
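The selection in step S702 might look like the sketch below. The threshold values in `THRESHOLD_M`, the planar (x, y) coordinates in meters, and the function name are assumptions for illustration; the text only requires that the range for a person be narrower than the range for a car.

```python
import math

# Hypothetical threshold distances in meters per object type.
THRESHOLD_M = {"person": 200.0, "car": 1000.0}


def select_second_cctvs(first_id, positions, object_type):
    """Return the ids of CCTVs within the threshold distance of the first CCTV.

    positions: dict mapping a CCTV id to planar (x, y) coordinates in meters
    (a simplifying assumption; real deployments would use geographic data).
    """
    limit = THRESHOLD_M[object_type]
    fx, fy = positions[first_id]
    selected = []
    for cctv_id, (x, y) in positions.items():
        if cctv_id == first_id:
            continue  # the first CCTV is not a candidate for itself
        if math.hypot(x - fx, y - fy) <= limit:
            selected.append(cctv_id)
    return selected
```

Filtering by distance first keeps the later tag-based search in step S703 from scanning every video in the database.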

In step S703, the device 300 may extract second result image information corresponding to the second request information from among the image information obtained from the second CCTV 120.

Extracting the second result image information from all of the image information stored in the database based on the second request information may be very inefficient, because it can lead to a pointless expansion of the data. It is therefore preferable to filter first by distance from the first CCTV 110 and then extract the second result image information based on the second request information.

In step S704, the device 300 may calculate the moving direction of the object corresponding to the first object information, based on the first result image information.

Thereafter, the device 300 may generate third result image information based on the moving direction of the object and the location information of the second CCTV 120.

Specifically, in step S705, the device 300 may compare the positions of the first CCTV 110 and the second CCTV 120 based on the moving direction of the object, and may check whether the second CCTV 120 is located before the first CCTV 110 with respect to the moving direction of the object. This is done in order to concentrate the extraction on only those images in which the object desired by the user was captured.

If it is confirmed in step S705 that the second CCTV 120 is located before the first CCTV 110 with respect to the moving direction of the object, then in step S706 the device 300 may generate the third result image information by extracting, from the second result image information, only the images shifted into the past from the first time information by a preset shift time range. The shift time range may be set based on the moving speed of the object. For example, the shift time range when the object is a person may be shorter than the shift time range when the object is a car.

If it is confirmed in step S705 that the second CCTV 120 is located after the first CCTV 110 with respect to the moving direction of the object, then in step S707 the device 300 may generate the third result image information by extracting, from the second result image information, only the images shifted forward in time from the first time information by the shift time range.
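Steps S705 to S707 can be sketched as follows. The text does not say how "before" and "after" are decided or exactly how the shifted interval is bounded, so this sketch assumes a dot-product test on planar coordinates and a window that starts or ends at the first time information; all names are hypothetical.

```python
from datetime import datetime, timedelta


def shifted_window(first_pos, second_pos, direction, first_time, shift):
    """Return the (start, end) time window to search in the second CCTV's video.

    A negative projection of the CCTV offset onto the moving direction means
    the second CCTV lies before the first one, so the window is shifted into
    the past (S706). Otherwise it is shifted forward (S707). A projection of
    exactly zero is treated as "after", which is an arbitrary choice.
    """
    offset_x = second_pos[0] - first_pos[0]
    offset_y = second_pos[1] - first_pos[1]
    along = offset_x * direction[0] + offset_y * direction[1]
    if along < 0:
        return (first_time - shift, first_time)
    return (first_time, first_time + shift)


def filter_third_results(second_results, window):
    """Keep only the clips whose timestamp falls inside the window.

    second_results: list of dicts with a "time" key holding a datetime.
    """
    start, end = window
    return [clip for clip in second_results if start <= clip["time"] <= end]
```

For an object moving in the +x direction past a first CCTV at the origin, a second CCTV at x = -100 yields a past window, and one at x = +100 yields a forward window.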

When the third result image information has been generated, the device 300 may transmit the first result image information, the second result image information, and the third result image information to the user terminal.

FIG. 8 is a diagram for explaining an artificial neural network according to an embodiment.

According to an embodiment, the device 300 may generate an input signal by encoding an image of a video generated by CCTV photographing, and the artificial neural network 801 may take the input signal as its input and produce an object recognition probability 802 and an object type 803 as its outputs.

Encoding according to an embodiment may be performed by storing the color information of each pixel of the image in the form of a numeric data sheet. The color information may include, but is not limited to, the RGB color, brightness information, and saturation information of each pixel.
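A minimal sketch of such a per-pixel data sheet is shown below, using Python's standard `colorsys` module. Taking the HSV "value" component as brightness and the HSV saturation as saturation is an assumption, as are the function name and the rounding to three decimals; the text does not fix a particular color model.

```python
import colorsys


def encode_image(pixels):
    """Encode rows of (r, g, b) pixels (0..255) into a numeric data sheet.

    Each output cell is (r, g, b, brightness, saturation), where brightness
    and saturation are the HSV value and saturation in the range 0..1.
    """
    sheet = []
    for row in pixels:
        encoded_row = []
        for r, g, b in row:
            _hue, saturation, value = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            encoded_row.append((r, g, b, round(value, 3), round(saturation, 3)))
        sheet.append(encoded_row)
    return sheet
```

The resulting nested list keeps the image's row and column layout, so it can be passed on as the input signal of a convolutional network.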

According to an embodiment, the artificial neural network 801 may be implemented as a convolutional neural network composed of a feature extraction neural network 810 and a classification neural network 820. The feature extraction neural network 810 may separate the object from the background in the image extracted from the image information, and the classification neural network 820 may recognize the object from that result and determine the object recognition probability and the object type. As one way for the feature extraction neural network 810 to distinguish the object from the background, a group of pixels may be taken as the boundary between the object and the background when, in the data sheet of the first input signal that encodes the image, the color information values are detected to have changed by 30% or more in at least 6 of the 8 pixels that include a given pixel. The method is not limited to this.

According to an embodiment, the feature extraction neural network 810 processes the input signal through convolution layers and pooling layers stacked in sequence. A convolution layer includes a convolution operation, a convolution filter, and an activation function. The size of the convolution filter is adjusted according to the matrix size of the target input, but a 9X9 matrix is generally used. The activation function is generally a ReLU function, a sigmoid function, a tanh function, or the like, but is not limited to these. A pooling layer reduces the matrix size of its input by grouping the pixels of a given region and extracting a representative value. The pooling operation generally uses the average or the maximum value, but is not limited to these. The operation is performed over a square matrix, generally a 9X9 matrix. The convolution and pooling layers are applied alternately and repeatedly until the input becomes sufficiently small while still preserving its distinguishing differences.
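One convolution, activation, and pooling pass can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the network of the embodiment: the function names are hypothetical, the sliding-window product is computed without flipping the kernel (as most CNN frameworks do), and the example after the code uses a 3X3 kernel and 2X2 pooling to stay small, whereas the text mentions 9X9 matrices.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding and sum the products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_rows)
                for dj in range(k_cols)
            )
            for j in range(out_cols)
        ]
        for i in range(out_rows)
    ]


def relu(feature_map):
    """Replace negative values with 0.0 and keep the others."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool(feature_map, size):
    """Take the maximum of each non-overlapping size x size block."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            row.append(
                max(
                    feature_map[i + di][j + dj]
                    for di in range(size)
                    for dj in range(size)
                )
            )
        pooled.append(row)
    return pooled
```

Applying `max_pool(relu(conv2d_valid(image, kernel)), 2)` to a 4X4 image with a 3X3 kernel reduces it to a single value, which shows how repeated passes shrink the input.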

The classification neural network 820 according to an embodiment may classify the object that the feature extraction neural network 810 has separated from the background according to its shape and continuity, and may determine the object recognition probability 802 and the object type 803. Information stored in the database may be used for comparing objects. The classification neural network 820 gives priority to determining the object recognition probability 802, and the shape and size of the identified object may then be used to make the object type 803 easier to determine.

According to an embodiment, the classification neural network 820 has hidden layers and an output layer. The artificial neural network 801 in the device 300 generally has five or more hidden layers, each with 80 nodes, although more nodes may be specified in some cases. The activation function of the hidden layers may be a ReLU function, a sigmoid function, a tanh function, or the like, but is not limited to these. The output layer of the artificial neural network 801 may have a total of 50 nodes.

In the output of the artificial neural network 801 according to an embodiment, the upper 25 of the 50 output layer nodes may indicate the object recognition probability 802, and the lower 25 nodes may indicate the object type 803 corresponding to each of the upper nodes. The upper 25 nodes and the lower 25 nodes are paired by matching the n-th upper node with the n-th lower node, so that the n-th node overall corresponds to the (25+n)-th node overall. For example, the 1st node corresponds to the 26th node, the 2nd node to the 27th node, the 10th node to the 35th node, and the 25th node to the 50th node. The object recognition probability 802 may be output as code information corresponding to the probability, but is not limited to this. Among the 50 output layer nodes of the artificial neural network 801, a node that has no output value may output the number '0'. Upper nodes that output '0' may be regarded as having no corresponding object recognition probability. If 25 or more object types are classified, the remaining object types may be processed automatically after all of the previously generated output values have been processed.
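The node pairing described above can be sketched as a small decoder. The function name `decode_output` and the representation of object types as integer codes are assumptions for illustration; the pairing rule (n-th node with the (25+n)-th node) and the treatment of '0' follow the text.

```python
def decode_output(nodes):
    """Pair each probability node with its object type node.

    nodes: list of 50 output values. With 1-based numbering, node n
    (1..25) holds a recognition probability and node 25+n holds the
    matching object type. Upper nodes that output 0 are skipped because
    they carry no recognition probability.
    """
    if len(nodes) != 50:
        raise ValueError("expected exactly 50 output nodes")
    pairs = []
    for index in range(25):  # 0-based index of the upper node
        probability = nodes[index]
        object_type = nodes[25 + index]
        if probability == 0:
            continue
        pairs.append((object_type, probability))
    return pairs
```

For instance, if the 1st node holds 0.9 and the 26th node holds type code 3, the decoder returns the pair (3, 0.9).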

According to an embodiment, the artificial neural network 801 may check the object recognition probability 802 and the object type 803 to determine the object recognition probability for each type of object, and when the object recognition probability is found to be greater than a preset reference value, it may generate an output value indicating that the object exists.

For example, the artificial neural network 801 may receive an image extracted from the first image information generated by photographing the first area and analyze the object recognition probability for each type of object. If the object recognition probability of the first object is found to be greater than the reference value, the network may generate an output value indicating that the first object exists in the first area. When it is confirmed that the first object exists in the first area, the artificial neural network 801 may also analyze how many objects exist in the first area, based on the image extracted from the first image information, and generate the number of first objects present in the first area as an output value.
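The threshold check and the count can be illustrated with the sketch below. The per-candidate `(object_type, probability)` layout, the function name, and the default reference value of 0.5 are hypothetical; the text states only that a probability greater than a preset reference value indicates presence and that the number of first objects may also be output.

```python
def summarize_first_object(detections, first_type, reference=0.5):
    """Report whether the first object is present and how many were found.

    detections: list of (object_type, recognition_probability) pairs, one
    per candidate object in the image (a hypothetical layout).
    reference: preset reference value; 0.5 is only an illustrative default.
    """
    count = sum(
        1
        for object_type, probability in detections
        if object_type == first_type and probability > reference
    )
    return {"present": count > 0, "count": count}
```

A candidate whose probability equals the reference value is not counted, since the text requires the probability to be greater than the reference value.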

According to an embodiment, when the manager discovers a problem in the object identification performed by the artificial neural network 801, the artificial neural network 801 may learn by receiving a first learning signal generated from the corrected answer entered by the manager. A problem in the object identification performed by the artificial neural network 801 may mean a problem in the object recognition probability 802 or the object type 803.

The first learning signal according to an embodiment is generated from the error between the correct answer and the output value. Depending on the case, SGD using the delta rule, a batch method, or a method following the backpropagation algorithm may be used. In response to the first learning signal, the artificial neural network 801 learns by modifying its existing weights, and momentum may be used in some cases. A cost function may be used to calculate the error, and the cross-entropy function may be used as the cost function.
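The cost function and the weight update mentioned above can be sketched as follows. This is a generic illustration of cross-entropy and of SGD with momentum, not the training procedure of the embodiment: the function names, the learning rate, and the momentum coefficient are assumptions, and the gradients are taken as given rather than computed by backpropagation.

```python
import math


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution and predicted probabilities.

    eps guards against log(0) when a predicted probability is zero.
    """
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


def sgd_momentum_step(weights, grads, velocity, lr=0.01, momentum=0.9):
    """Apply one SGD update with momentum and return (weights, velocity).

    velocity accumulates a decaying sum of past gradient steps, which
    smooths the direction of successive weight updates.
    """
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + nv for w, nv in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

With a one-hot target, the cross-entropy reduces to the negative log of the probability assigned to the correct class, so a corrected answer from the manager directly determines the size of the error.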

FIG. 9 is an exemplary diagram of the configuration of a device according to an embodiment.

The device 300 according to an embodiment includes a processor 310 and a memory 320. The processor 310 may include at least one of the devices described above with reference to FIGS. 1 to 8, or may perform at least one of the methods described above with reference to FIGS. 1 to 8. A person or organization using the device 300 may provide a service related to some or all of the methods described above with reference to FIGS. 1 to 8.

The memory 320 may store information related to the methods described above, or may store a program that implements them. The memory 320 may be a volatile memory or a non-volatile memory.

The processor 310 may execute a program and control the device 300. The code of the program executed by the processor 310 may be stored in the memory 320. The device 300 may be connected to an external device (for example, a personal computer or a network) through an input/output device (not shown) and may exchange data through wired or wireless communication.

The device 300 may be used to train an artificial neural network or to use a trained artificial neural network. The memory 320 may contain an artificial neural network that is being trained or has been trained. The processor 310 may train or execute the artificial neural network algorithm stored in the memory 320. The device 300 that trains the artificial neural network and the device 300 that uses the trained artificial neural network may be the same device or separate devices.

The embodiments described above may be implemented with hardware components, software components, and/or a combination of hardware and software components. For example, the devices, methods, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that execute on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The method according to an embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known to and usable by those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiments, and vice versa.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

Although the embodiments have been described above with reference to a limited set of drawings, a person of ordinary skill in the art may apply various technical modifications and variations based on the above. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or if components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

A method of controlling a CCTV for automatic object recognition based on artificial intelligence, performed by a device, the method comprising:
acquiring, from a first CCTV, first image information generated by photographing a first area while the first area is being photographed through the first CCTV;
extracting, from the first image information, an image of the most recently acquired video as a first image;
generating a first input signal by encoding the first image;
inputting the first input signal to an artificial neural network, and obtaining a first output signal based on a result of the input to the artificial neural network;
detecting whether an object exists in the first area based on the first output signal;
detecting a type of the object located in the first area based on the first output signal when it is detected that an object exists in the first area;
recognizing that a first object is located in the first area when the object located in the first area is detected as the first object; and
when it is recognized that the first object is located in the first area, transmitting the first image information acquired after extraction of the first image to a manager terminal, and controlling the first image information to be displayed on a screen of the manager terminal,
wherein the acquiring of the first image information comprises:
when, among the cameras of the first CCTV, a first camera is installed in a first direction of the first area, a second camera is installed in a second direction of the first area, and a third camera is installed in a third direction of the first area, controlling only the first camera to operate and perform photographing;
acquiring, from the first camera, 1-1 image information generated by photographing the first area in the first direction while the first area is being photographed in the first direction through the first camera;
extracting, from the 1-1 image information, an image of the most recently acquired video as a 1-1 image;
generating a 1-1 input signal by encoding the 1-1 image;
inputting the 1-1 input signal to the artificial neural network, and obtaining a 1-1 output signal based on a result of the input to the artificial neural network;
detecting whether an object exists in the first area based on the 1-1 output signal;
when it is detected that no object exists in the first area, controlling only the first camera to operate and perform photographing;
detecting the number of objects located in the first area based on the 1-1 output signal when it is detected that an object exists in the first area;
when it is confirmed that the number of objects located in the first area is smaller than a preset reference value, controlling the first camera and the second camera to operate and perform photographing;
when it is confirmed that the number of objects located in the first area is greater than the reference value, controlling the first camera, the second camera, and the third camera to operate and perform photographing;
acquiring, from the second camera, 1-2 image information generated by photographing the first area in the second direction while the first area is being photographed in the second direction through the second camera;
acquiring, from the third camera, 1-3 image information generated by photographing the first area in the third direction while the first area is being photographed in the third direction through the third camera;
setting the 1-1 image information as the first image information when only the 1-1 image information is acquired;
generating the first image information by combining the 1-1 image information and the 1-2 image information when the 1-1 image information and the 1-2 image information are acquired; and
generating the first image information by combining the 1-1 image information, the 1-2 image information, and the 1-3 image information when the 1-1 image information, the 1-2 image information, and the 1-3 image information are acquired.

The method of claim 1, further comprising:
when it is recognized that the first object is located in the first area, tracking and analyzing the movement of the first object based on the first image information acquired after extraction of the first image;
when, as a result of tracking and analyzing the movement of the first object, it is confirmed that the first object has left the first area and moved toward a second area, identifying a second CCTV that is photographing the second area;
acquiring, from the second CCTV, second image information generated by photographing the second area while the second area is being photographed through the second CCTV; and
transmitting the second image information to the manager terminal, and controlling the second image information to be displayed on the screen of the manager terminal following the first image information.

delete