KR20220050960A

KR20220050960A - Target detection method, apparatus, electronic device and storage medium

Info

Publication number: KR20220050960A
Application number: KR1020227009421A
Authority: KR
Inventors: Kun Yan; Kunlin Yang; Jun Hou; Shuai Yi
Original assignee: Beijing SenseTime Technology Development Co., Ltd.
Priority date: 2019-10-29
Filing date: 2019-12-20
Publication date: 2022-04-25
Also published as: CN110796649B; WO2021082231A1; TWI772757B; CN110796649A; TW202117595A; JP2022549728A

Abstract

The present invention relates to a target detection method and apparatus, an electronic device, and a storage medium. The method includes: acquiring a detection image to be processed; determining, based on image features of the detection image, a size feature and a corner feature corresponding to a target detection object; extracting, based on the size feature and the corner feature, an object feature corresponding to the target detection object from the image features; and determining a category of the target detection object based on the object feature.

Description

Target detection method, apparatus, electronic device and storage medium

The present invention claims priority to Chinese Patent Application No. 201911038042.3, entitled "Target Detection Method and Apparatus, Electronic Device, and Storage Medium", filed with the China Patent Office on October 29, 2019, the entire contents of which are incorporated herein by reference.

The present invention relates to the field of computer vision, and in particular to a target detection method and apparatus, an electronic device, and a storage medium.

Computer vision is a technology that uses computers and related devices to simulate biological vision; by processing collected images or video, three-dimensional information about the corresponding scene can be acquired. In one application of computer vision, target detection is performed on collected images or video to determine the category of a target detection object and its position in the image.

In current target detection technology, a neural network can be used to directly determine the category of a target detection object and a detection frame that locates it.

The present invention proposes a technical solution for target detection.

According to one aspect of the present invention, a target detection method is provided, including: acquiring a detection image to be processed; determining, based on image features of the detection image, a size feature and a corner feature corresponding to a target detection object; extracting, based on the size feature and the corner feature, an object feature corresponding to the target detection object from the image features; and determining a category of the target detection object based on the object feature.

In one possible embodiment, determining the size feature and the corner feature corresponding to the target detection object based on the image features of the detection image includes: performing one or more stages of convolution processing on the detection image to obtain the image features of the detection image; and performing corner pooling on the image features of the detection image to obtain the size feature and the corner feature corresponding to the target detection object.
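The text names corner pooling without defining it at this point. As a rough illustration only, the sketch below assumes the top-left variant commonly used in corner-based detectors: for each position, the maximum of everything to its right in the same row is added to the maximum of everything below it in the same column. The function name `top_left_corner_pool` and the use of a single two-dimensional map for both directions are simplifying assumptions, not details taken from the patent.

```python
import numpy as np

def top_left_corner_pool(feat: np.ndarray) -> np.ndarray:
    """Top-left corner pooling on a 2-D map of shape (H, W).

    Assumption: the same map feeds both the horizontal and the vertical pass.
    """
    # Running maximum scanned from right to left along each row:
    # right_max[i, j] = max(feat[i, j:]).
    right_max = np.maximum.accumulate(feat[:, ::-1], axis=1)[:, ::-1]
    # Running maximum scanned from bottom to top along each column:
    # down_max[i, j] = max(feat[i:, j]).
    down_max = np.maximum.accumulate(feat[::-1, :], axis=0)[::-1, :]
    return right_max + down_max

pooled = top_left_corner_pool(np.array([[1, 2], [3, 0]]))
```

A position that sits at the top-left corner of an object tends to receive a strong response this way, because the object's top edge lies to its right and its left edge lies below it.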

In one possible embodiment, the convolution processing includes up-sampling and down-sampling, and performing one or more stages of convolution processing on the detection image to obtain the image features of the detection image includes: performing one or more stages of down-sampling on the detection image to obtain first feature maps after the one or more stages of down-sampling; obtaining, based on the first feature maps, second feature maps after one or more stages of up-sampling; and obtaining the image features of the detection image based on the first feature maps and the second feature maps.

In one possible embodiment, one first feature map is output after each stage of down-sampling and one second feature map is output after each stage of up-sampling, and obtaining the second feature maps based on the first feature maps includes: for the first stage of up-sampling, using the first feature map output by the last stage of the one or more stages of down-sampling as the input of the first stage of up-sampling, and obtaining the second feature map output by the first stage of up-sampling; and for the N-th stage of up-sampling, using the second feature map output by the up-sampling stage immediately preceding the N-th stage, together with the first feature map that matches the second feature map output by the N-th stage, as the input of the N-th stage of up-sampling, and obtaining the second feature map output by the N-th stage of up-sampling, where N is a positive integer greater than 1.

In one possible embodiment, using the second feature map output by the up-sampling stage immediately preceding the N-th stage, together with the first feature map that matches the second feature map output by the N-th stage, as the input of the N-th stage of up-sampling includes: performing feature fusion on that second feature map and that first feature map to obtain the input of the N-th stage of up-sampling.
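The staged down-sampling, up-sampling, and fusion described above can be sketched as follows. This is a minimal sketch under several assumptions that are not in the source: 2x2 average pooling stands in for a strided convolution, nearest-neighbour repetition stands in for a learned up-sampling layer, fusion is element-wise addition, and the "matching" first feature map is taken to be the one with the same spatial size as the map it is fused with. The names `downsample`, `upsample`, and `encode_decode` are hypothetical.

```python
import numpy as np

def downsample(x):
    """Halve H and W with 2x2 average pooling (stand-in for a strided convolution)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Double H and W by nearest-neighbour repetition (stand-in for a learned layer)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def encode_decode(image, stages=3):
    """Return (first feature maps, second feature maps).

    H and W of `image` must be divisible by 2 ** stages.
    """
    firsts = []
    x = image
    for _ in range(stages):
        x = downsample(x)
        firsts.append(x)          # one first feature map per down-sampling stage

    # Stage 1 of up-sampling takes the last first feature map directly.
    seconds = [upsample(firsts[-1])]
    # Stage N > 1 takes the fusion of the previous second feature map
    # and the first feature map of the same spatial size.
    for n in range(2, stages + 1):
        skip = firsts[stages - n]
        fused = seconds[-1] + skip  # assumed fusion: element-wise addition
        seconds.append(upsample(fused))
    return firsts, seconds

firsts, seconds = encode_decode(np.arange(256, dtype=float).reshape(16, 16))
```

With a 16x16 input and three stages, the first feature maps have sides 8, 4, 2 and the second feature maps have sides 4, 8, 16, so every fusion combines two maps of equal size.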

In one possible embodiment, performing corner pooling on the image features of the detection image to obtain the size feature and the corner feature corresponding to the target detection object includes: performing corner pooling on the image features of the detection image to obtain a processing result; performing convolution processing on the processing result using a first branch network to obtain the size feature corresponding to the target detection object; and performing convolution processing on the processing result using a second branch network, whose number of channels differs from that of the first branch network, to obtain the corner feature corresponding to the target detection object.

In one possible embodiment, extracting the object feature corresponding to the target detection object from the image features based on the size feature and the corner feature includes: determining, based on the size feature and the corner feature, a feature region that has a mapping relationship with the image region of the target detection object in the detection image; and extracting the object feature corresponding to the target detection object from the feature region of the image features.
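One simple way to picture the mapping between an image region and a feature region is to divide the box coordinates by the down-sampling factor of the feature map and crop. The sketch below assumes a single integer `stride` relating image pixels to feature cells and a box given as `(x1, y1, x2, y2)` in image coordinates; neither the helper name `crop_object_feature` nor this particular mapping comes from the source.

```python
import math
import numpy as np

def crop_object_feature(feature_map, box, stride):
    """Crop the feature cells that cover an image-space box.

    feature_map: array of shape (H, W) or (H, W, C)
    box: (x1, y1, x2, y2) in image pixels
    stride: assumed number of image pixels per feature cell
    """
    x1, y1, x2, y2 = box
    height, width = feature_map.shape[:2]
    # Round outwards so the crop fully covers the box, then clamp to the map.
    fx1 = min(max(math.floor(x1 / stride), 0), width - 1)
    fy1 = min(max(math.floor(y1 / stride), 0), height - 1)
    fx2 = min(max(math.ceil(x2 / stride), fx1 + 1), width)
    fy2 = min(max(math.ceil(y2 / stride), fy1 + 1), height)
    return feature_map[fy1:fy2, fx1:fx2]

feature_map = np.arange(64).reshape(8, 8)
crop = crop_object_feature(feature_map, box=(8, 4, 20, 12), stride=4)
```

The crop always contains at least one cell, so very small boxes still yield an object feature.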

In one possible embodiment, the corner feature corresponding to the target detection object includes at least a first corner feature and a second corner feature corresponding to the target detection object, and the size feature corresponding to the target detection object includes a length feature and a width feature corresponding to the first corner feature, and a length feature and a width feature corresponding to the second corner feature.

In one possible embodiment, the method further includes: determining detection frames of the target detection object in the detection image based on the length feature and width feature corresponding to the first corner feature and the length feature and width feature corresponding to the second corner feature; determining the Intersection-over-Union (IoU) between any two overlapping detection frames; and merging the two overlapping detection frames into one detection frame when the IoU is greater than a preset threshold.
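The frame construction and IoU-based merging described above can be sketched as follows. Several details are assumptions made for illustration: the first corner is treated as the top-left corner and the second as the bottom-right corner, boxes are `(x1, y1, x2, y2)` tuples, the merged frame is the smallest box enclosing both frames (the text does not say how the merge is computed), and the default threshold of 0.5 is a placeholder.

```python
def box_from_top_left(x, y, width, height):
    """Frame implied by a top-left corner and its predicted size."""
    return (x, y, x + width, y + height)

def box_from_bottom_right(x, y, width, height):
    """Frame implied by a bottom-right corner and its predicted size."""
    return (x - width, y - height, x, y)

def iou(a, b):
    """Intersection-over-Union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def merge_boxes(boxes, threshold=0.5):
    """Repeatedly merge any pair whose IoU exceeds the threshold."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if iou(boxes[i], boxes[j]) > threshold:
                    a, b = boxes[i], boxes[j]
                    # Assumed merge rule: smallest box enclosing both frames.
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes

result = merge_boxes([(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)])
```

Here the first two frames overlap with an IoU of 81/119 and are merged, while the third frame is left unchanged.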

In one possible embodiment, determining the category of the target detection object based on the object feature includes: performing one or more stages of convolution processing on the object feature to obtain the probability that the target detection object belongs to each of one or more preset categories; and determining the category of the target detection object from among the preset categories based on those probabilities.
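The last step, going from per-category probabilities to a single category, might look like the sketch below. It assumes the convolution stages end in one raw score per preset category, that a softmax turns the scores into probabilities, and that the most probable category is chosen; the text only states that probabilities are obtained and used, so the softmax, the selection rule, and the name `pick_category` are illustrative assumptions.

```python
import numpy as np

def pick_category(scores, categories):
    """Return (chosen category, probability per category).

    scores: one raw score per preset category (assumed network output)
    categories: names of the preset categories, in the same order
    """
    scores = np.asarray(scores, dtype=float)
    exp = np.exp(scores - scores.max())  # subtract the max for numerical stability
    probs = exp / exp.sum()
    return categories[int(np.argmax(probs))], probs

category, probs = pick_category([1.0, 3.0, 0.5], ["pedestrian", "vehicle", "building"])
```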

According to one aspect of the present invention, a target detection apparatus is provided, including: an acquisition module configured to acquire a detection image to be processed; a determination module configured to determine, based on image features of the detection image, a size feature and a corner feature corresponding to a target detection object; an extraction module configured to extract, based on the size feature and the corner feature, an object feature corresponding to the target detection object from the image features; and a classification module configured to determine a category of the target detection object based on the object feature.

In one possible embodiment, the determination module is specifically configured to perform one or more stages of convolution processing on the detection image to obtain the image features of the detection image, and to perform corner pooling on the image features of the detection image to obtain the size feature and the corner feature corresponding to the target detection object.

In one possible embodiment, the convolution processing includes up-sampling and down-sampling, and the determination module is specifically configured to: perform one or more stages of down-sampling on the detection image to obtain first feature maps after the one or more stages of down-sampling; obtain, based on the first feature maps, second feature maps after one or more stages of up-sampling; and obtain the image features of the detection image based on the first feature maps and the second feature maps.

In one possible embodiment, one first feature map is output after each stage of down-sampling and one second feature map is output after each stage of up-sampling, and the determination module is specifically configured to: for the first stage of up-sampling, use the first feature map output by the last stage of the one or more stages of down-sampling as the input of the first stage of up-sampling, and obtain the second feature map output by the first stage of up-sampling; and for the N-th stage of up-sampling, use the second feature map output by the up-sampling stage immediately preceding the N-th stage, together with the first feature map that matches the second feature map output by the N-th stage, as the input of the N-th stage of up-sampling, and obtain the second feature map output by the N-th stage of up-sampling, where N is a positive integer greater than 1.

In one possible embodiment, the determination module is specifically configured to perform feature fusion on the second feature map output by the up-sampling stage immediately preceding the N-th stage and the first feature map that matches the second feature map output by the N-th stage, to obtain the input of the N-th stage of up-sampling.

In one possible embodiment, the determination module is specifically configured to: perform corner pooling on the image features of the detection image to obtain a processing result; perform convolution processing on the processing result using a first branch network to obtain the size feature corresponding to the target detection object; and perform convolution processing on the processing result using a second branch network, whose number of channels differs from that of the first branch network, to obtain the corner feature corresponding to the target detection object.

In one possible embodiment, the extraction module is specifically configured to: determine, based on the size feature and the corner feature, a feature region that has a mapping relationship with the image region of the target detection object in the detection image; and extract the object feature corresponding to the target detection object from the feature region of the image features.

In one possible embodiment, the apparatus further includes a merging module configured to: determine detection frames of the target detection object in the detection image based on the length feature and width feature corresponding to the first corner feature and the length feature and width feature corresponding to the second corner feature; determine the Intersection-over-Union between any two overlapping detection frames; and merge the two overlapping detection frames into one detection frame when the Intersection-over-Union is greater than a preset threshold.

In one possible embodiment, the classification module is specifically configured to: perform one or more stages of convolution processing on the object feature to obtain the probability that the target detection object belongs to each of one or more preset categories; and determine the category of the target detection object from among the preset categories based on those probabilities.

According to one aspect of the present invention, an electronic device is provided, including a processor and a memory for storing instructions executable by the processor, wherein the processor is configured to execute the target detection method described above.

According to one aspect of the present invention, a computer-readable storage medium storing computer program instructions is provided, wherein the computer program instructions, when executed by a processor, implement the target detection method described above.

According to one aspect of the present invention, a computer program including computer-readable code is provided, wherein when the computer-readable code runs on an electronic device, a processor of the electronic device executes instructions for implementing the target detection method described above.

It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory and do not limit the present invention.

Other features and aspects of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the drawings.

The drawings incorporated herein as part of this specification illustrate embodiments consistent with the present invention and, together with the specification, serve to explain the technical solutions of the present invention.
FIG. 1 is a flowchart of a target detection method according to an embodiment of the present invention.
FIG. 2 is a block diagram of a detection frame of a target detection object according to an embodiment of the present invention.
FIG. 3 is a block diagram of obtaining the category of a target detection object using a neural network according to an embodiment of the present invention.
FIG. 4 is a block diagram of a target detection apparatus according to an embodiment of the present invention.
FIG. 5 is a block diagram of an example of an electronic device according to an embodiment of the present invention.

Various exemplary embodiments, features, and aspects of the present invention are described in detail below with reference to the drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although the drawings show various aspects of the embodiments, they are not necessarily drawn to scale unless otherwise noted.

The term "exemplary" as used herein means "serving as an example, an embodiment, or an illustration". Any embodiment described herein as "exemplary" should not be construed as preferred over or superior to other embodiments.

The term "and/or" in this specification describes an association between related objects and indicates that three relationships are possible. For example, "A and/or B" may mean that only A exists, that both A and B exist, or that only B exists. In addition, the term "one or more" in this specification denotes any one of a plurality, or any combination of two or more of the plurality. For example, "including one or more of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.

In addition, numerous specific details are given in the following embodiments in order to better explain the present invention. Those skilled in the art should understand that the present invention can equally be practiced without some of these specific details. In some embodiments, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present invention.

In the target detection method according to the embodiments of the present invention, a detection image to be processed is first acquired; a size feature and a corner feature corresponding to a target detection object are then determined based on the image features of the detection image; an object feature corresponding to the target detection object is extracted from the image features based on the size feature and the corner feature; and the category of the target detection object is then determined based on that object feature. A detection result is thereby obtained. In this way, the object feature corresponding to the target detection object is determined first, and the target detection object is then classified based on that object feature. The object feature and the category of the target detection object are therefore determined sequentially rather than in parallel, which yields a more accurate detection result.

In the related art, target detection generally requires densely sampling anchors that form preliminary detection frames. However, many of these anchors are invalid, so processing time and storage space are wasted on them. Furthermore, in the target detection process of the related art, the detection frame and the category of the target detection object are determined in parallel, so the information in the detection frame cannot be taken into account when determining the category, and the detection result is consequently not accurate enough. In the target detection method according to the embodiments of the present invention, the object feature of the target detection object can be determined first by determining the corners and size of the target detection object, which reduces the time and storage space wasted on collecting a large number of anchors. In addition, when two target detection objects overlap, the object features obtained from corners and sizes ease the difficulty of distinguishing different target detection objects by the center points determined from anchors. The target detection method according to the embodiments of the present invention can therefore distinguish different target detection objects by their corner features, save the time and storage space spent on collecting anchors, obtain detection results more efficiently, and obtain detection results with high accuracy.

The method according to the embodiments of the present invention can be applied to any scenario that requires target detection. For example, it may be applied to target detection on collected video, so that the trajectory of a target detection object in the video is obtained from the detection results. As another example, it may be applied to security scenarios to recognize and track a suspect based on the detection results. The target detection method according to the present invention is described below by way of embodiments.

FIG. 1 is a flowchart of a target detection method according to an embodiment of the present invention. The target detection method may be executed by a terminal device, a server, or another target detection apparatus, where the terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like. In some possible embodiments, the target detection method may be implemented by a processor invoking computer-readable instructions stored in a memory. The target detection method according to the embodiments of the present invention is described below taking a target detection apparatus as the executing entity.

As shown in FIG. 1, the target detection method includes the following steps.

Step S11: acquire a detection image to be processed.

In the embodiments of the present invention, the target detection apparatus may have an image acquisition function that allows it to photograph the current scene and thereby acquire the detection image to be processed, or it may acquire the detection image to be processed from another device. The detection image may be an individually acquired image or an image frame in a video stream.

The acquired detection image to be processed may be a detection image that has undergone pre-processing such as image scaling, image enhancement, or image filtering. For example, a pre-processed detection image can be obtained by adjusting the long side and the short side of the detection image to suitable sizes without changing its aspect ratio.
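Aspect-preserving scaling of this kind can be sketched as below. The target values of 800 pixels for the short side and 1333 pixels for the long side are placeholder assumptions borrowed from common detection practice, not values given in the source, and `scaled_size` is a hypothetical helper name.

```python
def scaled_size(width, height, short_side=800, long_side=1333):
    """Return (new_width, new_height) with the aspect ratio preserved.

    The image is scaled so its short side reaches `short_side`,
    unless that would push the long side past `long_side`.
    """
    scale = min(short_side / min(width, height),
                long_side / max(width, height))
    return round(width * scale), round(height * scale)

full_hd = scaled_size(1920, 1080)
vga = scaled_size(640, 480)
```

For a 1920x1080 frame the long-side limit is the binding one, while for a 640x480 frame the short side is scaled up to the target.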

Step S12: Based on image features of the detection image, determine a size feature and a corner feature corresponding to a target detection object.

In an embodiment of the present invention, a neural network can be used to extract the image features of the detection image, and, based on those image features, both the size feature and the corner feature corresponding to the target detection object can be determined. Here, the target detection object may be an object that needs to be detected in the detection image, for example the image of an object such as a pedestrian, a vehicle, a building, or a sign. The size feature of the target detection object can characterize the size of the image region in which the target detection object is located. For example, when the image region of the target detection object in the detection image is represented as a rectangle, the size feature corresponding to the target detection object may be a length feature and/or a width feature of that rectangle. The corner feature can characterize position information of the diagonal points of the image region in which the target detection object is located.

In one possible embodiment, the corner feature corresponding to the target detection object includes at least a first corner feature and a second corner feature corresponding to the target detection object, and the size feature corresponding to the target detection object includes a length feature and a width feature corresponding to the first corner feature of the target detection object, and a length feature and a width feature corresponding to the second corner feature of the target detection object.

In this embodiment, the corners may include a first corner and a second corner, and the first corner and the second corner may be a pair of diagonal points. Accordingly, the corner feature may include a first corner feature and a second corner feature. Representing the position information of the diagonal points of the image region in which the target detection object is located by the combination of the first corner feature and the second corner feature reduces the occurrence of cases in which different target detection objects are difficult to distinguish. Correspondingly, the size feature corresponding to the target detection object may include the length feature and the width feature corresponding to the first corner feature, and the length feature and the width feature corresponding to the second corner feature. In this way, different target detection objects can be further distinguished based on the size features corresponding to the different corner features, so that a more accurate detection result can be obtained for the target detection object.

Here, the first corner may be the upper-left corner or the upper-right corner of the target detection object. Accordingly, the second corner may be the lower-right corner or the lower-left corner of the target detection object. The first corner and the second corner may be diagonal points; that is, the first corner may be the upper-left corner and the second corner the lower-right corner, or the first corner may be the upper-right corner and the second corner the lower-left corner.

Step S13: Based on the size feature and the corner feature, extract an object feature corresponding to the target detection object from the image features.

In an embodiment of the present invention, the image size corresponding to the target detection object can be determined based on the size feature corresponding to the target detection object, and the image position of the target detection object in the detection image can be determined based on the corner feature corresponding to the target detection object. By combining the size feature and the corner feature corresponding to the target detection object, the object feature corresponding to the target detection object can be determined. The object feature can characterize the features corresponding to the image region of the target detection object in the detection image, and may indicate the image position of the target detection object.

In one possible embodiment, when the object feature corresponding to the target detection object is extracted, a feature region that has a mapping relationship with the image region of the target detection object in the detection image is determined based on the size feature and the corner feature, and the object feature corresponding to the target detection object is then extracted from that feature region of the image features.

In this embodiment, feature extraction can be performed on the detection image to obtain the image features of the detection image. The image features may be represented as a feature map. Based on the size feature and the corner feature corresponding to the target detection object, the feature region indicated by the size feature and the corner feature, which corresponds to the image region of the target detection object in the detection image, can be determined in the feature map. The image features within this feature region of the feature map may be extracted as the object feature corresponding to the target detection object. Based on the object feature, the image region of the target detection object in the detection image can be determined.

In one possible embodiment, detection frames of the target detection object in the detection image can be determined based on the length feature and the width feature corresponding to the first corner feature and the length feature and the width feature corresponding to the second corner feature. The intersection over union (IoU) between any two overlapping detection frames is then determined, and when the IoU is greater than a preset threshold, the two overlapping detection frames are merged into one detection frame.

In this embodiment, the length feature and the width feature corresponding to the first corner feature may indicate one feature region in the feature map. The length feature and the width feature corresponding to the second corner feature may likewise indicate one feature region in the feature map. The feature region indicated by the first corner feature and the feature region indicated by the second corner feature may be the same feature region or different feature regions. Based on the mapping relationship between a feature region and the image region of the target detection object in the detection image, a detection frame can be used to enclose the image region of the target detection object in the detection image. The detection frame may be a closed figure, for example a quadrilateral such as a square or a rectangle. The detection frame can identify the image region of the target detection object in the detection image, the corner features can indicate the positions of two diagonal points of the detection frame, and the size features can indicate the length and width of the detection frame.

Here, a plurality of detection frames may exist for the same target detection object, and these detection frames may overlap one another. Therefore, when detection frames of the target detection object overlap, the intersection over union (IoU) between any two overlapping detection frames can be calculated. If the calculated IoU is greater than the preset threshold, the two overlapping detection frames are considered to mark the same target detection object; the larger of the two may then be taken as the detection frame of the target detection object and the smaller one deleted. Alternatively, the two overlapping detection frames may be merged to form one new detection frame that contains both of the detection frames before merging, and the new detection frame may be taken as the detection frame of the target detection object. In this way, the obtained detection frames can be further screened, and the detection frames of the same target detection object can be merged so that one target detection object corresponds to one detection frame.
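The IoU test and the merging alternative described above can be sketched as follows. This is only an illustration under stated assumptions: detection frames are assumed to be axis-aligned tuples `(x1, y1, x2, y2)`, the threshold of 0.5 is a placeholder, and the single greedy pass in `merge_frames` is a simplification chosen here, not a procedure specified by the description.

```python
def iou(a, b):
    """Intersection over union of two frames given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def enclosing(a, b):
    """Smallest frame that contains both input frames."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_frames(frames, threshold=0.5):
    """Greedy single pass: merge a frame into the first kept frame whose
    IoU with it exceeds the threshold, otherwise keep it as a new frame."""
    kept = []
    for frame in frames:
        for i, other in enumerate(kept):
            if iou(frame, other) > threshold:
                kept[i] = enclosing(frame, other)
                break
        else:
            kept.append(frame)
    return kept


frames = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
print(merge_frames(frames))
```

The other alternative in the text, keeping the larger frame and deleting the smaller one, would replace `enclosing` with a comparison of the two frame areas.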

FIG. 2 shows a block diagram of a detection frame of a target detection object according to an embodiment of the present invention. Taking the case in which the first corner is the upper-left corner as an example, the detection frame shown in FIG. 2 can be formed based on the first corner feature and the length feature and the width feature corresponding to the first corner feature.

Step S14: Determine the category of the target detection object based on the object feature.

In an embodiment of the present invention, a neural network can be used to perform further feature extraction on the extracted object feature. For example, convolution processing, normalization processing, and the like can be performed on the object feature to obtain the category of the target detection object. For example, the target detection object belongs to a category such as vehicle, pedestrian, building, or public facility. In this way, the category of the target detection object is obtained based on the object feature, and target detection for the target detection object in the detection image is realized.

In an embodiment of the present invention, the size feature and the corner feature corresponding to the target detection object are first determined from the image features of the detection image. Next, the object feature corresponding to the target detection object is extracted from the image features based on the size feature and the corner feature. The category of the target detection object is then determined based on the extracted object feature. In this way, determining the object feature of the target detection object and classifying the target detection object are performed sequentially rather than in parallel. Because the object feature of the target detection object can be taken into account when the target detection object is classified, a more accurate classification result can be obtained and the precision of target detection can be improved.

In one possible embodiment, one or more stages of convolution processing are performed on the object feature to obtain the probability that the target detection object belongs to each of one or more preset categories. The category of the target detection object is then determined from among the preset categories based on these probabilities.

In this embodiment, a neural network can be used to perform one or more further stages of convolution processing on the extracted object feature, so that the probability that the target detection object belongs to each of one or more preset categories is obtained. For example, a preset category may be any of categories such as pedestrian, vehicle, and building. By performing further convolution processing on the object feature, the probability that the target detection object belongs to each of the preset categories, such as pedestrian, vehicle, and building, can be obtained. The preset category with the highest probability can then be determined as the category of the target detection object.
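The final selection step can be sketched as below. The description only states that per-category probabilities are obtained and the most probable category is chosen; the use of a softmax to turn raw scores into probabilities, and the names `softmax` and `pick_category`, are assumptions made for this illustration.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (assumed normalization step)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_category(scores, categories):
    """Return the preset category with the highest probability and that probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs[best]


categories = ["pedestrian", "vehicle", "building"]
label, prob = pick_category([1.0, 3.0, 0.5], categories)
print(label, round(prob, 3))
```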

According to the target detection method of an embodiment of the present invention, the object feature corresponding to the target detection object is determined first, and the object feature is then used to classify the target detection object and determine its category. In this way, a highly accurate detection result can be obtained. According to the target detection method of an embodiment of the present invention, the category of the target detection object can be obtained using a neural network. The process of obtaining the category of the target detection object using a neural network is described below.

In one possible embodiment, one or more stages of convolution processing are performed on the detection image to obtain the image features of the detection image, and corner pooling processing is then performed on the image features of the detection image to obtain the size feature and the corner feature corresponding to the target detection object.

In this embodiment, the neural network may include multiple stages of convolution layers and a corner pooling layer. The detection image may be used as the input to the neural network, and the neural network performs multiple stages of convolution processing on the detection image to obtain the image features of the detection image. The corner pooling layer of the neural network then performs corner pooling processing on the image features of the detection image, so that the size feature and the corner feature corresponding to the target detection object can be obtained.
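The description does not define the corner pooling operation itself. As a hedged illustration only, the sketch below follows the commonly described CornerNet-style pooling for an upper-left corner: at each position it adds the maximum of one feature map over all positions below and the maximum of another feature map over all positions to the right. The function name, the use of two separate single-channel inputs, and plain nested lists are assumptions for this sketch.

```python
def top_left_corner_pool(vertical, horizontal):
    """CornerNet-style upper-left corner pooling on 2-D lists (assumed form).

    down[r][c]  = max of vertical[r..][c]   (looking downward)
    right[r][c] = max of horizontal[r][c..] (looking rightward)
    The output is the element-wise sum of the two.
    """
    rows, cols = len(vertical), len(vertical[0])
    down = [row[:] for row in vertical]
    for r in range(rows - 2, -1, -1):          # sweep from the bottom row upward
        for c in range(cols):
            down[r][c] = max(down[r][c], down[r + 1][c])
    right = [row[:] for row in horizontal]
    for r in range(rows):
        for c in range(cols - 2, -1, -1):      # sweep from the right column leftward
            right[r][c] = max(right[r][c], right[r][c + 1])
    return [[down[r][c] + right[r][c] for c in range(cols)] for r in range(rows)]


feature = [[0, 1, 0],
           [2, 0, 0],
           [0, 0, 3]]
print(top_left_corner_pool(feature, feature))
```

A lower-right variant would sweep in the opposite directions (upward and leftward maxima).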

In one example of this embodiment, the convolution processing includes up-sampling processing and down-sampling processing, and performing one or more stages of convolution processing on the detection image to obtain the image features of the detection image may include: performing one or more stages of down-sampling processing on the detection image to obtain a first feature map after the one or more stages of down-sampling processing; obtaining, based on the first feature map after the one or more stages of down-sampling processing, a second feature map after one or more stages of up-sampling processing; and obtaining the image features of the detection image based on the first feature map after the one or more stages of down-sampling processing and the second feature map after the one or more stages of up-sampling processing.

In this example, the convolution processing may include up-sampling processing and down-sampling processing. First, the neural network performs multiple stages of down-sampling processing on the detection image, and a first feature map is obtained after each stage of down-sampling processing. Next, multiple stages of up-sampling processing are performed on the first feature map obtained after the final stage of the down-sampling processing, and a second feature map is obtained after each stage of up-sampling processing. The image features of the detection image can then be obtained based on the first feature maps after the multiple stages of down-sampling processing and the second feature maps after the multiple stages of up-sampling processing. For example, the first feature maps and the second feature maps can be fused to obtain the image features of the detection image. Here, the up-sampling processing can be performed by bilinear interpolation, so that an accurate second feature map is obtained.

In this example, one first feature map is output after each stage of the down-sampling processing, and one second feature map is output after each stage of the up-sampling processing. For the first stage of the one or more stages of up-sampling processing, the first feature map after the final stage of the one or more stages of down-sampling processing is used as the input to the first stage of up-sampling processing, and the second feature map output after the first stage of up-sampling processing is obtained. For the N-th stage of the one or more stages of up-sampling processing, the second feature map output after the up-sampling stage immediately preceding the N-th stage, and the first feature map matching the second feature map output after the N-th stage of up-sampling processing, are used as the input to the N-th stage of up-sampling processing, and the second feature map output by the N-th stage of up-sampling processing is obtained, where N is a positive integer greater than 1.

In this example, multiple stages of down-sampling processing are performed on the detection image, and a first feature map is obtained after each stage of down-sampling processing. The first feature map obtained after the final stage of the down-sampling processing is up-sampled by the first stage of the multiple stages of up-sampling processing, and the second feature map after the first stage of up-sampling processing is obtained. The input to the second stage of up-sampling processing can then be obtained based on the second feature map after the first stage of up-sampling processing and the first feature map matching this second feature map. For example, this second feature map is fused with the first feature map to obtain the input to the second stage of up-sampling processing. Alternatively, convolution processing is performed on the first feature map, and the result is then fused with this second feature map to obtain the input to the second stage of up-sampling processing. Here, the first feature map matching the second feature map may be the first feature map whose image size matches that of the second feature map. The second stage of up-sampling processing up-samples its input, and the second feature map after the second stage of up-sampling processing is obtained. The input to the third stage of up-sampling processing is then obtained based on the second feature map after the second stage of up-sampling processing and the first feature map matching this second feature map. Proceeding in the same manner, the second feature map after the N-th stage of up-sampling processing can be obtained, where N is a positive integer greater than 1. In this way, the image features obtained by the down-sampling processing are taken into account during the up-sampling processing, so that more accurate image features are extracted.

In one example of this embodiment, corner pooling processing can be performed on the image features of the detection image to obtain a processing result. A first branch network is then used to perform convolution processing on the processing result to obtain the size feature corresponding to the target detection object, and a second branch network, whose number of channels differs from that of the first branch network, is used to perform convolution processing on the processing result to obtain the corner feature corresponding to the target detection object.

In this example, the neural network may include two branch networks, namely a first branch network and a second branch network. After the neural network has extracted the image features of the detection image, the first branch network performs convolution processing on the image features of the detection image to obtain the feature map of the first branch network. This feature map may have four channels: one channel corresponds to the length feature of the first corner, one channel to the width feature of the first corner, one channel to the length feature of the second corner, and one channel to the width feature of the second corner. Correspondingly, the second branch network performs convolution processing on the image features of the detection image to obtain the feature map of the second branch network. This feature map may have two channels: one channel corresponds to the first corner feature and can indicate the position of the first corner in the detection image, and the other channel corresponds to the second corner feature and can indicate the position of the second corner in the detection image. In this way, the image region in which the target detection object is located can be determined based on the size feature and the corner feature corresponding to the target detection object, and the likelihood that different target detection objects cannot be distinguished is reduced.

FIG. 3 is a block diagram of obtaining a detection result for a target detection object using a neural network according to an embodiment of the present invention.

The process of obtaining the category of the target detection object using the neural network is described below by way of an example. Without changing the aspect ratio of the detection image, the long side and the short side of the detection image can be adjusted to an appropriate size; for example, the short side of the detection image can be adjusted to 800 pixels. The adjusted detection image is then input to the neural network. The neural network may include multiple stages of convolution layers. First, the neural network performs down-sampling processing on the detection image. Each stage of down-sampling processing yields one first feature map, so that four stages of convolution processing yield four first feature maps of different sizes, denoted C₂, C₃, C₄, and C₅. Here, the long side and the short side of C₂ are each twice those of C₃, the long side and the short side of C₃ are each twice those of C₄, and the long side and the short side of C₄ are each twice those of C₅. Next, a 1*1 convolution kernel is applied to C₅ to obtain a new feature map F₅, whose long side and short side are the same as those of C₅. Multiple stages of up-sampling processing are performed on F₅, and each stage of up-sampling processing yields a second feature map. That is, bilinear-interpolation up-sampling can be performed on F₅ to obtain a second feature map whose long side and short side are both doubled; this second feature map is denoted F₅'. A 1*1 convolution kernel is applied to C₄ to obtain a new feature map C₄'. C₄' and F₅' have the same size. The two feature maps C₄' and F₅' are added to obtain the input F₄ to the second stage of up-sampling processing. Up-sampling processing is then performed on F₄ to obtain a second feature map F₄' whose long side and short side are both doubled, and a 1*1 convolution kernel is applied to C₃ to obtain a new feature map C₃'. C₃' and F₄' have the same size. The two feature maps C₃' and F₄' are added to obtain the input F₃ to the third stage of up-sampling processing. Proceeding in the same manner through repeated up-sampling processing, the second feature map F₂ output after the final stage of up-sampling processing is obtained. The long side and the short side of F₂ are the same as those of C₂.
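The top-down fusion from C₅ down to F₂ can be sketched in a few lines. This is a simplified illustration under stated assumptions: each feature map is a single-channel 2-D list, so the 1*1 convolution reduces to a scalar weight and bias (placeholder values here); nearest-neighbour doubling is used in place of the bilinear interpolation named in the text, purely to keep the sketch short; and all function names are hypothetical.

```python
def conv1x1(fmap, weight=1.0, bias=0.0):
    """1*1 convolution on a single-channel map: a per-pixel affine transform."""
    return [[weight * v + bias for v in row] for row in fmap]


def upsample2x(fmap):
    """Double both sides (nearest neighbour; the text uses bilinear interpolation)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out


def add(a, b):
    """Element-wise sum of two maps of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down(c2, c3, c4, c5):
    """Return F2: F5 = conv(C5), then F4 = C4' + F5', F3 = C3' + F4', F2 = C2' + F3'."""
    f = conv1x1(c5)                                   # F5
    for lateral in (c4, c3, c2):
        f = add(conv1x1(lateral), upsample2x(f))      # e.g. F4 = C4' + F5'
    return f


def ones(n):
    return [[1.0] * n for _ in range(n)]


f2 = top_down(ones(8), ones(4), ones(2), ones(1))
print(len(f2), len(f2[0]), f2[0][0])
```

With these placeholder inputs, F₂ has the same side lengths as C₂, as the text states for the real network.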

Corner pooling processing is then performed on the second feature map F₂ to obtain a processing result. The processing result may be passed through a first branch network and a second branch network, respectively. Each branch network may include a 3*3 convolution kernel. The first branch network may form a 4-channel feature map location, and the second branch network may form a 2-channel feature map mask. Here, the two channels of the feature map mask represent the upper-left corner feature and the lower-right corner feature, respectively, and the four channels of the feature map location represent the width feature (dw) and the length feature (dh) corresponding to the upper-left corner and the width feature (dw) and the length feature (dh) corresponding to the lower-right corner, respectively.
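The corner pooling operation is not spelled out in this passage. The following minimal sketch assumes the commonly used upper-left corner pooling, in which each position takes the maximum of all values to its right in the same row and the maximum of all values below it in the same column, and the two maxima are summed. Applying both directions to the same single-channel map is a simplification, and the function name top_left_corner_pool is hypothetical.

```python
def top_left_corner_pool(fmap):
    """Upper-left corner pooling on a single-channel map given as a nested list (assumed definition)."""
    h, w = len(fmap), len(fmap[0])

    # Horizontal pass: running maximum scanned from right to left within each row.
    horiz = [row[:] for row in fmap]
    for i in range(h):
        for j in range(w - 2, -1, -1):
            horiz[i][j] = max(horiz[i][j], horiz[i][j + 1])

    # Vertical pass: running maximum scanned from bottom to top within each column.
    vert = [row[:] for row in fmap]
    for j in range(w):
        for i in range(h - 2, -1, -1):
            vert[i][j] = max(vert[i][j], vert[i + 1][j])

    # Sum the two directional maxima at every position.
    return [[horiz[i][j] + vert[i][j] for j in range(w)] for i in range(h)]
```

A lower-right variant would scan in the opposite directions. In the described embodiment, the pooled result would then feed the two branch networks that produce the location and mask feature maps.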

One feature region may be specified based on the upper-left corner feature and the lower-right corner feature, the width feature and the length feature corresponding to the upper-left corner, and the width feature and the length feature corresponding to the lower-right corner. The image feature of the feature region may be extracted from the second feature map F₂ to obtain the target feature of the target detection object. For example, the corresponding image feature within the feature region of the second feature map F₂ may be obtained by an RoI Align layer. The target feature may then be classified using a 3*3 convolution kernel to obtain the category of the target detection object in the detection image.
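As a rough illustration of how an RoI Align style layer may read features from a region with non-integer coordinates, the following sketch samples one bilinearly interpolated value at the centre of each output bin. It is a simplification under stated assumptions: practical RoI Align implementations typically average several sample points per bin and operate on multi-channel tensors. The names bilinear_sample and roi_align, the box format (x1, y1, x2, y2) in feature-map coordinates, and the out_size parameter are hypothetical.

```python
import math


def bilinear_sample(fmap, y, x):
    """Bilinearly interpolate a single-channel map (nested list) at a real-valued position."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)          # clamp to the valid range
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
    bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def roi_align(fmap, box, out_size=2):
    """Pool the region box = (x1, y1, x2, y2) into an out_size x out_size grid.

    One sample is taken at the centre of each bin (simplifying assumption).
    """
    x1, y1, x2, y2 = box
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    return [
        [
            bilinear_sample(fmap, y1 + (i + 0.5) * bin_h, x1 + (j + 0.5) * bin_w)
            for j in range(out_size)
        ]
        for i in range(out_size)
    ]
```

Because the bins are defined on real-valued coordinates, the region does not need to be snapped to the feature-map grid before pooling.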

Here, the detection frame of the target detection object may be obtained from the upper-left corner feature, the lower-right corner feature, the dw and dh corresponding to the upper-left corner, and the dw and dh corresponding to the lower-right corner.

Taking the width of the detection frame as an example, the width of the detection frame is calculated by the following formula (1).

Here, w is the image width of the detection frame; s, β, and α may be mapping parameters, which can be obtained from the network parameters; and dw is the width feature.

When there are a plurality of detection frames for the target detection object, non-maximum suppression processing may be performed on the plurality of detection frames so that they are merged into one detection frame, thereby obtaining the final detection result for the target detection object.
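A minimal sketch of non-maximum suppression over detection frames follows. It assumes each frame is given as (x1, y1, x2, y2) and carries a confidence score, and it uses a greedy keep-the-highest-score strategy with an intersection-over-union threshold. The per-frame scores, the 0.5 default threshold, and the names iou and nms are assumptions for illustration and are not taken from the text.

```python
def iou(a, b):
    """Intersection over union of two frames given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, threshold=0.5):
    """Return indices of the frames kept after greedy non-maximum suppression."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a frame only if it does not overlap too much with any frame already kept.
        if all(iou(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep
```

For example, two heavily overlapping frames around the same object collapse to the higher-scoring one, while a distant third frame is kept as a separate detection.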

According to the target detection method of the embodiments of the present invention, the detection frame of the target detection object obtained on the basis of corners can be predicted more effectively. The detection frame can be predicted more accurately, and the problem of low precision of the predicted detection frame caused by overlapping target detection objects can be effectively alleviated. In addition, the prediction of the detection frame and the classification of the target detection object are not performed in parallel. That is, the size feature and the corner feature indicating the position of the detection frame are obtained first, and the target detection object is then classified based on the target feature specified by the size feature and the corner feature, so that a more accurate classification result can be obtained.

It should be understood that the method embodiments mentioned in the present invention may be combined with one another to form combined embodiments as long as principle and logic are not violated. Owing to space limitations, a detailed description is omitted in the present invention.

The present invention also provides a target detection apparatus, an electronic device, a computer-readable storage medium, and a program, all of which can be used to implement any of the target detection methods according to the present invention. For the corresponding technical solutions and descriptions, reference may be made to the corresponding descriptions of the methods, and a detailed description is omitted.

Those skilled in the art should also understand that, in the above methods of the specific embodiments, the order in which the steps are written does not imply a strict execution order and does not limit the implementation process; the execution order of the steps is determined specifically by their functions and internal logic.

FIG. 4 is a block diagram of a target detection apparatus according to an embodiment of the present invention. As shown in FIG. 4, the target detection apparatus includes:

an acquisition module 41 configured to acquire a detection image to be processed;

a specifying module 42 configured to specify, based on an image feature of the detection image, a size feature and a corner feature corresponding to a target detection object;

an extraction module 43 configured to extract, based on the size feature and the corner feature, a target feature corresponding to the target detection object from the image feature; and

a classification module 44 configured to specify a category of the target detection object based on the target feature.

In one possible embodiment, the specifying module 42 is specifically configured to perform one or more stages of convolution processing on the detection image to obtain the image feature of the detection image, and to perform corner pooling processing on the image feature of the detection image to obtain the size feature and the corner feature corresponding to the target detection object.

In one possible embodiment, the convolution processing includes upsampling processing and downsampling processing, and the specifying module 42 is specifically configured to perform one or more stages of downsampling processing on the detection image to obtain a first feature map after the one or more stages of downsampling processing,

to obtain, based on the first feature map after the one or more stages of downsampling processing, a second feature map after one or more stages of upsampling processing, and to obtain the image feature of the detection image based on the first feature map after the one or more stages of downsampling processing and the second feature map after the one or more stages of upsampling processing.

In one possible embodiment, one first feature map is output after each stage of the downsampling processing, and one second feature map is output after each stage of the upsampling processing. The specifying module 42 is specifically configured as follows. For the first-stage upsampling processing among the one or more stages of upsampling processing, the first feature map after the final-stage downsampling processing among the one or more stages of downsampling processing is used as the input to the first-stage upsampling processing, and the second feature map output after the first-stage upsampling processing is obtained. For the N-th-stage upsampling processing among the one or more stages of upsampling processing, the second feature map output after the upsampling processing immediately preceding the N-th-stage upsampling processing, and the first feature map matching the second feature map output after the N-th-stage upsampling processing, are used as the input to the N-th-stage upsampling processing, and the second feature map output by the N-th-stage upsampling processing is obtained, where N is a positive integer greater than 1.

In one possible embodiment, the specifying module 42 is specifically configured to perform feature fusion on the second feature map output after the upsampling processing immediately preceding the N-th-stage upsampling processing and the first feature map matching the second feature map output after the N-th-stage upsampling processing, to obtain the input to the N-th-stage upsampling processing.

In one possible embodiment, the specifying module 42 is specifically configured to perform corner pooling processing on the image feature of the detection image to obtain a processing result, to perform convolution processing on the processing result using a first branch network to obtain the size feature corresponding to the target detection object, and to perform convolution processing on the processing result using a second branch network, whose number of channels differs from that of the first branch network, to obtain the corner feature corresponding to the target detection object.

In one possible embodiment, the extraction module 43 is specifically configured to specify, based on the size feature and the corner feature, a feature region that has a mapping relationship with the image region of the target detection object in the detection image, and to extract the target feature corresponding to the target detection object from the feature region of the image feature.

In one possible embodiment, the corner feature corresponding to the target detection object includes at least a first corner feature and a second corner feature corresponding to the target detection object, and the size feature corresponding to the target detection object includes a length feature and a width feature corresponding to the first corner feature of the target detection object, and a length feature and a width feature corresponding to the second corner feature of the target detection object.

In one possible embodiment, the apparatus further includes a merging module configured to specify detection frames of the target detection object in the detection image based on the length feature and the width feature corresponding to the first corner feature and the length feature and the width feature corresponding to the second corner feature, to determine the intersection over union (IoU) between any two overlapping detection frames, and, when the intersection over union is greater than a preset threshold, to merge the two overlapping detection frames into one detection frame.

In one possible embodiment, the classification module 44 is specifically configured to perform one or more stages of convolution processing on the target feature to obtain the probability that the target detection object belongs to each preset category, and to specify the category of the target detection object from among the preset categories based on those probabilities.
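The final selection step may be illustrated as follows. This minimal sketch assumes that the convolution stages produce one raw score per preset category, that a softmax converts the scores into probabilities, and that the category with the highest probability is chosen. The softmax, the highest-probability rule, the example category names, and the function names softmax and pick_category are assumptions; the text only states that the category is specified based on the probabilities.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1 (numerically stable form)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_category(scores, categories):
    """Return (category, probability) for the most probable preset category."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return categories[best], probs[best]
```

For instance, pick_category([0.1, 2.0, -1.0], ["person", "car", "bicycle"]) selects "car", the category with the largest score and therefore the largest probability.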

In some embodiments, the functions or modules of the apparatus according to the embodiments of the present invention may be used to execute the methods described in the above method embodiments. For their specific implementation, reference may be made to the description of the above method embodiments, and a detailed description is omitted here for brevity.

An embodiment of the present invention further provides an electronic device including a processor and a memory for storing instructions executable by the processor, wherein the processor is configured to execute the above method.

The electronic device may be provided as a terminal, a server, or a device of another form.

FIG. 5 is a block diagram of an electronic device 1900 according to an exemplary embodiment. For example, the electronic device 1900 may be provided as a server. Referring to FIG. 5, the electronic device 1900 includes a processing component 1922, which includes one or more processors, and memory resources, represented by a memory 1932, for storing instructions executable by the processing component 1922, for example, application programs. The application programs stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. The processing component 1922 is configured to execute the above method by executing the instructions.

The electronic device 1900 may further include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in the memory 1932, for example, Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

In an exemplary embodiment, a volatile or non-volatile computer-readable storage medium is further provided, for example, the memory 1932 containing computer program instructions, and the computer program instructions, when executed by the processing component 1922 of the electronic device 1900, cause the above method to be executed.

In an exemplary embodiment, a computer program is further provided that includes computer-readable code, wherein, when the computer-readable code runs on an electronic device, a processor of the electronic device executes instructions for implementing the above method.

The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement aspects of the present invention.

The computer-readable storage medium may be a tangible device capable of holding and storing instructions used by an instruction execution device. The computer-readable storage medium may be, for example, an electrical storage device, a magnetic storage device, an optical storage device, an electronic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punched card or a raised structure in a groove on which instructions are stored, and any suitable combination of the foregoing. The computer-readable storage medium used herein is not to be construed as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse passing through an optical fiber cable), or an electrical signal transmitted through a wire.

The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to respective computing/processing devices, or may be downloaded to an external computer or an external storage device via a network, for example, the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.

The computer program instructions for carrying out the operations of the present invention may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer via any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, via the Internet using an Internet service provider). In some embodiments, an electronic circuit such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA) is personalized using state information of the computer-readable program instructions, and the electronic circuit executes the computer-readable program instructions to implement aspects of the present invention.

Aspects of the present invention are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, implement the functions/operations specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium and may cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner. The computer-readable storage medium in which the instructions are stored thus includes an article of manufacture containing instructions that implement aspects of the functions/operations specified in one or more blocks of the flowcharts and/or block diagrams.

The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device, so as to produce a computer-implemented process. In this way, the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/operations specified in one or more blocks of the flowcharts and/or block diagrams.

The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to a plurality of embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions, which includes one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may be executed substantially in parallel, or may be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The embodiments of the present invention have been described above. The foregoing description is merely exemplary, is not exhaustive, and is not limited to the disclosed embodiments. Various modifications and changes will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their improvement over existing technology, or to enable others skilled in the art to understand the embodiments disclosed herein.

Claims

acquiring a detection image to be processed;
specifying a size characteristic and a corner characteristic corresponding to a target detection target based on the image characteristic of the detection image;
extracting a target feature corresponding to the target detection target from the image feature based on the size feature and the corner feature;
and specifying a category of the target detection target based on the target characteristic.

The method of claim 1,
Specifying the size feature and the corner feature corresponding to the target detection target based on the image feature of the detection image
performing one or more steps of convolution processing on the detected image to obtain image characteristics of the detected image;
and performing corner pooling processing on the image features of the detection image, and obtaining size features and corner features corresponding to the target detection object.

3. The method of claim 2,
The convolution process includes an up-sampling process and a down-sampling process,
Performing one or more stages of convolution processing on the detected image, and obtaining image characteristics of the detected image
performing one or more stages of downsampling processing on the detected image, and obtaining a first feature map after one or more stages of downsampling processing;
obtaining a second feature map after one or more stages of upsampling processing on the basis of the first feature map after the one or more stages of downsampling;
and obtaining image features of the detection image based on the first feature map after the one or more stages of downsampling processing and the second feature map after the one or more stages of upsampling processing.

4. The method of claim 3,
wherein one first feature map is output after each stage of downsampling processing, and one second feature map is output after each stage of upsampling processing, and
obtaining the second feature map after the one or more stages of upsampling processing based on the first feature map after the one or more stages of downsampling processing comprises:
for the first stage of upsampling processing among the one or more stages of upsampling processing, using the first feature map output after the last stage of downsampling processing among the one or more stages of downsampling processing as an input to the first stage of upsampling processing, and
obtaining a second feature map output after the first stage of upsampling processing; and
for an N-th stage of upsampling processing among the one or more stages of upsampling processing, using the second feature map output after the stage of upsampling processing immediately preceding the N-th stage, and a first feature map matching that second feature map, as an input to the N-th stage of upsampling processing, and
obtaining a second feature map output by the N-th stage of upsampling processing, wherein N is a positive integer greater than 1.

5. The method of claim 4,
wherein using the second feature map output after the stage of upsampling processing immediately preceding the N-th stage, and the first feature map matching that second feature map, as the input to the N-th stage of upsampling processing comprises:
performing feature fusion on the second feature map output after the stage of upsampling processing immediately preceding the N-th stage and the first feature map matching that second feature map, to obtain the input to the N-th stage of upsampling processing.
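
The staged downsampling, upsampling, and feature fusion recited in claims 3 to 5 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not stated in the claims: 2x2 average pooling stands in for a downsampling stage, nearest-neighbor repetition stands in for an upsampling stage, element-wise addition stands in for feature fusion, and a "matching" first feature map is taken to be the one with the same spatial size. The function names are hypothetical.

```python
def downsample(fmap):
    """One downsampling stage: 2x2 average pooling (assumed stand-in for a strided convolution)."""
    h, w = len(fmap), len(fmap[0])
    return [[(fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]


def upsample(fmap):
    """One upsampling stage: nearest-neighbor repetition that doubles height and width."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse(a, b):
    """Feature fusion, assumed here to be element-wise addition of equally sized maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def encode_decode(image, stages):
    """Return (first feature maps, second feature maps) for the given number of stages.

    Assumes the image height and width are divisible by 2 ** stages.
    """
    if stages < 1:
        raise ValueError("stages must be at least 1")
    firsts = []
    x = image
    for _ in range(stages):
        x = downsample(x)
        firsts.append(x)  # one first feature map per downsampling stage
    # Stage 1 of upsampling takes the last first feature map as its input.
    seconds = [upsample(firsts[-1])]
    # Stage N (N > 1) takes the previous second feature map fused with the
    # first feature map of the same spatial size.
    for n in range(2, stages + 1):
        prev = seconds[-1]
        match = firsts[stages - n]
        seconds.append(upsample(fuse(prev, match)))
    return firsts, seconds


if __name__ == "__main__":
    img = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
    firsts, seconds = encode_decode(img, 2)
    print([len(f) for f in firsts], [len(s) for s in seconds])
```

In a trained network each stage would be a learned convolution; the sketch only shows how the inputs of the successive stages are wired together.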

6. The method of claim 2,
wherein performing corner pooling processing on the image features of the detection image to obtain the size feature and the corner feature corresponding to the target detection object comprises:
performing corner pooling processing on the image features of the detection image to obtain a processing result;
performing convolution processing on the processing result using a first branch network to obtain the size feature corresponding to the target detection object; and
performing convolution processing on the processing result using a second branch network that differs from the first branch network in the number of channels, to obtain the corner feature corresponding to the target detection object.
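
As an illustration of the corner pooling step, the following Python sketch applies a top-left corner pooling to a single-channel feature map. The claim does not say which corner pooling variant is used, so this follows the commonly described top-left form as an assumption: at each position, the maximum of all values to its right in the same row is added to the maximum of all values below it in the same column. Using the same map for both directions is a simplification, the function name is hypothetical, and the two branch networks are not modeled.

```python
def top_left_corner_pool(fmap):
    """Top-left corner pooling on a 2D list (assumed variant, single channel)."""
    h, w = len(fmap), len(fmap[0])

    # Running maximum scanning each row from right to left.
    right_max = [row[:] for row in fmap]
    for i in range(h):
        for j in range(w - 2, -1, -1):
            right_max[i][j] = max(fmap[i][j], right_max[i][j + 1])

    # Running maximum scanning each column from bottom to top.
    down_max = [row[:] for row in fmap]
    for j in range(w):
        for i in range(h - 2, -1, -1):
            down_max[i][j] = max(fmap[i][j], down_max[i + 1][j])

    # The pooled response is the sum of the two directional maxima.
    return [[right_max[i][j] + down_max[i][j] for j in range(w)] for i in range(h)]


if __name__ == "__main__":
    features = [[0, 1, 0],
                [2, 0, 0],
                [0, 0, 3]]
    for row in top_left_corner_pool(features):
        print(row)
```

The pooled map would then be passed to the two convolution branches, one producing the size feature and one producing the corner feature.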

7. The method according to any one of claims 1 to 6,
wherein extracting the target feature corresponding to the target detection object from the image features based on the size feature and the corner feature comprises:
specifying, based on the size feature and the corner feature, a feature region having a mapping relationship with an image region of the target detection object in the detection image; and
extracting the target feature corresponding to the target detection object from the feature region of the image features.
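
A minimal Python sketch of the region mapping in claim 7 follows. It assumes, for illustration only, that the size feature and corner feature have already been decoded into an image-space box `(x1, y1, x2, y2)` in integer pixels, and that the feature map is smaller than the image by a fixed integer `stride`. Neither the decoding nor the stride is specified in the claim, and `crop_feature_region` is a hypothetical name.

```python
def crop_feature_region(feature_map, box, stride):
    """Crop the feature-map region that maps to an image-space box.

    feature_map: 2D list of feature values (single channel for brevity).
    box: (x1, y1, x2, y2) in image pixels, assumed already decoded.
    stride: assumed integer ratio between image size and feature-map size.
    """
    x1, y1, x2, y2 = box
    h, w = len(feature_map), len(feature_map[0])
    # Floor the top-left coordinate and ceil the bottom-right coordinate so the
    # cropped region fully covers the box, then clamp to the feature map.
    fx1 = max(0, x1 // stride)
    fy1 = max(0, y1 // stride)
    fx2 = min(w, -(-x2 // stride))
    fy2 = min(h, -(-y2 // stride))
    return [row[fx1:fx2] for row in feature_map[fy1:fy2]]


if __name__ == "__main__":
    fmap = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(crop_feature_region(fmap, (4, 4, 12, 8), stride=4))
```

The cropped region plays the role of the target feature that is later passed to the classification step.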

8. The method of claim 7,
wherein the corner feature corresponding to the target detection object includes at least a first corner feature and a second corner feature corresponding to the target detection object, and
the size feature corresponding to the target detection object includes a length feature and a width feature corresponding to the first corner feature of the target detection object, and a length feature and a width feature corresponding to the second corner feature of the target detection object.

9. The method of claim 8, further comprising:
specifying a detection frame of the target detection object in the detection image based on the length feature and the width feature corresponding to the first corner feature and the length feature and the width feature corresponding to the second corner feature;
determining an intersection over union between any two overlapping detection frames; and
merging the two overlapping detection frames into one detection frame when the intersection over union is greater than a preset threshold.
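
The overlap measure in claim 9 is the ratio of the intersection area of two detection frames to the area of their union. The Python sketch below computes it and merges frames whose ratio exceeds a threshold. The claim does not say how two frames are combined into one, so replacing them with their enclosing box is an assumption, as are the function names and the `(x1, y1, x2, y2)` frame layout.

```python
def iou(a, b):
    """Intersection over union of two frames given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_frames(frames, threshold):
    """Merge any two frames whose IoU exceeds the threshold.

    Assumption: a merged frame is the smallest box enclosing both frames.
    """
    merged = []
    for frame in frames:
        frame = tuple(frame)
        i = 0
        while i < len(merged):
            if iou(frame, merged[i]) > threshold:
                other = merged.pop(i)
                frame = (min(frame[0], other[0]), min(frame[1], other[1]),
                         max(frame[2], other[2]), max(frame[3], other[3]))
                i = 0  # the enlarged frame may now overlap earlier frames
            else:
                i += 1
        merged.append(frame)
    return merged


if __name__ == "__main__":
    frames = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    print(merge_frames(frames, threshold=0.5))
```

Other merge rules, such as keeping the higher-scoring frame or averaging coordinates, would fit the same claim language; the enclosing box is used here only because it needs no confidence scores.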

10. The method according to any one of claims 1 to 9,
wherein specifying the category of the target detection object based on the target feature comprises:
performing one or more stages of convolution processing on the target feature to obtain a probability that the target detection object belongs to each of one or more preset categories; and
specifying the category of the target detection object among the preset categories based on the probability that the target detection object belongs to the one or more preset categories.
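
The final selection step of claim 10 can be sketched in Python as follows. The sketch assumes that the convolution stages have already reduced the target feature to one raw score per preset category, that a softmax turns the scores into probabilities, and that the category with the highest probability is chosen. The softmax, the highest-probability rule, the category names, and the function names are all illustrative assumptions rather than details given in the claim.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable form)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_category(scores, categories):
    """Return (category, probability) for the most probable preset category."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs[best]


if __name__ == "__main__":
    # Hypothetical preset categories and raw scores.
    categories = ["person", "vehicle", "animal"]
    label, prob = pick_category([0.1, 2.0, -1.0], categories)
    print(label, round(prob, 3))
```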

11. A target detection apparatus, comprising:
an acquisition module for acquiring a detection image to be processed;
a specifying module for specifying a size feature and a corner feature corresponding to a target detection object based on image features of the detection image;
an extraction module for extracting a target feature corresponding to the target detection object from the image features based on the size feature and the corner feature; and
a classification module for specifying a category of the target detection object based on the target feature.

12. The apparatus of claim 11,
wherein the specifying module is specifically configured to:
perform one or more stages of convolution processing on the detection image to obtain the image features of the detection image; and
perform corner pooling processing on the image features of the detection image to obtain the size feature and the corner feature corresponding to the target detection object.

13. The apparatus of claim 12,
wherein the convolution processing includes upsampling processing and downsampling processing, and
the specifying module is specifically configured to:
perform one or more stages of downsampling processing on the detection image to obtain a first feature map after the one or more stages of downsampling processing;
obtain a second feature map after one or more stages of upsampling processing based on the first feature map after the one or more stages of downsampling processing; and
obtain the image features of the detection image based on the first feature map after the one or more stages of downsampling processing and the second feature map after the one or more stages of upsampling processing.

14. The apparatus of claim 13,
wherein one first feature map is output after each stage of downsampling processing, and one second feature map is output after each stage of upsampling processing, and
the specifying module is specifically configured to:
for the first stage of upsampling processing among the one or more stages of upsampling processing, use the first feature map output after the last stage of downsampling processing among the one or more stages of downsampling processing as an input to the first stage of upsampling processing, and
obtain a second feature map output after the first stage of upsampling processing; and
for an N-th stage of upsampling processing among the one or more stages of upsampling processing, use the second feature map output after the stage of upsampling processing immediately preceding the N-th stage, and a first feature map matching that second feature map, as an input to the N-th stage of upsampling processing, and
obtain a second feature map output by the N-th stage of upsampling processing, wherein N is a positive integer greater than 1.

15. The apparatus of claim 14,
wherein the specifying module is specifically configured to perform feature fusion on the second feature map output after the stage of upsampling processing immediately preceding the N-th stage and the first feature map matching that second feature map, to obtain the input to the N-th stage of upsampling processing.

16. The apparatus of claim 12,
wherein the specifying module is specifically configured to:
perform corner pooling processing on the image features of the detection image to obtain a processing result;
perform convolution processing on the processing result using a first branch network to obtain the size feature corresponding to the target detection object; and
perform convolution processing on the processing result using a second branch network that differs from the first branch network in the number of channels, to obtain the corner feature corresponding to the target detection object.

17. The apparatus according to any one of claims 11 to 16,
wherein the extraction module is specifically configured to:
specify, based on the size feature and the corner feature, a feature region having a mapping relationship with an image region of the target detection object in the detection image; and
extract the target feature corresponding to the target detection object from the feature region of the image features.

18. The apparatus of claim 17,
wherein the corner feature corresponding to the target detection object includes at least a first corner feature and a second corner feature corresponding to the target detection object, and
the size feature corresponding to the target detection object includes a length feature and a width feature corresponding to the first corner feature of the target detection object, and a length feature and a width feature corresponding to the second corner feature of the target detection object.

19. The apparatus of claim 18,
further comprising a merging module configured to:
specify a detection frame of the target detection object in the detection image based on the length feature and the width feature corresponding to the first corner feature and the length feature and the width feature corresponding to the second corner feature;
determine an intersection over union between any two overlapping detection frames; and
merge the two overlapping detection frames into one detection frame when the intersection over union is greater than a preset threshold.

20. The apparatus according to any one of claims 11 to 19,
wherein the classification module is specifically configured to:
perform one or more stages of convolution processing on the target feature to obtain a probability that the target detection object belongs to each of one or more preset categories; and
specify the category of the target detection object among the preset categories based on the probability that the target detection object belongs to the one or more preset categories.

21. An electronic device, comprising:
a processor; and
a memory for storing instructions executable by the processor,
wherein the processor is configured to execute the method according to any one of claims 1 to 10 by invoking the instructions stored in the memory.

22. A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method of any one of claims 1 to 10.

23. A computer program comprising computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor of the electronic device executes instructions for implementing the method of any one of claims 1 to 10.