KR102523886B1 - A method and a device for detecting small target

Info

Publication number: KR102523886B1
Application number: KR1020210040639A
Authority: KR
Inventors: 강 허
Original assignee: 아폴로 인텔리전트 커넥티비티 (베이징) 테크놀로지 씨오., 엘티디.
Priority date: 2020-05-27
Filing date: 2021-03-29
Publication date: 2023-04-21
Also published as: JP2021179971A; CN111626208B; KR20210042275A; JP7262503B2; CN111626208A

Abstract

An embodiment of the present invention discloses a small target detection method and apparatus. A small target detection method according to one aspect includes: acquiring an original image including a small target; reducing the original image to a low-resolution image; identifying a candidate region containing the small target in the low-resolution image using a lightweight segmentation network; and taking the region of the original image corresponding to the candidate region as a region of interest and executing a pre-trained detection model on the region of interest to determine the position of the small target in the original image. This embodiment provides a two-stage detection method: a lightweight segmentation network first locates the region of interest, and the detection model is then executed only on that region, which greatly reduces the amount of computation.

Description

Small target detection method and apparatus {A METHOD AND A DEVICE FOR DETECTING SMALL TARGET}

Embodiments of the present invention relate to the field of computer technology, and specifically to a method and apparatus for detecting a small target.

Target detection is an important research direction in the field of autonomous driving. The main detection targets fall into two types: stationary targets and moving targets. Stationary targets include traffic lights, traffic signs, lanes, and obstacles; moving targets include vehicles, pedestrians, and non-motorized vehicles. Among these, traffic sign detection provides a moving unmanned vehicle with rich and essential navigation information and is therefore a very important basic task.

In applications such as AR navigation, it is very important to detect the traffic signs on the current road section in real time and to prompt the user accordingly. In vehicle-mounted video, traffic signs span a wide range of sizes and include a large number of small targets (20 pixels or less). Small target detection not only challenges the detection algorithm itself but also requires the image to be kept at high resolution, which places a heavy burden on the limited computing power of the in-vehicle machine.

To ensure the timeliness of traffic sign identification, most existing approaches train a YOLO model on input images and complete identification by predicting, from the obtained prediction values, the class to which a traffic sign belongs. The training network of the YOLO model is a CNN that includes seven convolutional layers, C1 to C7, and two fully connected layers, so identification can be completed relatively quickly. However, a traffic sign generally occupies only a very small part of the collected original image, and the feature map shrinks each time it passes through a convolutional layer. When multiple convolutions are performed with the existing YOLO approach, the features of relatively small image regions are therefore easily lost, which lowers the success rate of traffic sign identification.

Embodiments of the present invention provide a method and apparatus for detecting a small target.

A small target detection method according to one aspect includes: acquiring an original image including a small target; reducing the original image to a low-resolution image; identifying a candidate region containing the small target in the low-resolution image using a lightweight segmentation network; and taking the region of the original image corresponding to the candidate region as a region of interest and executing a pre-trained detection model on the region of interest to determine the position of the small target in the original image.

In some embodiments, the detection model is trained by: determining a network structure of an initial detection model and initializing network parameters of the initial detection model; acquiring a training sample set, in which each training sample includes a sample image and labeling information indicating the position of a small target in the sample image; augmenting the training samples by at least one of copying, multi-scale transformation, and editing; using the sample images and the labeling information of the training samples in the augmented training sample set as the input and the expected output of the initial detection model, respectively, and training the initial detection model by a machine learning method; and determining the trained initial detection model as the pre-trained detection model.

In some embodiments, a training sample is edited by cropping a small target out of the sample image, applying zoom and/or rotation to the small target, and then pasting it at a random other position in the sample image to obtain a new sample image.
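The editing-based augmentation described above can be illustrated with a minimal sketch. It is not the implementation of the invention: the function name `paste_small_target`, the list-of-rows image representation, and the `(x0, y0, x1, y1)` box convention with exclusive end coordinates are assumptions made for illustration. The zoom and rotation operations are omitted for brevity, and no check is made that the pasted patch avoids existing targets.

```python
import random


def paste_small_target(image, box, rng):
    """Copy the patch inside `box` to a random position of the image.

    image: 2-D list of pixel values (list of rows); hypothetical format.
    box:   (x0, y0, x1, y1) of the small target, end coordinates exclusive.
    rng:   a random.Random instance, so the result is reproducible.
    Returns the new image and the box of the pasted copy, which serves
    as an additional label for the new sample.
    """
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    height, width = len(image), len(image[0])

    # Crop the small target out of the sample image.
    patch = [row[x0:x1] for row in image[y0:y1]]

    # Work on a copy so the original sample stays untouched.
    new_image = [row[:] for row in image]

    # Choose a top-left corner such that the patch fits inside the image.
    nx = rng.randint(0, width - w)
    ny = rng.randint(0, height - h)
    for dy in range(h):
        new_image[ny + dy][nx:nx + w] = patch[dy]

    return new_image, (nx, ny, nx + w, ny + h)


# A 6 x 8 toy image whose pixel value encodes its position.
sample = [[10 * y + x for x in range(8)] for y in range(6)]
augmented, new_box = paste_small_target(sample, (2, 1, 5, 3), random.Random(0))
```

The returned box would be appended to the labeling information of the new sample image, so that both the original target and its pasted copy are labeled.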

In some embodiments, the method further includes: when producing training samples for the segmentation network, setting the pixels inside the rectangular boxes used for the original detection task as positive samples and the pixels outside the rectangular boxes as negative samples; expanding outward the rectangular box of each small target whose length and width are smaller than a predetermined number of pixels; and setting all pixels inside the expanded rectangular box as positive samples.

In some embodiments, the detection model is a deep neural network.

In some embodiments, the detection model fuses the features of the respective prediction layers and then introduces an attention module that learns a suitable weight for the features of each channel.

A small target detection apparatus according to another aspect includes: an acquisition unit that acquires an original image including a small target; a reduction unit that reduces the original image to a low-resolution image; a first detection unit that identifies a candidate region containing the small target in the low-resolution image using a lightweight segmentation network; and a second detection unit that takes the region of the original image corresponding to the candidate region as a region of interest and executes a pre-trained detection model on the region of interest to determine the position of the small target in the original image.

In some embodiments, the apparatus provided in the embodiments of the present invention further includes a training unit, wherein the training unit: determines a network structure of an initial detection model and initializes network parameters of the initial detection model; acquires a training sample set, in which each training sample includes a sample image and labeling information indicating the position of a small target in the sample image; augments the training samples by at least one of copying, multi-scale transformation, and editing; uses the sample images and the labeling information of the training samples in the augmented training sample set as the input and the expected output of the initial detection model, respectively, and trains the initial detection model using a machine learning device; and determines the trained initial detection model as the pre-trained detection model.

In some embodiments, the training unit crops a small target out of the sample image, applies zoom and/or rotation to the small target, and then pastes it at a random other position in the sample image to obtain a new sample image.

In some embodiments, the first detection unit is configured to: when producing training samples for the segmentation network, set the pixels inside the rectangular boxes used for the original detection task as positive samples and the pixels outside the rectangular boxes as negative samples; expand outward the rectangular box of each small target whose length and width are smaller than a predetermined number of pixels; and set all pixels inside the expanded rectangular box as positive samples.

An electronic device according to another aspect includes at least one processor and a storage device storing at least one program; when the at least one program is executed by the at least one processor, the at least one processor implements the method described above.

A computer-readable storage medium according to another aspect stores a computer program, and the method described above is implemented when the program is executed by a processor.

A computer program according to another aspect is stored in a computer-readable storage medium, and the method described above is implemented when the computer program is executed by a processor.

The small target detection method and apparatus provided in the embodiments of the present invention address the problem mainly from three aspects: the training method, the model structure, and two-stage detection. The training method and the model structure mainly serve to improve the model's ability to detect small targets, while two-stage detection reduces the computation spent on irrelevant regions of the image and thereby increases processing speed.

The present invention can provide a real-time traffic sign detection algorithm for AR navigation projects. Because it performs better in small target detection, it can improve the user's navigation experience.

Other features, objects, and advantages of the present invention will become more apparent upon reading the detailed description of the non-limiting embodiments with reference to the accompanying drawings below.
FIG. 1 is an exemplary system architecture to which an embodiment of the present invention may be applied.
FIG. 2 is a flowchart of an embodiment of the small target detection method according to the present invention.
FIG. 3 is a schematic diagram of an application scenario of the small target detection method according to the present invention.
FIG. 4 is a flowchart of another embodiment of the small target detection method according to the present invention.
FIG. 5 is a network structure diagram of the detection model of the small target detection method according to the present invention.
FIG. 6 is a structural schematic diagram of an embodiment of the small target detection apparatus according to the present invention.
FIG. 7 is a structural schematic diagram of a computer system of an electronic device suitable for implementing an embodiment of the present invention.

The terms used in the present embodiments have been selected, as far as possible, from general terms that are currently in wide use, taking into account their functions in the present embodiments, but they may vary depending on the intention of those skilled in the art, precedent, the emergence of new technologies, and the like. In certain cases, a term has been arbitrarily selected by the applicant, and its meaning is then described in detail in the relevant part. Therefore, the terms used in the present embodiments should be defined on the basis of their meaning and the overall content of the present embodiments, not simply by their names.

Since the present embodiments may be modified in various ways and may take various forms, some embodiments are illustrated in the drawings and described in detail. This is not intended to limit the present embodiments to a specific disclosed form, and the embodiments should be understood to include all changes, equivalents, and substitutes falling within their spirit and technical scope. The terms used in this specification serve only to describe the embodiments and are not intended to limit them.

Unless otherwise defined, the terms used in the present embodiments have the same meaning as commonly understood by a person of ordinary skill in the art to which the present embodiments belong. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and, unless explicitly defined in the present embodiments, should not be interpreted in an idealized or excessively formal sense.

The detailed description of the present invention that follows refers to the accompanying drawings, which show, by way of illustration, specific embodiments in which the present invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the present invention. It should be understood that the various embodiments of the present invention differ from one another but need not be mutually exclusive. For example, specific shapes, structures, and characteristics described herein may be changed from one embodiment to another without departing from the spirit and scope of the present invention. It should also be understood that the position or arrangement of individual components within each embodiment may be changed without departing from the spirit and scope of the present invention. Therefore, the detailed description below is not to be taken in a limiting sense, and the scope of the present invention should be taken to cover the scope claimed by the claims and all equivalents thereof. Like reference numerals in the drawings denote the same or similar components throughout the various aspects.

Meanwhile, technical features that are described individually within one drawing in this specification may be implemented individually or simultaneously.

In this specification, a "~unit" may be a hardware component such as a processor or a circuit, and/or a software component executed by a hardware component such as a processor.

The present invention is described in more detail below with reference to the accompanying drawings and embodiments. It will be understood that the specific embodiments described here serve only to explain the related invention and do not limit the present invention. It should also be noted that, for convenience of description, the drawings show only the parts related to the invention.

It should be noted that the embodiments of the present invention and the features of the embodiments may be combined with one another unless they conflict. The present invention is described in detail below with reference to the accompanying drawings and the embodiments.

FIG. 1 is an exemplary system architecture to which an embodiment of the present invention may be applied.

As shown in FIG. 1, the system architecture (100) may include a vehicle (101) and a traffic sign (102).

The vehicle (101) may be an ordinary engine-powered vehicle or an unmanned vehicle. A controller (1011), a network (1012), and a sensor (1013) may be installed in the vehicle (101). The network (1012) provides the medium for a communication link between the controller (1011) and the sensor (1013). The network (1012) may include various connection types, for example wired or wireless communication links or fiber-optic cables.

The controller (1011), also referred to as the vehicle brain, is responsible for the smart control of the vehicle (101). The controller (1011) may be a separately installed controller such as a programmable logic controller (PLC), a single-chip microcomputer, or an industrial control computer; it may be another device composed of electronic components that has input/output ports and computing and control functions; or it may be a computer device on which a vehicle driving control application is installed. The trained segmentation network and detection model are installed in the controller.

The sensor (1013) may be any of various types of sensors, such as a camera, a gravity sensor, a wheel speed sensor, a temperature sensor, a humidity sensor, a laser radar, or a millimeter-wave radar. In some cases, a GNSS (Global Navigation Satellite System) device, a SINS (Strap-down Inertial Navigation System), and the like may also be installed in the vehicle (101).

The vehicle (101) photographs the traffic sign (102) while driving. Whether the image is taken from a relatively long distance or from a short distance, the traffic signs in the image are all small targets.

The controller of the vehicle (101) processes the captured original image containing the traffic sign and determines the position of the traffic sign. OCR may additionally be performed to recognize the content of the traffic sign. The content of the traffic sign is then output in the form of voice or text.

The small target detection method provided in the embodiments of the present invention is generally performed by the controller (1011), and the small target detection apparatus is accordingly generally installed in the controller (1011).

The numbers of controllers (1011), networks (1012), and sensors (1013) in FIG. 1 are merely exemplary; the vehicle (101) may have any number of controllers, networks, and sensors according to actual needs.

FIG. 2 is a flowchart of an embodiment of the small target detection method according to the present invention.

The small target detection method includes the following steps.

In step 201, an original image including a small target is acquired.

In this embodiment, the entity that performs the small target detection method (for example, the controller (1011) shown in FIG. 1) may collect an image of the scene ahead through a vehicle-mounted camera, and the collected original image includes a small target. A small target refers to the image of a target object whose length and width in pixels are smaller than a predetermined value (for example, 20).

In step 202, the original image is reduced to a low-resolution image.

In this embodiment, the low-resolution image may be obtained by dividing the length and the width of the original picture by 4 (or another factor). The ratio of length to width is unchanged by the reduction.
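The reduction in step 202 can be sketched as follows. The source only states that the length and width are divided by the same factor; the block-averaging resampling, the function name `downscale`, and the list-of-rows grayscale image are assumptions chosen to keep the sketch self-contained.

```python
def downscale(image, factor=4):
    """Shrink a 2-D grayscale image (list of rows) by an integer factor.

    Each output pixel is the mean of a factor x factor block of input
    pixels. Both dimensions are divided by the same factor, so the
    length-to-width ratio is preserved. This sketch assumes that both
    dimensions are multiples of `factor`.
    """
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be multiples of factor")

    out = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [image[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# An 8 x 16 toy image becomes a 2 x 4 low-resolution image.
original = [[(x + y) % 256 for x in range(16)] for y in range(8)]
low_res = downscale(original, factor=4)
```

With a factor of 4, the low-resolution image has one sixteenth of the pixels of the original, which is what makes the first detection stage inexpensive.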

In step 203, a lightweight segmentation network is used to identify a candidate region containing the small target in the low-resolution image.

In this embodiment, the first detection stage only needs to locate the approximate positions where targets may exist and does not need precise bounding boxes, so a lightweight segmentation network is used, and the points in its final output heat map whose values exceed a certain threshold are regarded as suspect points where a target may exist. A segmentation network similar to U-Net may be used, with ShuffleNet as the backbone network to keep it lightweight.
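The thresholding of the heat map can be sketched as below. The segmentation network itself is not shown; the heat map values, the threshold of 0.5, and the function name `suspect_points` are illustrative assumptions, since the source only speaks of "a certain threshold".

```python
def suspect_points(heat_map, threshold=0.5):
    """Return the (row, col) positions whose score exceeds the threshold.

    heat_map: 2-D list of scores, one per low-resolution pixel, as it
    might be produced by the segmentation network (hypothetical values).
    """
    return [(r, c)
            for r, row in enumerate(heat_map)
            for c, score in enumerate(row)
            if score > threshold]


heat_map = [
    [0.05, 0.10, 0.02, 0.01],
    [0.08, 0.91, 0.77, 0.03],
    [0.02, 0.64, 0.12, 0.04],
]
points = suspect_points(heat_map, threshold=0.5)
```

A lower threshold raises recall in the first stage at the cost of a larger region of interest for the second stage.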

When producing training samples for the segmentation network, the pixels inside the rectangular boxes used for the original detection task are set as positive samples and the pixels outside the rectangular boxes are set as negative samples. Because the image is scaled in the length and width directions, to guarantee the recall rate for small targets, the rectangular box of each target whose length and width are smaller than a predetermined value (for example, 20 pixels) is expanded outward by one times its size when the training samples are produced, and all pixels inside the expanded rectangular box are then set as positive samples.
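The label construction described above can be sketched as follows. The source does not define the expansion precisely; this sketch assumes that the width and height of a small box are doubled about its center and that the result is clipped to the image. The function names and the `(x0, y0, x1, y1)` convention with exclusive end coordinates are also assumptions.

```python
def expand_box(box, min_size, width, height):
    """Expand a small box outward and clip it to the image.

    Assumption: a box whose width and height are both below `min_size`
    is doubled in each dimension about its center.
    """
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    if w < min_size and h < min_size:
        x0, x1 = x0 - w // 2, x1 + (w - w // 2)
        y0, y1 = y0 - h // 2, y1 + (h - h // 2)
    return max(x0, 0), max(y0, 0), min(x1, width), min(y1, height)


def make_mask(width, height, boxes, min_size=20):
    """Build a binary label mask: 1 = positive sample, 0 = negative."""
    mask = [[0] * width for _ in range(height)]
    for box in boxes:
        x0, y0, x1, y1 = expand_box(box, min_size, width, height)
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = 1
    return mask


# One 4 x 4 small target and one 30 x 25 target in a 64 x 64 image.
mask = make_mask(64, 64, [(10, 10, 14, 14), (20, 20, 50, 45)])
positives = sum(map(sum, mask))
```

In this example only the 4 x 4 box is expanded (to 8 x 8); the larger box is used as labeled.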

In step 204, the region of the original image corresponding to the candidate region is taken as a region of interest, and a pre-trained detection model is executed on the region of interest to determine the position of the small target in the original image.

In this embodiment, after noise points in the output of the segmentation network are filtered out, a minimum bounding rectangle enclosing all remaining suspect target points is formed, and the region corresponding to this rectangle in the unscaled high-resolution image is taken as the region of interest. The detection model is then executed on the region of interest. Only part of the high-resolution picture has to be processed, which reduces the amount of computation.
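The mapping from suspect points to a region of interest in the original image can be sketched as below. Noise filtering is omitted because the source does not specify the filter; the function names, the `(row, col)` point format, and the scale factor of 4 are assumptions for illustration.

```python
def roi_in_original(points, factor=4):
    """Minimum bounding rectangle of the suspect points, in original pixels.

    points: (row, col) positions in the low-resolution heat map, assumed
            to be already free of noise points.
    factor: the reduction factor that was used to create the
            low-resolution image.
    Returns (x0, y0, x1, y1) with exclusive end coordinates, or None
    when there are no suspect points.
    """
    if not points:
        return None
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    x0, y0 = min(cols), min(rows)
    x1, y1 = max(cols) + 1, max(rows) + 1
    # Each low-resolution pixel covers a factor x factor block of the
    # original image, so the rectangle is scaled by the same factor.
    return x0 * factor, y0 * factor, x1 * factor, y1 * factor


def crop(image, roi):
    """Cut the region of interest out of a 2-D image (list of rows)."""
    x0, y0, x1, y1 = roi
    return [row[x0:x1] for row in image[y0:y1]]


full_image = [[0] * 16 for _ in range(12)]
roi = roi_in_original([(1, 1), (1, 2), (2, 1)], factor=4)
patch = crop(full_image, roi)
```

Boxes predicted by the detection model inside the cropped patch would be shifted by `(x0, y0)` to express them in the coordinates of the original image.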

As described above, to detect small targets well the picture must be kept at high resolution, but a large picture increases the amount of computation so that real-time processing is not possible in the in-vehicle environment. At the same time, traffic signs occupy a very small proportion of the picture and most of it is background, so computation on the background accounts for a large part of the total, and processing the background at high resolution is time-consuming and pointless. The present invention therefore uses a two-stage detection scheme: a lightweight segmentation network first locates the approximate positions of suspect targets in the low-resolution picture, the minimum bounding rectangle containing all suspect targets is then obtained, and finally the detection model is executed on the high-resolution image block corresponding to that minimum bounding rectangle. This reduces the amount of computation while maintaining the detection rate for small targets.

After the two-stage processing described above, the average computation of the detection model falls to about 25% of the original, and the average combined computation of the two models falls to about 45% of the original.
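The two figures above imply a rough cost breakdown, worked out in the short sketch below. The derived values are simple arithmetic on the stated averages, not additional measurements, and the variable names are illustrative.

```python
# All costs are expressed in percent of the single-stage baseline (= 100).
baseline = 100
detection_two_stage = 25  # detection model run on the region of interest only
total_two_stage = 45      # segmentation network plus detection model

# Share of the baseline cost that the segmentation network adds.
segmentation_overhead = total_two_stage - detection_two_stage

# Overall reduction factor in average computation.
speedup = baseline / total_two_stage
```

Under these figures the segmentation network costs about 20% of the original computation, and the average total computation is reduced by a factor of roughly 2.2.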

FIG. 3 is a schematic diagram of an application scenario of the small target detection method according to the present invention.

In the application scenario of FIG. 3, the vehicle collects images of the scene ahead in real time while driving. The length and the width of the acquired original image are divided by 4 to reduce it to a low-resolution image. The low-resolution image is input to the lightweight segmentation network to identify a candidate region containing a traffic sign. The region of the original image corresponding to the candidate region is then located and taken as the region of interest. The image of the region of interest is cropped out and input to the pre-trained detection model, which determines the specific position of the traffic sign in the original image, shown as a dotted box.

The method provided in the above embodiment of the present invention performs detection in two stages, which reduces the amount of computation and improves identification speed and accuracy.

도 4는 본 발명에 따른 스몰 타깃 검출 방법의 다른 실시예의 흐름도이다.4 is a flowchart of another embodiment of a small target detection method according to the present invention.

The process of the small target detection method includes the following steps.

In step 401, the network structure of an initial detection model is determined, and the network parameters of the initial detection model are initialized.

In this embodiment, the electronic device that performs the small target detection method (for example, the controller 1011 shown in FIG. 1) may train the detection model. Alternatively, the detection model may be trained by a third-party server and then installed in the controller of the vehicle. The detection model is a neural network model and may be any existing neural network for target detection.

In some optional implementations of this embodiment, the detection model is a deep neural network, such as a YOLO-series network. YOLO (You Only Look Once) is an object recognition and localization algorithm based on a deep neural network; its most notable feature is that it runs very fast, so it can be used in real-time systems. YOLO has evolved to version 3 (YOLO3), with each new version building on and improving the previous one. In the original structural design of YOLO3, low-resolution feature maps are fused with high-resolution feature maps through upsampling. However, this fusion takes place only on the high-resolution feature maps, so features of different scales are not sufficiently fused.

To better fuse features from different layers, the present invention first selects the features downsampled by factors of 8, 16, and 32 in the backbone network as base features. Then, to predict targets of different sizes, the sizes of the prediction feature maps are set to the picture size downsampled by factors of 8, 16, and 32, respectively. The features of each prediction feature map are all obtained from the three base feature layers, which are brought to the same size by downsampling or upsampling and then fused. Taking the prediction layer at 16x downsampling as an example, its features come from all three base feature layers: to bring them to a common size, the 8x-downsampled base feature layer is downsampled by one step, the 32x-downsampled base feature layer is upsampled by one step, and the two resulting feature layers are then fused with the 16x-downsampled base feature layer.
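The size bookkeeping for the 16x prediction layer can be sketched as follows. The source does not say how the resizing or the fusion is carried out, so this sketch assumes 2x2 average pooling for the downsampling step, nearest-neighbour repetition for the upsampling step, and channel concatenation for the fusion; the function names are hypothetical.

```python
def downsample2(channel):
    """Halve a 2-D map with 2x2 average pooling (an assumed choice)."""
    return [[(channel[y][x] + channel[y][x + 1]
              + channel[y + 1][x] + channel[y + 1][x + 1]) / 4.0
             for x in range(0, len(channel[0]) - 1, 2)]
            for y in range(0, len(channel) - 1, 2)]


def upsample2(channel):
    """Double a 2-D map by repeating each value (nearest neighbour)."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_for_stride16(f8, f16, f32):
    """Each argument is a list of channels taken at stride 8, 16, or 32."""
    from_8 = [downsample2(c) for c in f8]     # stride 8  -> stride 16
    from_32 = [upsample2(c) for c in f32]     # stride 32 -> stride 16
    return from_8 + f16 + from_32             # assumed fusion: concatenation


# For a 64x64 input, the base feature maps are 8x8, 4x4, and 2x2.
f8 = [[[1.0] * 8 for _ in range(8)]]
f16 = [[[2.0] * 4 for _ in range(4)]]
f32 = [[[3.0] * 2 for _ in range(2)]]
fused = fuse_for_stride16(f8, f16, f32)
```

The same pattern, with two resizing steps instead of one where needed, would apply to the 8x and 32x prediction layers.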

If features of different scales are merely fused, the features carry the same weight in all three prediction layers and cannot be used selectively according to each layer's prediction targets. Therefore, after the features of each prediction layer are fused, an attention module is introduced to learn a suitable weight for the features of the different channels, so that each prediction layer can emphasize the fused features according to the characteristics of the targets it needs to predict. The network structure is shown in FIG. 5. The way the parameters of the attention module are learned is prior art and is not described further here.
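The source does not specify the design of the attention module. As one possible illustration of per-channel weighting, the sketch below uses a simplified gate in the spirit of squeeze-and-excitation: each channel is summarized by its mean, passed through a sigmoid with a learned scale and bias, and multiplied by the resulting weight. The function name, the `weights` and `biases` parameters, and the gate itself are assumptions made for illustration.

```python
import math


def channel_attention(channels, weights, biases):
    """Scale each channel by a gate in (0, 1) derived from its own mean."""
    out = []
    for channel, w, b in zip(channels, weights, biases):
        count = len(channel) * len(channel[0])
        mean = sum(sum(row) for row in channel) / count     # summarize channel
        gate = 1.0 / (1.0 + math.exp(-(w * mean + b)))      # sigmoid weight
        out.append([[v * gate for v in row] for row in channel])
    return out


# With zero scale and bias every gate is exactly 0.5.
channels = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
halved = channel_attention(channels, [0.0, 0.0], [0.0, 0.0])
```

In training, `weights` and `biases` would be learned together with the rest of the network, so that each prediction layer ends up emphasizing the channels most useful for its own target sizes.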

The present invention may use YOLO3 as the detection network. In such anchor-based detection methods, the design and assignment of anchors are very important; because very few anchors can be matched to small targets, the model learns too little about small targets and cannot detect them well. To address this, a dynamic anchor matching mechanism is used: the IoU (intersection over union) threshold applied when matching anchors to the ground truth is selected adaptively according to the size of the ground truth. When the target is relatively small, the IoU threshold is lowered so that more small targets take part in training, which improves the model's performance in small target detection. Because the target sizes are already known when the training samples are prepared, a suitable IoU threshold is selected according to the target size.
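A minimal sketch of size-dependent matching is shown below. The source gives no concrete numbers, so the size cut-off (`small_side`) and the two thresholds (`small_thr`, `default_thr`) are hypothetical values chosen only to make the effect visible, as are the function names.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def iou_threshold(gt, small_side=32, small_thr=0.3, default_thr=0.5):
    """Pick a lower threshold when the ground-truth box is small."""
    width, height = gt[2] - gt[0], gt[3] - gt[1]
    return small_thr if max(width, height) < small_side else default_thr


def match_anchors(gt, anchors):
    """Indices of anchors whose IoU with `gt` reaches the adaptive threshold."""
    threshold = iou_threshold(gt)
    return [i for i, anchor in enumerate(anchors) if iou(anchor, gt) >= threshold]


# A 10x10 target against a 16x16 anchor has IoU 100/256 (about 0.39):
# rejected at a fixed 0.5 threshold, accepted at the lowered one.
small_matches = match_anchors((0, 0, 10, 10), [(0, 0, 16, 16), (100, 100, 116, 116)])
large_matches = match_anchors((0, 0, 64, 64), [(0, 0, 40, 40), (0, 0, 64, 40)])
```

A smoother schedule, for example a threshold that varies continuously with box area, would fit the description equally well; the two-level rule is used here only for brevity.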

In step 402, a training sample set is obtained.

In this embodiment, a training sample includes a sample image and labeling information indicating the position of the small target in the sample image.

In step 403, the training samples are augmented by at least one of replication, multi-scale transformation, and editing.

In this embodiment, this strategy is mainly used when the training data contains too few small targets. On the one hand, pictures in the data set that contain small targets are replicated multiple times, directly increasing the number of small targets in the data. On the other hand, small targets in a picture are cut out, subjected to operations such as zooming and rotation, and then pasted at random positions elsewhere in the image. This not only increases the number of small targets but also introduces more variation, enriching the distribution of the training data.
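The cut, transform, and paste operation can be sketched as follows. This is an illustration under simplifying assumptions, not the augmentation code of the invention: images are nested lists, zooming uses an integer factor with pixel repetition, and rotation is limited to quarter turns. All function names are hypothetical. The function also returns the pasted box, since the new target needs its own label.

```python
import random


def crop(image, box):
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def zoom(patch, factor):
    """Enlarge by an integer factor, repeating pixels (nearest neighbour)."""
    return [[v for v in row for _ in range(factor)]
            for row in patch for _ in range(factor)]


def rotate90(patch):
    """Rotate the patch clockwise by a quarter turn."""
    return [list(column) for column in zip(*patch[::-1])]


def paste_randomly(image, patch, rng):
    """Paste `patch` at a random position; return the new image and its box."""
    h, w = len(patch), len(patch[0])
    y = rng.randrange(len(image) - h + 1)
    x = rng.randrange(len(image[0]) - w + 1)
    out = [list(row) for row in image]        # leave the input untouched
    for dy in range(h):
        out[y + dy][x:x + w] = patch[dy]
    return out, (x, y, x + w, y + h)


def augment(image, box, rng, factor=2, quarter_turns=1):
    patch = zoom(crop(image, box), factor)
    for _ in range(quarter_turns):
        patch = rotate90(patch)
    return paste_randomly(image, patch, rng)


rng = random.Random(0)
image = [[0] * 8 for _ in range(8)]
image[1][1] = image[1][2] = 7                 # a target 2 pixels wide, 1 high
new_image, new_box = augment(image, (1, 1, 3, 2), rng)
```

Because the paste position is unconstrained, the copy may overlap an existing target; a practical pipeline might reject such positions, which the source does not discuss.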

Optionally, the training pictures are zoomed to different scales before training, which enriches the variation in target scale in the existing data set and allows the model to adapt to detecting targets of different scales.

In step 404, the sample images and labeling information of the training samples in the augmented training sample set are used as the input and expected output of the initial detection model, respectively, and the initial detection model is trained using a machine learning method.

In this embodiment, the executing entity may input a sample image of a training sample in the training sample set into the initial detection model to obtain position information of the small target in the sample image, take the labeling information of the training sample as the expected output of the initial detection model, and train the initial detection model using a machine learning method. Specifically, a preset loss function is first used to calculate the difference between the obtained position information and the labeling information of the training sample; for example, an L2 function may be used as the loss function. Next, the network parameters of the initial detection model are adjusted based on the calculated difference, and training may be ended when a preset training end condition is satisfied. For example, the preset training end condition may include, but is not limited to, one of the following: the training time exceeds a preset duration, the number of training iterations exceeds a preset number, and the calculated difference is smaller than a preset difference threshold.

Here, the network parameters of the initial detection model may be adjusted in various ways based on the difference between the generated position information and the labeling information of the training sample. For example, a BP (back propagation) algorithm or an SGD (stochastic gradient descent) algorithm may be used to adjust the network parameters of the initial detection model.
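The loop structure described in step 404, with an L2 loss and the three end conditions, can be illustrated on a toy problem. The sketch below fits a two-parameter linear model with full-batch gradient descent rather than training a detection network; the model, the learning rate, and all limits are assumptions chosen only so that the example runs quickly.

```python
import time


def l2_loss(predictions, labels):
    """Mean squared difference between predictions and labels."""
    return sum((p - t) ** 2 for p, t in zip(predictions, labels)) / len(labels)


def train(samples, labels, lr=0.1, max_seconds=5.0, max_steps=10_000,
          loss_threshold=1e-6):
    w, b = 0.0, 0.0                            # initialized "network" parameters
    start = time.monotonic()
    step = 0
    while True:
        predictions = [w * x + b for x in samples]
        loss = l2_loss(predictions, labels)
        # The three end conditions named in the text.
        if loss < loss_threshold:
            reason = "loss"
            break
        if step >= max_steps:
            reason = "steps"
            break
        if time.monotonic() - start > max_seconds:
            reason = "time"
            break
        # Gradient of the L2 loss, then one parameter update.
        n = len(samples)
        grad_w = sum(2 * (p - t) * x
                     for p, t, x in zip(predictions, labels, samples)) / n
        grad_b = sum(2 * (p - t) for p, t in zip(predictions, labels)) / n
        w -= lr * grad_w
        b -= lr * grad_b
        step += 1
    return w, b, loss, reason


# Labels follow y = 2x + 1, so training should recover w ~ 2 and b ~ 1.
w, b, final_loss, reason = train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

For a real detection network the gradients would come from back propagation through the network and the updates would typically use mini-batches, but the stopping logic would have the same shape.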

In step 405, the initial detection model obtained by training is determined as the pre-trained detection model.

In this embodiment, the entity executing the training step may determine the initial detection model trained in step 404 as the pre-trained detection model.

FIG. 6 is a schematic structural diagram of an embodiment of a small target detection apparatus according to the present invention.

Referring to FIG. 6, the embodiment of the apparatus 600 corresponds to the method embodiment shown in FIG. 2, and the apparatus 600 may be applied to various electronic devices.

As shown in FIG. 6, the small target detection apparatus 600 of this embodiment includes an acquisition unit 601, a reduction unit 602, a first detection unit 603, and a second detection unit 604. The acquisition unit 601 acquires an original image including a small target; the reduction unit 602 reduces the original image to a low-resolution image; the first detection unit 603 uses a lightweight segmentation network to identify a candidate region containing the small target in the low-resolution image; and the second detection unit 604 takes the region of the original image corresponding to the candidate region as a region of interest and executes a pre-trained detection model in the region of interest to determine the position of the small target in the original image.

In this embodiment, for the specific processing of the acquisition unit 601, the reduction unit 602, the first detection unit 603, and the second detection unit 604 of the small target detection apparatus 600, reference may be made to steps 201, 202, 203, and 204 in the embodiment corresponding to FIG. 2.

In some optional implementations of this embodiment, the apparatus 600 further includes a training unit (not shown) configured to: determine the network structure of an initial detection model and initialize the network parameters of the initial detection model; obtain a training sample set, where a training sample includes a sample image and labeling information indicating the position of the small target in the sample image; augment the training samples by at least one of replication, multi-scale transformation, and editing; use the sample images and labeling information of the training samples in the augmented training sample set as the input and expected output of the initial detection model, respectively, and train the initial detection model using a machine learning method; and determine the initial detection model obtained by training as the pre-trained detection model.

In some optional implementations of this embodiment, the training unit (not shown) is further configured to: cut a small target out of the sample image; and, after zooming and/or rotating the small target, paste it at a random position elsewhere in the sample image to obtain a new sample image.

In some optional implementations of this embodiment, when preparing training samples for the segmentation network, the first detection unit 603 sets the pixels inside the rectangular boxes originally used for the detection task as positive samples and the pixels outside the rectangular boxes as negative samples; expands outward the rectangular box of any small target whose length and width are smaller than a predetermined number of pixels; and sets all pixels inside the expanded rectangular box as positive samples.
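Building such a label mask from detection boxes can be sketched as follows. The source does not give the size limit or the amount of expansion, so `min_side` and `margin` are hypothetical parameters, and the function names are illustrative.

```python
def expand_if_small(box, min_side, margin, width, height):
    """Grow a box by `margin` on every side when both sides are under `min_side`."""
    x0, y0, x1, y1 = box
    if (x1 - x0) < min_side and (y1 - y0) < min_side:
        x0, y0 = max(0, x0 - margin), max(0, y0 - margin)
        x1, y1 = min(width, x1 + margin), min(height, y1 + margin)
    return x0, y0, x1, y1


def make_mask(width, height, boxes, min_side=8, margin=2):
    """1 marks a positive pixel (inside a box), 0 a negative pixel."""
    mask = [[0] * width for _ in range(height)]
    for box in boxes:
        x0, y0, x1, y1 = expand_if_small(box, min_side, margin, width, height)
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = 1
    return mask


# A 2x2 target at (5, 5) grows to the 6x6 region (3, 3)-(9, 9).
mask = make_mask(16, 16, [(5, 5, 7, 7)])
```

Expanding only the small boxes gives the segmentation network more positive pixels to learn from for exactly the targets that would otherwise contribute just a handful of pixels at low resolution.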

In some optional implementations of this embodiment, the detection model is a deep neural network.

In some optional implementations of this embodiment, the detection model introduces an attention module after fusing the features of each prediction layer, to learn a suitable weight for the features of the different channels.

FIG. 7 is a schematic structural diagram of a computer system of an electronic device suitable for implementing an embodiment of the present invention.

The electronic device 700 shown in FIG. 7 (for example, the controller 1011 of FIG. 1) is only an example and does not limit the functions or scope of use of the embodiments of the present invention in any way.

As shown in FIG. 7, the electronic device 700 includes a processing device 701 (for example, a central processing unit or a graphics processing unit) that can perform various appropriate operations and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data necessary for the operation of the electronic device 700. The processing device 701, the ROM 702, and the RAM 703 are connected to one another via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.

In general, the following may be connected to the I/O interface 705: an input device 706 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, or a gyroscope; an output device 707 including, for example, a liquid crystal display (LCD), a speaker, or a vibrator; a storage device 708 including, for example, a magnetic tape or a hard drive; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate with other devices wirelessly or by wire to exchange data. Although FIG. 7 shows the electronic device 700 with various devices, it should be understood that not all of the illustrated devices need to be implemented or provided; the electronic device 700 may alternatively implement or include more or fewer devices. Each block shown in FIG. 7 may represent one device or, as needed, multiple devices.

In particular, according to an embodiment of the present invention, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present invention includes a computer program product including a computer program carried on a computer-readable medium, the computer program including program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 709, installed from the storage device 708, or installed from the ROM 702. When the computer program is executed by the processing device 701, the functions defined in the method of the present invention are performed.

It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium include, but are not limited to, an electrical connection having one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in combination with, an instruction execution system, apparatus, or device.

In the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by, or in combination with, an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to electrical wire, optical fiber cable, RF (radio frequency), or any suitable combination thereof.

The computer-readable medium may be included in the electronic device, or it may exist separately without being assembled into the electronic device. The computer-readable medium stores at least one program which, when executed by the electronic device, causes the electronic device to: acquire an original image including a small target; reduce the original image to a low-resolution image; identify a candidate region containing the small target in the low-resolution image using a lightweight segmentation network; and take the region of the original image corresponding to the candidate region as a region of interest and execute a pre-trained detection model in the region of interest to determine the position of the small target in the original image.

Computer program code for performing the operations of the present invention may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block of a flowchart or block diagram may represent a module, a program segment, or a portion of code that includes one or more executable instructions for implementing a specified logical function. It should be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present invention may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including an acquisition unit, a reduction unit, a first detection unit, and a second detection unit. The names of these units do not, in some cases, limit the units themselves; for example, the acquisition unit may also be described as "a unit that receives a user webpage browsing request".

The above description is merely a description of the preferred embodiments of the present invention and of the technical principles applied. Those skilled in the art should understand that the scope of the invention referred to herein is not limited to technical solutions formed by the specific combination of the above technical features, and also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the spirit of the present invention, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present invention.

Claims

A small target detection method, comprising:
obtaining an original image including a small target;
reducing the original image to a low-resolution image;
identifying a candidate region containing the small target in the low-resolution image using a lightweight segmentation network; and
determining a region of the original image corresponding to the candidate region as a region of interest, and executing a pre-trained detection model in the region of interest to determine a position of the small target in the original image,
wherein training samples of the segmentation network are produced by expanding outward the rectangular box of a small target whose length and width are smaller than a predetermined number of pixels, setting pixels inside the expanded rectangular box as positive samples, and setting pixels outside the expanded rectangular box as negative samples.

The method according to claim 1,
wherein the detection model is trained by:
determining a network structure of an initial detection model and initializing network parameters of the initial detection model;
obtaining a training sample set, a training sample including a sample image and labeling information indicating a position of a small target in the sample image;
augmenting the training samples by at least one of replication, multi-scale transformation, and editing;
using sample images and labeling information of training samples in the augmented training sample set as an input and an expected output of the initial detection model, respectively, and training the initial detection model using a machine learning method; and
determining the initial detection model obtained by training as the pre-trained detection model.

The method according to claim 2,
wherein the training samples are edited by:
cutting a small target out of the sample image; and
performing at least one of zoom and rotation operations on the small target, and then pasting it at a random position elsewhere in the sample image to obtain a new sample image.

(Deleted)

The method according to any one of claims 1 to 3,
wherein the detection model is a deep neural network.

The method according to claim 5,
wherein the detection model introduces an attention module after fusing the features of each prediction layer, to learn a suitable weight for the features of different channels.

A small target detection apparatus, comprising:
an acquisition unit configured to acquire an original image including a small target;
a reduction unit configured to reduce the original image to a low-resolution image;
a first detection unit configured to identify a candidate region containing the small target in the low-resolution image using a lightweight segmentation network; and
a second detection unit configured to determine a region of the original image corresponding to the candidate region as a region of interest, and to execute a pre-trained detection model in the region of interest to determine a position of the small target in the original image,
wherein training samples of the segmentation network are produced by expanding outward the rectangular box of a small target whose length and width are smaller than a predetermined number of pixels, setting pixels inside the expanded rectangular box as positive samples, and setting pixels outside the expanded rectangular box as negative samples.

The apparatus according to claim 7,
further comprising a training unit, wherein the training unit is configured to:
determine a network structure of an initial detection model and initialize network parameters of the initial detection model;
obtain a training sample set, a training sample including a sample image and labeling information indicating a position of a small target in the sample image;
augment the training samples by at least one of replication, multi-scale transformation, and editing;
use sample images and labeling information of training samples in the augmented training sample set as an input and an expected output of the initial detection model, respectively, and train the initial detection model using a machine learning device; and
determine the initial detection model obtained by training as the pre-trained detection model.

The apparatus according to claim 8,
wherein the training unit is further configured to:
cut a small target out of the sample image; and
perform at least one of zoom and rotation operations on the small target, and then paste it at a random position elsewhere in the sample image to obtain a new sample image.

(Deleted)

The apparatus according to any one of claims 7 to 9,
wherein the detection model is a deep neural network.

The apparatus according to claim 11,
wherein the detection model introduces an attention module after fusing the features of each prediction layer, to learn a suitable weight for the features of different channels.

An electronic device, comprising:
at least one processor; and
a storage device in which at least one program is stored,
wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the method according to claim 1.

A computer-readable storage medium in which a computer program is stored,
wherein the method according to claim 1 is implemented when the program is executed by a processor.

A computer program stored on a computer-readable storage medium,
wherein the method according to claim 1 is implemented when the computer program is executed by a processor.