US20230368033A1 - Information processing device, control method, and program

Info

Publication number: US20230368033A1
Application number: US18/227,699
Authority: US
Inventors: Hiroyoshi Miyano; Tetsuaki Suzuki
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2018-06-01
Filing date: 2023-07-28
Publication date: 2023-11-16
Also published as: WO2019229979A1; US20210209396A1; JPWO2019229979A1; JP7006782B2

Abstract

An information processing apparatus (2000) generates likelihood data for each of a plurality of partial regions (12) in image data (10). Each piece of likelihood data is associated with a position and a size on the image data (10) and indicates a likelihood that a target object exists in an image region at that position with that size. The information processing apparatus (2000) computes a distribution (probability hypothesis density: PHD) of an existence likelihood of a target object with respect to a position and a size by computing the total sum of the pieces of likelihood data generated for the respective partial regions (12). The information processing apparatus (2000) extracts, from the PHD, partial distributions each of which relates to one target object. For each extracted partial distribution, the information processing apparatus (2000) outputs a position and a size of the target object represented by the partial distribution, based on a statistic of the partial distribution.
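The flow summarized above can be illustrated with a minimal sketch. It is not the implementation of the apparatus (2000). It assumes that each piece of likelihood data is a weighted Gaussian over (x, y, width, height) with a diagonal covariance, that the number of target objects is the rounded integral of the PHD, that each partial distribution is extracted so that its integral value is 1 (as recited in the claims), and that the output statistic is the weighted mean. The names `Component`, `phd_value`, `expected_count`, and `extract_objects`, as well as the greedy nearest-component grouping, are hypothetical choices made only for this illustration.

```python
from dataclasses import dataclass
import math


@dataclass
class Component:
    """One piece of likelihood data: weight * Gaussian over (x, y, w, h)."""
    weight: float  # likelihood that a target object exists in the partial region
    mean: tuple    # (x, y, width, height)
    var: tuple     # per-dimension variance (diagonal covariance is an assumption)


def phd_value(components, z):
    """Evaluate the PHD at z = (x, y, w, h) as the total sum of all components."""
    total = 0.0
    for c in components:
        density = 1.0
        for zi, mi, vi in zip(z, c.mean, c.var):
            density *= math.exp(-0.5 * (zi - mi) ** 2 / vi) / math.sqrt(2 * math.pi * vi)
        total += c.weight * density
    return total


def expected_count(components):
    """Each Gaussian integrates to 1, so the PHD integrates to the sum of weights."""
    return sum(c.weight for c in components)


def extract_objects(components):
    """Greedy sketch (an assumption): carve out partial distributions of mass 1.

    Each partial distribution is seeded at the heaviest remaining component and
    absorbs the nearest remaining weight until its integral value reaches 1.
    The weighted mean of the absorbed components is returned as (x, y, w, h).
    """
    n = round(expected_count(components))
    remaining = [[c.weight, c.mean] for c in components]
    results = []
    for _ in range(n):
        remaining = [r for r in remaining if r[0] > 1e-9]
        if not remaining:
            break
        seed = max(remaining, key=lambda r: r[0])[1]
        remaining.sort(key=lambda r: math.dist(r[1], seed))
        mass = 0.0
        acc = [0.0, 0.0, 0.0, 0.0]
        for r in remaining:
            take = min(r[0], 1.0 - mass)  # never let the partial mass exceed 1
            if take <= 0:
                break
            for k in range(4):
                acc[k] += take * r[1][k]
            r[0] -= take
            mass += take
        results.append(tuple(a / mass for a in acc))
    return results


# Hypothetical likelihood data for four partial regions covering two objects.
regions = [
    Component(0.6, (10, 10, 4, 4), (4, 4, 1, 1)),
    Component(0.4, (11, 10, 4, 4), (4, 4, 1, 1)),
    Component(0.7, (50, 40, 8, 6), (4, 4, 1, 1)),
    Component(0.3, (50, 42, 8, 6), (4, 4, 1, 1)),
]
detections = extract_objects(regions)
for x, y, w, h in detections:
    print(f"position=({x:.1f}, {y:.1f}) size=({w:.1f}, {h:.1f})")
```

In this toy input the weights sum to 2, so two partial distributions are extracted, one per cluster of partial regions. A real extraction step would need to handle overlapping objects and non-integer totals more carefully than this greedy grouping does.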

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation application of U.S. patent application Ser. No. 17/059,678 filed on Nov. 30, 2020, which is a National Stage Entry of PCT/JP2018/021207 filed on Jun. 1, 2018, the contents of all of which are incorporated herein by reference, in their entirety.

TECHNICAL FIELD

The present invention relates to a technology for detecting an object from an image.

BACKGROUND ART

Technologies for detecting an object from image data have been developed. For example, Patent Document 1 discloses a technology that performs object detection by use of a deep neural network. The system in Patent Document 1 generates a feature map of image data by use of a convolutional neural network and, by inputting the generated feature map to a neural network called a region proposal network (RPN), outputs many proposals of rectangular regions (region proposals), each of which includes an object. The system further estimates a class of an object included in a region proposal by performing classification in a layer called a box-classification layer. The system also adjusts a position and a size of a region proposal by performing regression in a layer called a box-regression convolutional layer.

Further, the system in Non Patent Document 1 generates a plurality of feature maps by use of a convolutional neural network and outputs many object proposals from each feature map. Each object proposal includes rectangular coordinates and a likelihood of an object class.

In both the technique in Patent Document 1 and the technique in Non Patent Document 1, the aforementioned outputs include many erroneous proposals that are not correct answers. Therefore, the detection result to be finally output is selected from the many object proposals by processing called non-maximum suppression, which reduces neighboring and significantly overlapping region proposals.
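For reference, the non-maximum suppression mentioned above is commonly realized as a greedy procedure. The following is a minimal sketch of that common form, not the specific processing used in Patent Document 1 or Non Patent Document 1. The box format (x1, y1, x2, y2), the function names `iou` and `non_maximum_suppression`, and the overlap threshold of 0.5 are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(proposals, iou_threshold=0.5):
    """Keep the highest-scoring proposals and drop those overlapping a kept one.

    proposals: list of (score, box) pairs. The threshold value is an assumption.
    """
    kept = []
    for score, box in sorted(proposals, key=lambda p: p[0], reverse=True):
        # A proposal survives only if it does not overlap any kept box too much.
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept


# Hypothetical proposals: the first two overlap heavily, the third is separate.
proposals = [
    (0.9, (10, 10, 50, 50)),
    (0.8, (12, 12, 52, 52)),
    (0.7, (100, 100, 140, 140)),
]
kept = non_maximum_suppression(proposals)
print(kept)
```

Because the decision depends on a fixed overlap threshold, two genuinely distinct objects whose boxes overlap strongly can be merged into one detection, while a loose threshold leaves duplicates.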

Claims

1. An information processing apparatus comprising:

at least one memory configured to store instructions; and

at least one processor configured to execute the instructions to perform:

training a neural network by use of one or more combinations of prepared learning image data and an ideal PHD for each of mutually different types of target objects;

acquiring image data;

generating likelihood data for each of a plurality of partial regions included in the image data by inputting the acquired image data to the trained neural network;

computing a distribution of a likelihood of existence of the target objects with respect to a position and a size by computing a total sum of the likelihood data, and extracting, from the computed distribution, one or more partial distributions each of which relates to one target object; and

outputting, for each of the one or more partial distributions, a position and a size of the one target object relating to the partial distribution, based on a statistic of the partial distribution.

2. The information processing apparatus according to claim 1, wherein

the likelihood data is represented by a distribution conforming to a predetermined model, and

for each of the partial regions, the trained neural network outputs a likelihood that a target object exists in the partial region and a parameter value of the predetermined model.

3. The information processing apparatus according to claim 1, wherein

the at least one processor is configured to execute the instructions to perform:

computing a number of target objects included in the image data, based on an integral value of the distribution represented by the total sum of the likelihood data, and

extracting, from the distribution represented by the total sum of the likelihood data, as many of the partial distributions as the computed number.

4. The information processing apparatus according to claim 1, wherein

extracting, from the distribution represented by the total sum of the likelihood data, the partial distributions each of which has an integral value of 1.

5. The information processing apparatus according to claim 1, wherein

generating the likelihood data for each of mutually different types of the target objects;

computing, for each of mutually different types of the target objects, a distribution of a likelihood of existence of the target objects and extracting the partial distribution from the distribution; and

outputting a position and a size of a target object relating to each partial distribution along with a type of the target objects relating to the partial distribution.

6. A control method executed by at least one computer, the control method comprising:

acquiring image data;

7. The control method according to claim 6, wherein,

the control method comprises:

8. The control method according to claim 6, wherein

the control method comprises:

computing a number of target objects included in the image data, based on an integral value of the distribution represented by the total sum of the likelihood data; and

extracting, from a distribution represented by the total sum of the likelihood data, as many of the partial distributions as the computed number.

9. The control method according to claim 6, wherein

the control method comprises:

extracting, from a distribution represented by the total sum of the likelihood data, the partial distributions each of which has an integral value of 1.

10. The control method according to claim 6, wherein

the control method comprises:

11. A non-transitory recording medium storing a program causing at least one computer to execute:

acquiring image data;