WO2021117363A1 - Object detection method and object detection device - Google Patents
Object detection method and object detection device
- Publication number
- WO2021117363A1 (PCT/JP2020/040222)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- end point
- image
- detection method
- area
- points
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/98—Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Definitions
- The present disclosure relates to an object detection method and an object detection device that detect a predetermined object from an image.
- Object detection technology, which detects objects such as people and vehicles in images captured by cameras, is a basic technology for applications such as surveillance camera systems and in-vehicle camera systems.
- Deep learning has been used as an object detection technique. Examples of object detection methods based on deep learning include ExtremeNet (see Non-Patent Document 1) and YOLO (see Non-Patent Document 2).
- In Non-Patent Document 1, a trained neural network is used to detect four end points related to the boundary of an object on an image (the points having the minimum value on the X-axis, the maximum value on the X-axis, the minimum value on the Y-axis, and the maximum value on the Y-axis). The accuracy of detecting the position of the object is then improved by determining the rectangular area surrounding the object (BB: Bounding Box) from these four end points.
- In conventional neural networks, "detection", which specifies the position of the area containing an object in the image, and "identification", which determines whether the detected object corresponds to an object class to be detected, are performed separately. Non-Patent Document 2 realizes high-speed object detection by performing both at the same time while evaluating the entire image only once.
- However, Non-Patent Document 1 calculates a likelihood indicating the probability of being an end point for each pixel at the same resolution as the input image, and this per-pixel calculation takes time.
- Non-Patent Document 2 does not calculate the positions of feature points such as the end points of an object, so although it is fast, it may not detect the position of the object with sufficient accuracy.
- The present disclosure has been made in view of the above problems, and its object is to provide an object detection method and an object detection device capable of high-speed and highly accurate object detection.
- An object detection method according to one aspect of the present disclosure is a method for detecting a predetermined object from an image, and is characterized by having an end point estimation step of estimating an end point region including a feature point that satisfies a criterion regarding the boundary of the object on the image.
- The method may further include an area estimation step of estimating an object area including the object, and a mapping step of associating the object area with the end point area, so that the feature point included in the end point area is associated as a feature point of the object in the object area.
- A determination step of determining the object class to which the object included in the object area corresponds may be further provided.
- A correction step of correcting the position and size of the object area according to the associated end point area may be further provided.
- When a plurality of object areas are estimated, a removal step of removing some of the object areas based on the degree of overlap of the plurality of object areas may be further provided.
- The feature point may be, among the points on the boundary of the object on the image, a point having a maximum value or a minimum value on a coordinate axis of a two-dimensional Cartesian coordinate system.
- The feature point may also be, among the points on the boundary of the object on the image, a set of a point having a maximum value or a minimum value on a coordinate axis of a first coordinate system and a point having a maximum value or a minimum value on a coordinate axis of a second coordinate system.
- The area estimation step and the end point estimation step may be executed in parallel by a learning model that has undergone machine learning for detecting the object.
- The area estimation step, the end point estimation step, and the determination step may be executed in parallel by a learning model that has undergone machine learning for detecting the object.
- The learning model may be a convolutional neural network, and the parameters of the convolutional neural network may be determined by machine learning based on a learning image including the object to be detected, the true value of the position of the object to be detected in the learning image, and the true value of the position of the feature point that satisfies the criterion regarding the boundary of the object to be detected in the learning image.
- An object detection device according to one aspect of the present disclosure is a device that detects a predetermined object from an image, and is characterized by including a learning model that has undergone machine learning for detecting the object and that executes end point estimation processing of estimating an end point region including a feature point that satisfies a criterion regarding the boundary of the object on the image.
- According to the present disclosure, since the feature points related to the boundary of the object are estimated as a region, high-cost calculations such as a per-pixel likelihood are unnecessary, and the feature points (end points) related to the boundary of the object can be detected at high speed. Further, since a region including an end point of the object is estimated rather than a region including the entire object, the boundary of the object can be detected with high accuracy.
- FIG. 1 is a block diagram showing the schematic configuration of the object detection device 1 according to Embodiment 1.
- FIG. 2 is a diagram showing an example of an image captured by the camera 10, which is the input of the trained AI model 20.
- FIG. 3 is a diagram showing the captured image divided into W × H grid cells.
- FIG. 4 is a diagram showing the data structure of the object estimation data output by the trained AI model 20.
- FIG. 5 is a diagram showing the position and size of the object BB in the object estimation data.
- FIG. 6 is a diagram showing an example of the result of the classification performed for each grid cell.
- FIG. 7 is a diagram for explaining IoU, an index of the degree of overlap of two regions.
- FIG. 9A is a diagram showing an example of the object BB and each end point BB remaining after the processing of the duplicate BB removal unit 30.
- FIG. 9B is a diagram showing an example of a first end point BB, a second end point BB, a third end point BB, and a fourth end point BB associated with the object BB.
- FIG. 10A is a diagram illustrating the correspondence between the object BB and the end point BBs.
- FIG. 10B is a diagram showing the object BB after shaping.
- FIG. 11 is a diagram showing an example in which the position and size of the object BB of the object detection result, the positions of the four associated end point BBs, and the classification result of the corresponding grid cell are superimposed on the input image.
- FIG. 12 is a flowchart showing the operation of the object detection device 1.
- FIG. 1 is a block diagram showing the configuration of the object detection device 1.
- The object detection device 1 includes a camera 10, a trained AI (Artificial Intelligence) model 20, a duplicate BB removal unit 30, an association unit 40, and an object detection result storage unit 50.
- The camera 10 includes an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge-Coupled Device) image sensor, converts the light focused on the image sensor into an electric signal by photoelectric conversion, and outputs an image of a predetermined size. If the size of the output image of the camera 10 differs from the size of the input image of the trained AI model 20, the output image of the camera 10 may be resized.
- The trained AI model 20 is a convolutional neural network that has undergone machine learning with a teacher signal to detect a predetermined object, and it outputs object estimation data from an input image of a predetermined size by evaluating the entire image once.
- The object estimation data includes a BB surrounding the object to be detected on the input image (object BB), BBs each including a feature point (end point) related to the boundary of the object to be detected on the input image (end point BBs), and class probabilities indicating which of the object classes to be detected the object surrounded by the object BB corresponds to. The details of the teacher signal used in learning and of the output object estimation data are described later.
- From the object estimation data output by the trained AI model 20, the duplicate BB removal unit 30 removes object BBs whose reliability score is lower than a threshold value and object BBs having a high degree of overlap with an object BB that has a higher reliability score. Similarly, the duplicate BB removal unit 30 removes end point BBs whose reliability score is lower than the threshold value and end point BBs having a high degree of overlap with an end point BB that has a higher reliability score. The reliability score is calculated using the reliability and class probability of the object BB and the end point BBs included in the object estimation data.
- The association unit 40 associates each object BB that remains without being removed with end point BBs, and shapes the object BB according to the associated end point BBs, that is, corrects the position and size of the object BB.
- The object detection result storage unit 50 stores, as the detection result, the position and size of the shaped object BB and the class determination value based on the class probability of the object BB.
- Each of the processing units (the trained AI model 20, the duplicate BB removal unit 30, and the association unit 40) is a computer system composed of a microprocessor, a ROM (Read Only Memory), a RAM (Random Access Memory), an HDD (Hard Disk Drive), and the like. A computer program loaded from the ROM or HDD is stored in the RAM, and the microprocessor realizes the functions of each processing unit by operating according to the computer program in the RAM.
- Here, a computer program is a combination of a plurality of instruction codes, each indicating an instruction to the computer, configured to achieve a predetermined function.
- The object detection result storage unit 50 is realized by a storage device such as an HDD.
- The trained AI model 20 is a convolutional neural network that has undergone machine learning to detect objects, with person, dog, cow, and the like as the object classes to be detected.
- The trained AI model 20 outputs object estimation data for each of the W × H grid cells into which the input image is divided.
- FIG. 2 is an example of the input image of the trained AI model 20, and FIG. 3 shows the input image divided into grid cells.
- In this example, the input image is divided into 8 × 6 grid cells.
- FIG. 4 shows the data structure of the object estimation data 400 for each grid cell.
- The object estimation data 400 includes object BB information, first end point BB information, second end point BB information, third end point BB information, fourth end point BB information, and class probability.
- The object BB information consists of the relative position with respect to the grid cell (X-axis and Y-axis), the size (X-axis and Y-axis), and the reliability.
- The relative position with respect to the grid cell is information indicating the estimated position of the object BB: it is the upper-left coordinate of the object BB when the upper-left coordinate of the corresponding grid cell is taken as the origin.
- The size is information indicating the size of the object BB: it is the lower-right coordinate of the object BB when the upper-left coordinate of the object BB is taken as the origin.
- The reliability is information indicating whether an object corresponding to any of the object classes to be detected exists in the object BB and, if it exists, whether its position and size are detected accurately.
- The reliability takes a value close to 1 when an object corresponding to an object class to be detected is estimated to exist in the object BB, and a value close to 0 when no such object is estimated to exist.
- When an object is estimated to exist, the reliability is close to 1 if its position and size are estimated to be detected accurately, and close to 0 if they are not.
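- The relative position and size described above can be turned into image coordinates as in the following minimal sketch. It assumes that all values are expressed in pixels and that grid cells are indexed from the upper left; the source does not state the units or any normalization, and the names `Box` and `decode_box` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box in image coordinates (Y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float


def decode_box(cell_col, cell_row, cell_w, cell_h, rel_x, rel_y, size_x, size_y):
    """Convert grid-cell-relative BB information into image coordinates.

    Assumption: rel_x/rel_y are the BB's upper-left corner measured from the
    upper-left corner of the grid cell, and size_x/size_y are the BB's
    lower-right corner measured from the BB's own upper-left corner.
    """
    left = cell_col * cell_w + rel_x
    top = cell_row * cell_h + rel_y
    return Box(left, top, left + size_x, top + size_y)


if __name__ == "__main__":
    # A BB predicted by the cell in column 2, row 1 of a grid of 80x80-pixel cells.
    print(decode_box(2, 1, 80, 80, rel_x=-10, rel_y=5, size_x=100, size_y=60))
```

- Note that the upper-left corner may lie outside the grid cell (a negative relative position), since the BB is not constrained to the cell that predicts it.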
- Each of the first end point BB information, the second end point BB information, the third end point BB information, and the fourth end point BB information likewise consists of the relative position with respect to the grid cell (X-axis and Y-axis), the size (X-axis and Y-axis), and the reliability.
- Here, among the points on the boundary of the object on the image, the point having the minimum value on the X-axis is referred to as the first end point, the point having the maximum value on the X-axis as the second end point, the point having the minimum value on the Y-axis as the third end point, and the point having the maximum value on the Y-axis as the fourth end point.
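- As an illustration of this definition, the following sketch picks the four end points from a list of boundary points. The function name `extreme_points` and the tie-breaking behavior (the first point encountered wins) are assumptions made for the example, not part of the source.

```python
def extreme_points(boundary):
    """Return the (first, second, third, fourth) end points of a boundary.

    boundary: non-empty list of (x, y) points on the object's outline.
    first  = minimum X, second = maximum X,
    third  = minimum Y, fourth = maximum Y.
    """
    first = min(boundary, key=lambda p: p[0])
    second = max(boundary, key=lambda p: p[0])
    third = min(boundary, key=lambda p: p[1])
    fourth = max(boundary, key=lambda p: p[1])
    return first, second, third, fourth


if __name__ == "__main__":
    outline = [(3, 1), (6, 4), (4, 7), (1, 5)]
    print(extreme_points(outline))
```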
- The first end point BB is a BB including the first end point of the object detected in the object BB of the same grid cell.
- Similarly, the second end point BB, the third end point BB, and the fourth end point BB are BBs including the second end point, the third end point, and the fourth end point, respectively, of the object detected in the object BB of the same grid cell.
- Each end point BB is smaller than the object BB and is estimated as a BB whose size corresponds to the size of the object BB.
- The class probability is information indicating an estimate of which of the object classes to be detected the object included in the object BB of the corresponding grid cell belongs to. For example, if the number of object classes is C and the object classes are class 1 (person), class 2 (dog), class 3 (cow), and so on, then the probability of person (class 1) is high (close to 1) when the object BB is estimated to contain a person, and the probability of cow (class 3) is high (close to 1) when it is estimated to contain a cow.
- For one grid cell, the trained AI model 20 outputs (5 × 5 + C)-dimensional object estimation data consisting of five pieces of 5-dimensional BB information (object BB information, first end point BB information, second end point BB information, third end point BB information, and fourth end point BB information) and a C-dimensional class probability. Since this is calculated for each of the W × H grid cells, the object estimation data output by the trained AI model 20 is W × H × (25 + C)-dimensional data (a third-order tensor).
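- The per-cell layout can be sketched as follows. The ordering of the five BBs and of the fields inside each BB is an assumption made for illustration (the source gives only the dimensions), and `split_cell_vector` and `output_size` are hypothetical helper names.

```python
BB_NAMES = ("object", "endpoint1", "endpoint2", "endpoint3", "endpoint4")
BB_FIELDS = ("rel_x", "rel_y", "size_x", "size_y", "reliability")


def output_size(grid_w, grid_h, num_classes):
    """Total number of values in the W x H x (25 + C) output tensor."""
    return grid_w * grid_h * (len(BB_NAMES) * len(BB_FIELDS) + num_classes)


def split_cell_vector(vec, num_classes):
    """Split one grid cell's (25 + C)-dimensional vector into named parts.

    Assumed layout: five consecutive 5-value BB records followed by
    C class probabilities.
    """
    n_bb = len(BB_NAMES) * len(BB_FIELDS)
    if len(vec) != n_bb + num_classes:
        raise ValueError("expected %d values, got %d" % (n_bb + num_classes, len(vec)))
    boxes = {}
    for i, name in enumerate(BB_NAMES):
        chunk = vec[i * len(BB_FIELDS):(i + 1) * len(BB_FIELDS)]
        boxes[name] = dict(zip(BB_FIELDS, chunk))
    return boxes, list(vec[n_bb:])


if __name__ == "__main__":
    cell = list(range(25)) + [0.1, 0.7, 0.2]  # dummy values, C = 3
    boxes, class_probs = split_cell_vector(cell, num_classes=3)
    print(boxes["endpoint2"], class_probs)
    print(output_size(8, 6, 3))  # 8 x 6 grid, C = 3
```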
- FIG. 5 is an example showing the position and size of the object BB of each grid cell in the object estimation data output for the input image.
- W × H (8 × 6 in this example) object BBs are output.
- Similarly, W × H BBs are output for each end point BB.
- The duplicate BB removal unit 30 classifies each grid cell based on the object estimation data output by the trained AI model 20.
- The duplicate BB removal unit 30 calculates a reliability score for each grid cell, and determines that a grid cell whose reliability score is equal to or less than a predetermined threshold value (for example, 0.6) is a background grid cell that does not include an object.
- The duplicate BB removal unit 30 determines that each grid cell other than the background is a grid cell of the object class having the highest class probability.
- FIG. 6 is an example of the classification result of the classification performed for each grid cell.
- The reliability score is, for example, the product of the class probability of the most probable object class and the reliability of the object BB.
- Alternatively, the reliability of the object BB may be used as the reliability score as it is, or the class probability of the most probable object class may be used as the reliability score.
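- The classification of a single grid cell can be sketched as below, using the product form of the reliability score and the example threshold of 0.6. The function name `classify_cell` and the use of -1 as the background label are assumptions made for the example.

```python
BACKGROUND = -1  # hypothetical label for background grid cells


def classify_cell(object_reliability, class_probs, threshold=0.6):
    """Return (label, score) for one grid cell.

    score = (highest class probability) * (reliability of the object BB).
    A cell whose score is at or below the threshold is treated as background;
    otherwise it gets the index of the most probable class.
    """
    best = max(range(len(class_probs)), key=lambda c: class_probs[c])
    score = class_probs[best] * object_reliability
    if score <= threshold:
        return BACKGROUND, score
    return best, score


if __name__ == "__main__":
    print(classify_cell(0.9, [0.1, 0.8, 0.1]))  # confident cell
    print(classify_cell(0.5, [0.9, 0.1]))       # low reliability
```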
- The duplicate BB removal unit 30 removes the object BB and each end point BB of the grid cells determined to be the background.
- For the grid cells determined to be an object class other than the background, the duplicate BB removal unit 30 removes, for each determined object class, any object BB having a high degree of overlap with the object BB of a grid cell having a higher reliability score. Specifically, for one object class, the degree of overlap between the object BB of the grid cell having the highest reliability score and the object BB of each other grid cell is calculated, and any object BB whose degree of overlap is equal to or greater than a predetermined threshold value (for example, 0.6) is removed. Then, among the object BBs that were not removed, the degree of overlap between the object BB of the grid cell having the next-highest reliability score and the object BBs of the other grid cells is calculated, and object BBs with a high degree of overlap are removed; this process is repeated.
- For example, IoU (Intersection-over-Union) can be used as the degree of overlap.
- As shown in FIG. 7, when a region 701 and a region 702 overlap, the IoU is obtained by dividing the area of the part common to the region 701 and the region 702 by the area of the union of the region 701 and the region 702.
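- A minimal sketch of the IoU computation for two axis-aligned boxes follows. The box representation `(left, top, right, bottom)` and the function names are assumptions made for the example.

```python
def area(box):
    """Area of a (left, top, right, bottom) box; zero if degenerate."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
    print(iou((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes
```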
- For the first end point BB as well, the duplicate BB removal unit 30 removes any first end point BB having a high degree of overlap with the first end point BB of a grid cell having a higher reliability score.
- The same applies to the second end point BB, the third end point BB, and the fourth end point BB.
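- The repeated removal described above corresponds to a greedy procedure often called non-maximum suppression. The sketch below applies it to the candidates of one kind (for example, the object BBs of one class, or all first end point BBs). The input format, a list of `(score, box)` pairs, and the function names are assumptions made for the example.

```python
def iou(a, b):
    """Intersection-over-Union of two (left, top, right, bottom) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def remove_overlaps(candidates, iou_threshold=0.6):
    """Greedy overlap removal for one kind of BB.

    candidates: list of (reliability_score, box) pairs.
    Repeatedly keeps the highest-scoring remaining box and drops every other
    box whose IoU with it is at or above the threshold.
    """
    remaining = sorted(candidates, key=lambda c: c[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [c for c in remaining if iou(best[1], c[1]) < iou_threshold]
    return kept


if __name__ == "__main__":
    boxes = [
        (0.9, (0, 0, 10, 10)),
        (0.8, (1, 0, 11, 10)),    # overlaps the first box heavily
        (0.7, (20, 20, 30, 30)),  # separate object
    ]
    print(remove_overlaps(boxes))
```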
- FIG. 8 shows an example of the object BBs and end point BBs that remain after removing the object BB and each end point BB of the grid cells determined to be the background, and removing the object BBs and end point BBs having a high degree of overlap with those of a grid cell having a higher reliability score.
- In this example, for the grid cells whose object class is "cow", two object BBs, five first end point BBs, four second end point BBs, three third end point BBs, and four fourth end point BBs remain without being removed.
- The association unit 40 associates the object BBs remaining after the processing of the duplicate BB removal unit 30 with end point BBs. Specifically, for one of the remaining object BBs, the association unit 40 identifies the first end point BB closest to the first side of the object BB and associates the identified first end point BB with this object BB. Similarly, it identifies the second end point BB, the third end point BB, and the fourth end point BB closest to the second side, the third side, and the fourth side of this object BB, respectively, and associates them with this object BB.
- Here, of the two sides of the object BB parallel to the Y-axis, the one with the smaller X-axis value is the first side and the one with the larger X-axis value is the second side; of the two sides parallel to the X-axis, the one with the smaller Y-axis value is the third side and the one with the larger Y-axis value is the fourth side.
- The distance between a side and a BB is the distance from the center of the BB to the closest point on the side.
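- The association rule can be sketched as follows, with boxes as `(left, top, right, bottom)` tuples in image coordinates (Y grows downward). The function names and the choice to return `None` when no end point BB of a kind remains are assumptions made for the example.

```python
import math


def center(box):
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def sides(box):
    """Map side number (1..4) to its two corner points."""
    left, top, right, bottom = box
    return {
        1: ((left, top), (left, bottom)),      # smaller X
        2: ((right, top), (right, bottom)),    # larger X
        3: ((left, top), (right, top)),        # smaller Y
        4: ((left, bottom), (right, bottom)),  # larger Y
    }


def point_to_side(p, side):
    """Distance from point p to the closest point on an axis-aligned side."""
    (x1, y1), (x2, y2) = side
    qx = min(max(p[0], min(x1, x2)), max(x1, x2))
    qy = min(max(p[1], min(y1, y2)), max(y1, y2))
    return math.hypot(p[0] - qx, p[1] - qy)


def associate(object_box, endpoint_boxes):
    """Pick, for each side k, the k-th end point BB closest to that side.

    endpoint_boxes: dict mapping kind (1..4) to a list of candidate boxes.
    """
    result = {}
    for kind, side in sides(object_box).items():
        candidates = endpoint_boxes.get(kind, [])
        result[kind] = min(
            candidates,
            key=lambda b: point_to_side(center(b), side),
            default=None,
        )
    return result


if __name__ == "__main__":
    obj = (10, 10, 50, 40)
    endpoints = {
        1: [(8, 20, 14, 26), (30, 0, 36, 6)],
        2: [(47, 18, 53, 24)],
        3: [(25, 7, 31, 13)],
        4: [],
    }
    print(associate(obj, endpoints))
```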
- FIG. 9A shows an example of the object BB and each end point BB remaining after the processing of the duplicate BB removal unit 30, and FIG. 9B shows an example of the first end point BB, the second end point BB, the third end point BB, and the fourth end point BB associated with the object BB.
- The association unit 40 associates a first end point BB, a second end point BB, a third end point BB, and a fourth end point BB with every object BB remaining after the processing of the duplicate BB removal unit 30.
- The association unit 40 shapes each object BB associated with four end point BBs based on those four end point BBs. Specifically, as indicated by reference numeral 1001 in FIG. 10A, the association unit 40 moves the first side so that the X coordinate of the first side coincides with the X coordinate of the center of the first end point BB. Similarly, as indicated by reference numerals 1002, 1003, and 1004, the second side is moved so that its X coordinate coincides with the X coordinate of the center of the second end point BB, the third side is moved so that its Y coordinate coincides with the Y coordinate of the center of the third end point BB, and the fourth side is moved so that its Y coordinate coincides with the Y coordinate of the center of the fourth end point BB.
- FIG. 10B shows the object BB after shaping.
- The association unit 40 shapes all the remaining object BBs based on their four end point BBs.
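- The shaping step can be sketched as follows, with boxes as `(left, top, right, bottom)` tuples. Leaving a side unchanged when no end point BB is associated with it is an assumption made for the example; the function name `shape_object_box` is hypothetical.

```python
def shape_object_box(object_box, endpoints):
    """Move each side of the object BB onto the center of its end point BB.

    endpoints: dict mapping kind (1..4) to the associated end point box,
    or None if nothing is associated with that side.
    """
    left, top, right, bottom = object_box

    def center_x(box):
        return (box[0] + box[2]) / 2.0

    def center_y(box):
        return (box[1] + box[3]) / 2.0

    if endpoints.get(1) is not None:
        left = center_x(endpoints[1])    # first side: smaller X
    if endpoints.get(2) is not None:
        right = center_x(endpoints[2])   # second side: larger X
    if endpoints.get(3) is not None:
        top = center_y(endpoints[3])     # third side: smaller Y
    if endpoints.get(4) is not None:
        bottom = center_y(endpoints[4])  # fourth side: larger Y
    return (left, top, right, bottom)


if __name__ == "__main__":
    obj = (10, 10, 50, 40)
    eps = {
        1: (6, 20, 10, 24),
        2: (52, 18, 56, 22),
        3: (25, 6, 29, 10),
        4: (30, 41, 34, 45),
    }
    print(shape_object_box(obj, eps))
```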
- The association unit 40 stores the position and size of the shaped object BB, the positions of the four associated end point BBs, and the classification result of the corresponding grid cell in the object detection result storage unit 50 as the object detection result.
- FIG. 11 shows an example in which the object BB position and size of the object detection result, the positions of the four associated end point BBs, and the classification result are superimposed and displayed on the input image.
- FIG. 12 is a flowchart showing the operation of the object detection device 1.
- The camera 10 acquires a captured image (step S1) and inputs it to the trained AI model 20, which outputs W × H × (25 + 8)-dimensional object estimation data (step S2).
- The duplicate BB removal unit 30 classifies the grid cells, removes the object BB and the end point BBs of the background grid cells (step S3), and removes the BBs (object BBs and end point BBs) having a high degree of overlap with the BB of a grid cell having a higher reliability score (step S4).
- The association unit 40 associates the remaining object BBs with end point BBs (step S5), shapes each object BB based on the positions of the associated end point BBs (step S6), and outputs the shaped object BB and the associated end point BBs as the object detection result (step S7).
- The trained AI model 20 is a convolutional neural network consisting of 24 convolutional layers, 4 pooling layers, and 2 fully connected layers, similar to YOLO described in Non-Patent Document 2.
- In YOLO, the input image is divided into S × S grid cells and B BBs are output for each grid cell, whereas the trained AI model 20 divides the input image into W × H grid cells and outputs five BBs (object BB, first end point BB, second end point BB, third end point BB, and fourth end point BB) for each grid cell.
- At the time of learning, a learning image including the object to be detected is input together with a teacher signal consisting of the true values of the position and size of the object BB of the object to be detected in the learning image, the positions and sizes of the four end point BBs, and the object class (one-hot class probability) of the object included in the object BB.
- The end point BB serving as the teacher signal may be a BB whose center coincides with the true value of the corresponding end point of the object to be detected and whose size is a constant multiple of the area of the object BB.
- The area of the object BB may be approximated by the length of the diagonal of the object BB.
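- One way to generate such a teacher-signal end point BB is sketched below, using the diagonal approximation. The square shape, the example value of the constant `ratio`, and the function name `make_endpoint_box` are assumptions made for the example; the source does not specify them.

```python
import math


def make_endpoint_box(endpoint, object_box, ratio=0.25):
    """Build a square end point BB centered on a ground-truth end point.

    endpoint:   (x, y) true position of the end point.
    object_box: (left, top, right, bottom) true object BB.
    ratio:      hypothetical constant; the side length of the end point BB is
                ratio times the diagonal length of the object BB.
    """
    width = object_box[2] - object_box[0]
    height = object_box[3] - object_box[1]
    side = ratio * math.hypot(width, height)
    half = side / 2.0
    x, y = endpoint
    return (x - half, y - half, x + half, y + half)


if __name__ == "__main__":
    # Object BB of 30 x 40 pixels has a diagonal of 50 pixels.
    print(make_endpoint_box((100, 50), (0, 0, 30, 40)))
```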
- The five errors are: (1) the error between the positions of the object BB and each end point BB of the grid cell in which the center of the object BB of the teacher signal exists and the positions of the object BB and each end point BB of the teacher signal; (2) the error between the sizes of the object BB and each end point BB of the grid cell in which the center of the object BB of the teacher signal exists and the sizes of the object BB and each end point BB of the teacher signal; (3) the error between the reliability of the object BB and each end point BB of the grid cell in which the center of the object BB of the teacher signal exists and the reliability of the teacher signal; (4) the error between the reliability of the object BB and each end point BB of the grid cells in which no such center exists and the non-object reliability; and (5) the error between the class probability of the grid cell in which the center of the object BB of the teacher signal exists and the object class of the teacher signal.
- The reliability of the object BB and the end point BBs of the teacher signal may be taken as 1, and the non-object reliability may be taken as 0.
- In the above embodiment, the first end point having the minimum value on the X-axis, the second end point having the maximum value on the X-axis, the third end point having the minimum value on the Y-axis, and the fourth end point having the maximum value on the Y-axis are detected, but the end points to be detected are not limited to these four end points.
- For example, the above four end points may be detected in each of a plurality of coordinate systems.
- In the above embodiment, each of the processing units (the trained AI model 20, the duplicate BB removal unit 30, and the association unit 40) is a computer system composed of a microprocessor, ROM, RAM, HDD, and the like; however, a part or all of each processing unit may be composed of a system LSI (Large Scale Integration: large-scale integrated circuit).
- This disclosure is useful as an object detection device mounted on a surveillance camera system or an in-vehicle camera system.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/783,266 US12482130B2 (en) | 2019-12-09 | 2020-10-27 | Object detection method and object detection device |
| JP2021563778A JP7294454B2 (ja) | 2019-12-09 | 2020-10-27 | Object detection method and object detection device |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2019-221988 | 2019-12-09 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021117363A1 (ja) | 2021-06-17 |
Family
ID=76329740
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2020/040222 Ceased WO2021117363A1 (ja) | 2019-12-09 | 2020-10-27 | Object detection method and object detection device |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US12482130B2 |
| JP (1) | JP7294454B2 |
| WO (1) | WO2021117363A1 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2024070665A1 (ja) * | 2022-09-26 | 2024-04-04 | 富士フイルム株式会社 | Image processing device, image processing method, and program |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12482130B2 (en) * | 2019-12-09 | 2025-11-25 | Konica Minolta, Inc. | Object detection method and object detection device |
| US12001513B2 (en) * | 2020-11-30 | 2024-06-04 | Nec Corporation | Self-optimizing video analytics pipelines |
| EP4288930A1 (en) * | 2021-02-05 | 2023-12-13 | CBmed GmbH Center for Biomarker Research in Medicine | Representing a biological image as a grid data-set |
| WO2023136418A1 (en) * | 2022-01-13 | 2023-07-20 | Samsung Electronics Co., Ltd. | Method and electronic device for automatically generating region of interest centric image |
| US12586198B2 (en) * | 2022-02-18 | 2026-03-24 | Techcyte, Inc. | Image analysis for identifying objects and classifying background exclusions |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2019046007A (ja) * | 2017-08-31 | 2019-03-22 | 株式会社Pfu | Coordinate detection device and trained model |
| JP2019139497A (ja) * | 2018-02-09 | 2019-08-22 | 株式会社日立ソリューションズ・クリエイト | Image processing system and image processing method |
Family Cites Families (32)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR100422370B1 (ko) * | 2000-12-27 | 2004-03-18 | 한국전자통신연구원 | Three-dimensional object volume measurement system and method |
| JP3935499B2 (ja) * | 2004-07-26 | 2007-06-20 | 松下電器産業株式会社 | Image processing method, image processing device, and image processing program |
| GB0608069D0 (en) * | 2006-04-24 | 2006-05-31 | Pandora Int Ltd | Image manipulation method and apparatus |
| AU2010219406B2 (en) * | 2010-05-19 | 2013-01-24 | Plf Agritech Pty Ltd | Image analysis for making animal measurements |
| US9311556B2 (en) * | 2010-05-19 | 2016-04-12 | Plf Agritech Pty Ltd | Image analysis for making animal measurements including 3-D image analysis |
| US8867790B2 (en) * | 2010-08-03 | 2014-10-21 | Panasonic Corporation | Object detection device, object detection method, and program |
| TWI420906B (zh) * | 2010-10-13 | 2013-12-21 | Ind Tech Res Inst | Tracking system and method for regions of interest and computer program product |
| US9336456B2 (en) * | 2012-01-25 | 2016-05-10 | Bruno Delean | Systems, methods and computer program products for identifying objects in video data |
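| CN103971361B (zh) * | 2013-02-06 | 2017-05-10 | 富士通株式会社 | Image processing device and method |
| JP2016006626A (ja) | 2014-05-28 | 2016-01-14 | 株式会社デンソーアイティーラボラトリ | Detection device, detection program, detection method, vehicle, parameter calculation device, parameter calculation program, and parameter calculation method |
| JP6348093B2 (ja) * | 2015-11-06 | 2018-06-27 | ファナック株式会社 | Image processing device and method for detecting an image of a detection target object from input data |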
| US10783610B2 (en) * | 2015-12-14 | 2020-09-22 | Motion Metrics International Corp. | Method and apparatus for identifying fragmented material portions within an image |
| US9972092B2 (en) * | 2016-03-31 | 2018-05-15 | Adobe Systems Incorporated | Utilizing deep learning for boundary-aware image segmentation |
| JP2018036898A (ja) * | 2016-08-31 | 2018-03-08 | キヤノン株式会社 | Image processing device and control method therefor |
| JP6939111B2 (ja) * | 2017-06-13 | 2021-09-22 | コニカミノルタ株式会社 | Image recognition device and image recognition method |
| JP6939608B2 (ja) * | 2018-01-30 | 2021-09-22 | コニカミノルタ株式会社 | Image recognition device, image recognition method, and image recognition program |
| DE112019000049T5 (de) * | 2018-02-18 | 2020-01-23 | Nvidia Corporation | Object detection and detection confidence suitable for autonomous driving |
| US10643336B2 (en) * | 2018-03-06 | 2020-05-05 | Sony Corporation | Image processing apparatus and method for object boundary stabilization in an image of a sequence of images |
| JP6977667B2 (ja) * | 2018-06-01 | 2021-12-08 | 日本電信電話株式会社 | Objectness estimation device, method, and program |
| US10817740B2 (en) * | 2018-06-20 | 2020-10-27 | Zoox, Inc. | Instance segmentation inferred from machine learning model output |
| US10936922B2 (en) * | 2018-06-20 | 2021-03-02 | Zoox, Inc. | Machine learning techniques |
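| JP7028099B2 (ja) * | 2018-08-02 | 2022-03-02 | 日本電信電話株式会社 | Candidate region estimation device, candidate region estimation method, and program |
| KR102615196B1 (ko) * | 2018-08-21 | 2023-12-18 | 삼성전자주식회사 | Apparatus and method for training an object detection model |
| CN109697460B (zh) * | 2018-12-05 | 2021-06-29 | 华中科技大学 | Object detection model training method and target object detection method |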
| EP3690704B1 (en) * | 2019-01-29 | 2021-02-24 | Accenture Global Solutions Limited | Distributed and self-validating dense object detection in digital images |
| US10776647B2 (en) * | 2019-01-31 | 2020-09-15 | StradVision, Inc. | Method and device for attention-driven resource allocation by using AVM to thereby achieve safety of autonomous driving |
| CN109858569A (zh) * | 2019-03-07 | 2019-06-07 | 中国科学院自动化研究所 | Multi-label object detection method, system, and device based on an object detection network |
| US12482130B2 (en) * | 2019-12-09 | 2025-11-25 | Konica Minolta, Inc. | Object detection method and object detection device |
| CN111161301B (zh) * | 2019-12-31 | 2021-07-27 | 上海商汤智能科技有限公司 | Image segmentation method and apparatus, electronic device, and storage medium |
| WO2021246217A1 (ja) * | 2020-06-05 | 2021-12-09 | コニカミノルタ株式会社 | Object detection method, object detection device, and program |
| US12524498B2 (en) * | 2020-12-30 | 2026-01-13 | Synaptics Incorporated | Multi-object detection with single detection per object |
| US20220207305A1 (en) * | 2020-12-30 | 2022-06-30 | Synaptics Incorporated | Multi-object detection with single detection per object |
2020
- 2020-10-27 US: application US17/783,266, patent US12482130B2 (en), status Active
- 2020-10-27 JP: application JP2021563778A, patent JP7294454B2 (ja), status Active
- 2020-10-27 WO: application PCT/JP2020/040222, publication WO2021117363A1 (ja), status Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| JP7294454B2 (ja) | 2023-06-20 |
| US12482130B2 (en) | 2025-11-25 |
| US20230009925A1 (en) | 2023-01-12 |
| JPWO2021117363A1 | 2021-06-17 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
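| WO2021117363A1 (ja) | | Object detection method and object detection device |
| JP6842520B2 (ja) | | Object detection method, apparatus, device, storage medium, and vehicle |
| CN112233181B (zh) | | Method and apparatus for 6D pose recognition, and computer storage medium |
| US9044858B2 (en) | | Target object gripping apparatus, method for controlling the same and storage medium |
| JP7251692B2 (ja) | | Object detection method, object detection device, and program |
| EP3168810A1 (en) | | Image generating method and apparatus |
| US10713530B2 (en) | | Image processing apparatus, image processing method, and image processing program |
| CN109493384B (zh) | | Camera pose estimation method, system, device, and storage medium |
| CN116494253B (zh) | | Method for obtaining a grasping pose of a target object, and robot grasping system |
| JP6245880B2 (ja) | | Information processing device, information processing method, and program |
| JP2006252162A (ja) | | Pattern recognition device and method therefor |
| WO2019171628A1 (en) | | Image processing system and image processing method |
| JP2022045905A (ja) | | Mixed-size depalletizing |
| JP2025504056A (ja) | | Face pose estimation method, apparatus, electronic device, and storage medium |
| CN119540942B (zh) | | SLAM method and system for dense point clouds in dynamic environments based on YOLOv11 and ORB-SLAM3 |
| CN115984759B (zh) | | Substation switch state recognition method and apparatus, computer device, and storage medium |
| JP7121132B2 (ja) | | Image processing method, apparatus, and electronic device |
| CN110458128A (zh) | | Posture feature acquisition method, apparatus, device, and storage medium |
| CN110598647A (zh) | | Head pose recognition method based on image recognition |
| Xie et al. | | Geometry-based populated chessboard recognition |
| CN114820681A (zh) | | Storage slot detection method and system based on an RGB camera |
| JP5791373B2 (ja) | | Feature point position determination device, feature point position determination method, and program |
| US12561902B2 (en) | | Information processing apparatus and information processing method |
| CN113505629B (zh) | | Intelligent warehouse item recognition device based on a lightweight network |
| JP6448204B2 (ja) | | Object detection device, object detection method, and program |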
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20899625; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2021563778; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20899625; Country of ref document: EP; Kind code of ref document: A1 |
| | WWG | Wipo information: grant in national office | Ref document number: 17783266; Country of ref document: US |