WO2023228364A1 - Object detection device, object detection method, and object detection program - Google Patents

Object detection device, object detection method, and object detection program

Info

Publication number
WO2023228364A1
WO2023228364A1 PCT/JP2022/021587 JP2022021587W WO2023228364A1 WO 2023228364 A1 WO2023228364 A1 WO 2023228364A1 JP 2022021587 W JP2022021587 W JP 2022021587W WO 2023228364 A1 WO2023228364 A1 WO 2023228364A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature map
reliability
object detection
class
detection device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/021587
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
寛之 鵜澤
彩希 八田
周平 吉田
宥光 飯沼
大祐 小林
優也 大森
祐輔 堀下
健 中村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NTT Inc
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp
Priority to US18/868,738 (US20250174017A1)
Priority to JP2024522833A (JP7794311B2)
Priority to PCT/JP2022/021587 (WO2023228364A1)
Publication of WO2023228364A1
Anticipated expiration
Ceased (current legal status)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776 Validation; Performance evaluation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/10 Recognition assisted with metadata

Definitions

  • the disclosed technology relates to an object detection device, an object detection method, and an object detection program.
  • BB: bounding box
  • class: the type of object, such as a person or a car
  • reliability: the confidence of a detection for an object included in the image
  • YOLO: You Only Look Once
  • SSD: Single Shot MultiBox Detector
  • FIG. 1 is a diagram showing a CNN processing flow including detection processing.
  • In YOLO or SSD, the CNN produces feature map values corresponding to a predetermined number B_num of BBs (B[0] to B[B_num-1]) for each unit called a grid, which is obtained by dividing an image of W pixels horizontally and H pixels vertically.
  • The feature map values of the CNN output include values corresponding to the coordinates of the BB (tx, ty, tw, th), a value corresponding to the reliability of the presence or absence of an object at those coordinates (the object reliability, p_obj), and values corresponding to the reliability of each class of the object (p[0] to p[C_num-1], where C_num is the number of classes).
  • Detection processing converts these feature map values into BBs, removes BBs whose object reliability obtained as a result of the conversion is less than a threshold value, and removes duplicate BBs (Non-Maximum-Suppression: NMS).
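  • As a minimal sketch of how these values might be laid out, the Python snippet below indexes the channels belonging to one grid. The channel order (tx, ty, tw, th, p_obj, then p[0] to p[C_num-1]) and the helper names `channels_per_bb`, `channel_index`, and `total_channels` are assumptions made for illustration; the source does not specify a memory layout.

```python
# Hypothetical per-BB channel order; the source does not fix this order.
BOX_FIELDS = ("tx", "ty", "tw", "th", "p_obj")


def channels_per_bb(c_num: int) -> int:
    """Channels for one BB: 4 coordinates + object reliability + C_num classes."""
    return len(BOX_FIELDS) + c_num


def total_channels(b_num: int, c_num: int) -> int:
    """Channels held for one grid when it carries B_num BBs."""
    return b_num * channels_per_bb(c_num)


def channel_index(b: int, field: str, c_num: int, class_id: int = 0) -> int:
    """Index of a channel inside one grid for BB number b.

    field is one of BOX_FIELDS, or "p" for a class reliability p[class_id].
    """
    base = b * channels_per_bb(c_num)
    if field == "p":
        if not 0 <= class_id < c_num:
            raise ValueError("class_id out of range")
        return base + len(BOX_FIELDS) + class_id
    return base + BOX_FIELDS.index(field)


if __name__ == "__main__":
    # Example with assumed sizes: 3 BBs per grid and 80 classes.
    print(total_channels(3, 80))
    print(channel_index(1, "p_obj", 80))
```

    With this assumed layout, the p_obj channel of each BB sits at a fixed offset, which is what makes it possible to read it separately from the other channels.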
  • Methods for performing real-time object detection based on a CNN have been disclosed (Non-Patent Documents 1 and 2).
  • In these methods, the calculations within the CNN (convolution calculations and so on) up to the point where the CNN outputs the feature map (the CNN output feature map) are sped up using dedicated hardware.
  • The Detection process, which takes the CNN output feature map as input, is implemented in software and cannot be sped up in this way.
  • Because the CNN output feature map is stored in DRAM (Dynamic Random Access Memory), the Detection process must read the feature map from the DRAM.
  • the disclosed technology has been made in view of the above points, and aims to provide an object detection device, an object detection method, and an object detection program that speed up detection processing compared to existing technologies.
  • A first aspect of the present disclosure is an object detection device that includes: a metadata acquisition unit that acquires, from a convolutional neural network into which an image is input, metadata including at least the position and reliability of an object included in the image; a holding unit that holds a group of feature map values that are the output results of the convolutional neural network; and a feature map value acquisition unit that reads feature map values related to the reliability from the group of feature map values held in the holding unit and, only when the reliability thus obtained exceeds a predetermined threshold, reads the feature map value related to the position of the corresponding object from the holding unit to obtain the position of the object.
  • A second aspect of the present disclosure is an object detection method in which a processor: acquires, from a convolutional neural network into which an image is input, metadata including at least the position and reliability of an object included in the image; retains a group of feature map values that are the output results of the convolutional neural network; and reads feature map values related to the reliability from the retained group of feature map values and, only when the reliability thus obtained exceeds a predetermined threshold, reads the feature map value related to the position of the corresponding object from the holding unit to obtain the position of the object.
  • A third aspect of the present disclosure is an object detection program that causes a computer to execute a process of: acquiring, from a convolutional neural network into which an image is input, metadata including at least the position and reliability of an object included in the image; retaining a group of feature map values that are the output results of the convolutional neural network; and reading feature map values related to the reliability from the retained group of feature map values and, only when the reliability thus obtained exceeds a predetermined threshold, reading the feature map value related to the position of the corresponding object from the holding unit to obtain the position of the object.
  • According to the disclosed technology, it is possible to provide an object detection device, an object detection method, and an object detection program that perform Detection processing faster than existing technologies.
  • FIG. 1 is a diagram showing a processing flow of a CNN including Detection processing.
  • FIG. 2 is a flowchart showing a Detection process performed by an object detection device as a comparative example of the embodiment.
  • FIG. 3 is a diagram illustrating the Detection process shown in FIG. 2.
  • FIG. 4 is a block diagram showing the hardware configuration of the object detection device.
  • FIG. 5 is a block diagram showing an example of the functional configuration of the object detection device.
  • FIG. 6 is a flowchart showing the flow of object detection processing by the object detection device.
  • FIG. 7 is a diagram for explaining the Detection process shown in FIG. 6.
  • FIG. 8 is a graph comparing the number of times a feature map is read between the method according to the embodiment and the method according to the comparative example.
  • FIG. 9 is a flowchart showing the flow of object detection processing by the object detection device.
  • FIG. 2 is a flowchart showing a Detection process performed by the object detection device as a comparative example of this embodiment.
  • FIG. 3 is a diagram for explaining the Detection process shown in FIG. 2, specifically step S13 in the flowchart shown in FIG. 2.
  • In the comparative example, the object detection device converts all feature map values of B[n] into BB information in step S13 of FIG. 2. That is, it reads all channels and converts them into BB information regardless of the value (p_obj) corresponding to the object reliability.
  • After converting the feature map values of B[n] into BB information, the object detection device removes BBs whose object reliability is less than or equal to the threshold (step S14) and increments the variable n by one (step S15).
  • If, in the determination of step S12, n is greater than or equal to B_num (step S12; No), the object detection device removes duplicate BBs by NMS (step S16).
  • NMS is a process in which when predicted BBs overlap, those with low scores are excluded.
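  • The NMS step described above can be sketched in Python as follows. This is a common greedy variant based on intersection over union (IoU), shown only as an illustration: the source does not state which overlap measure or threshold is used, so the corner-format boxes `(x1, y1, x2, y2)`, the function names `iou` and `nms`, and the default `iou_threshold` of 0.5 are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box by more than the
    threshold. Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    # The second box overlaps the first heavily and has a lower score.
    print(nms(boxes, scores))  # [0, 2]
```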
  • This embodiment shows an object detection device that can shorten the processing time compared to Detection processing as a comparative example.
  • FIG. 4 is a block diagram showing the hardware configuration of the object detection device 10.
  • The object detection device 10 includes a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, a storage 14, an input unit 15, a display unit 16, and a communication interface (I/F) 17.
  • Each configuration is communicably connected to each other via a bus 19.
  • the CPU 11 is a central processing unit that executes various programs and controls various parts. That is, the CPU 11 reads a program from the ROM 12 or the storage 14 and executes the program using the RAM 13 as a work area. The CPU 11 controls each of the above components and performs various arithmetic processing according to programs stored in the ROM 12 or the storage 14. In this embodiment, the ROM 12 or the storage 14 stores an object detection program for detecting objects included in an image.
  • the ROM 12 stores various programs and various data.
  • the RAM 13 temporarily stores programs or data as a work area.
  • the storage 14 is constituted by a storage device such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive), and stores various programs including an operating system and various data.
  • the input unit 15 includes a pointing device such as a mouse and a keyboard, and is used to perform various inputs.
  • the display unit 16 is, for example, a liquid crystal display, and displays various information.
  • The display unit 16 may adopt a touch panel system and also function as the input unit 15.
  • the communication interface 17 is an interface for communicating with other devices.
  • For this interface, a wired communication standard such as Ethernet (registered trademark) or FDDI, or a wireless communication standard such as 4G, 5G, or Wi-Fi (registered trademark), is used.
  • FIG. 5 is a block diagram showing an example of the functional configuration of the object detection device 10.
  • The object detection device 10 has, as its functional configuration, an image acquisition unit 101, a recognition unit 102, a metadata acquisition unit 103, a holding unit 104, a feature map value acquisition unit 105, and an output unit 106.
  • Each functional configuration is realized by the CPU 11 reading out an object detection program stored in the ROM 12 or the storage 14, loading it into the RAM 13, and executing it.
  • the image acquisition unit 101 acquires an image of an object detection target.
  • the recognition unit 102 performs image processing on the image acquired by the image acquisition unit 101 and recognizes objects included in the image.
  • the recognition unit 102 inputs the image acquired by the image acquisition unit 101 to a convolutional neural network (CNN).
  • the CNN outputs metadata including at least the position of the object included in the image and the reliability of the object.
  • the metadata is temporarily held in the holding unit 104 by a metadata acquisition unit 103, which will be described later.
  • Of the retained metadata, the items that satisfy predetermined conditions are read out by the feature map value acquisition unit 105.
  • the metadata acquisition unit 103 acquires metadata including at least the position and reliability of the object included in the input image from the CNN to which the image is input.
  • the reliability may consist of a class-by-class reliability group for each class of objects. Further, the reliability may further include an object reliability indicating the certainty of the existence of the object.
  • the holding unit 104 holds a group of feature map values that are the output results of the CNN.
  • The feature map value group is a set of feature map values corresponding to a predetermined number B_num of BBs (B[0] to B[B_num-1]) for each unit called a grid, which is obtained by dividing an image of W pixels horizontally and H pixels vertically.
  • the holding unit 104 may be provided in the RAM 13, for example.
  • The feature map value acquisition unit 105 reads, from the group of feature map values held in the holding unit 104, the feature map values related to the reliability and, only when the reliability thus obtained exceeds a predetermined threshold, reads the feature map value related to the position of the corresponding object from the holding unit 104 to obtain the position of the object.
  • the threshold value can be changed depending on the required detection accuracy.
  • The feature map value acquisition unit 105 reads the feature map value related to the position of the corresponding object and the feature map values related to the class-by-class reliability from the holding unit 104 only when the object reliability obtained from the feature map value related to the object reliability exceeds the threshold.
  • the output unit 106 outputs the object recognition result by the recognition unit 102.
  • The result of image recognition by the recognition unit 102 can be output superimposed on the input image. For example, as shown in FIG. 1, the output unit 106 may output the result by superimposing a frame on the area of the input image corresponding to the object and superimposing the name of the detected object on that frame.
  • FIG. 6 is a flowchart showing the flow of object detection processing by the object detection device 10.
  • the object detection process is performed by the CPU 11 reading the object detection program from the ROM 12 or the storage 14, expanding it to the RAM 13, and executing it.
  • The flowchart shown in FIG. 6 is a Detection process for the CNN output feature map that is output by the CNN and stored, for example, in the RAM 13. FIG. 7 is a diagram for explaining the Detection process shown in FIG. 6.
  • First, the CPU 11 initializes a variable n used in the Detection process to 0 (step S101).
  • Next, the CPU 11 determines whether the variable n is less than B_num (step S102). If the variable n is less than B_num (step S102; Yes), the CPU 11 converts all feature map values of the p_obj channel in B[n] into BB information (object reliability) (step S103).
  • Following step S103, the CPU 11 extracts the grids whose object reliability is greater than or equal to a predetermined threshold (step S104).
  • Following step S104, the CPU 11 reads the feature map values of the channels other than the p_obj channel (the tx, ty, tw, th, and p[0] to p[C_num-1] channels) at the extracted grid positions and converts them into BB information (step S105).
  • Here, tx, ty, tw, and th are values corresponding to the coordinates of the BB, and p[0] to p[C_num-1] are values corresponding to the reliability of each object class. p[0] to p[C_num-1] are collectively referred to as the class-by-class reliability group.
  • Following step S105, the CPU 11 increments the variable n by one (step S106) and returns to the determination of step S102.
  • If, as a result of the determination in step S102, the variable n is greater than or equal to B_num (step S102; No), the CPU 11 removes BBs whose object reliability obtained as a result of conversion to BB information is less than the threshold, and removes duplicate BBs (step S107).
  • the CPU 11 removes the BB using NMS (Non-Maximum-Suppression). NMS is a process in which when predicted BBs overlap, those with low scores are excluded.
  • The object detection device 10 reads the feature map values of the p_obj channel exhaustively, but reads the feature map values of the other channels only when the object reliability obtained from p_obj exceeds the threshold.
  • FIG. 6 shows how reading of the feature map values other than p_obj is omitted for BBs whose object reliability is less than or equal to the threshold and which are therefore to be removed.
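  • The gated read of steps S101 to S106 can be sketched in Python as below. This is an illustration under stated assumptions, not the patented implementation: the `HoldingUnit` class (a stand-in for the holding unit that counts reads), the `values[n][channel][g]` layout, the channel names, the function `detect_gated`, and the use of a logistic sigmoid to convert a raw p_obj value into an object reliability are all hypothetical. Conversion of tx, ty, tw, and th into coordinates and the NMS of step S107 are omitted.

```python
import math


class HoldingUnit:
    """Stand-in for the holding unit; counts every single-value read."""

    def __init__(self, values):
        self._values = values  # assumed layout: values[n][channel][g]
        self.reads = 0

    def read(self, n, channel, g):
        self.reads += 1
        return self._values[n][channel][g]


def sigmoid(x):
    """Numerically stable logistic function (assumed conversion)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def detect_gated(store, b_num, num_grids, other_channels, threshold):
    """Read p_obj everywhere, other channels only where p_obj passes."""
    candidates = []
    for n in range(b_num):  # S101, S102, S106: loop over B[0]..B[B_num-1]
        # S103: read the p_obj channel exhaustively and convert it.
        conf = [sigmoid(store.read(n, "p_obj", g)) for g in range(num_grids)]
        # S104: keep grids whose object reliability reaches the threshold.
        hits = [g for g in range(num_grids) if conf[g] >= threshold]
        # S105: read the remaining channels only at the extracted grids.
        for g in hits:
            raw = {ch: store.read(n, ch, g) for ch in other_channels}
            candidates.append({"bb": n, "grid": g, "p_obj": conf[g], "raw": raw})
    return candidates
```

    After the call, `store.reads` equals the number of p_obj values plus (number of passing BBs) times (number of other channels), which is the saving the passage describes.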
  • Reading every p_obj value requires W × H × B_num reads.
  • The number of p_obj channels is B_num, which equals the total number of BBs divided by the number of grids.
  • Excluding the p_obj channel, each BB has 4 + C_num channels.
  • The 4 corresponds to the four channels tx, ty, tw, and th. These channels are read only when the object reliability obtained from the corresponding p_obj exceeds the threshold, so, taking K as the number of BBs that pass the threshold, the number of reads is K × (4 + C_num).
  • FIG. 8 is a graph comparing the number of times the feature map is read between the method according to the present embodiment and the method of the comparative example.
  • In the comparative example, the feature map values of all channels of all grids are read, so the number of reads is constant.
  • With the method of this embodiment, the number of reads is 1/50 or less of the 1,321,920 reads in the comparative example.
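  • The two read counts can be compared with a short Python calculation. The functions below follow the expressions in the text: all channels of all grids for the comparative example, and W × H × B_num + K × (4 + C_num) for the embodiment. The function names and the example sizes (13 × 13 grids, 3 BBs per grid, 80 classes, K = 10) are hypothetical and are not the conditions behind the figures quoted in the source.

```python
def reads_comparative(grid_positions: int, b_num: int, c_num: int) -> int:
    """All channels of all grids: each BB has 4 + 1 + C_num channels."""
    return grid_positions * b_num * (4 + 1 + c_num)


def reads_gated(grid_positions: int, b_num: int, c_num: int, k: int) -> int:
    """Every p_obj value, plus 4 + C_num channels for each of K passing BBs."""
    return grid_positions * b_num + k * (4 + c_num)


if __name__ == "__main__":
    grid_positions = 13 * 13  # W x H in the source's notation (assumed size)
    full = reads_comparative(grid_positions, b_num=3, c_num=80)
    gated = reads_gated(grid_positions, b_num=3, c_num=80, k=10)
    print(full)   # 43095
    print(gated)  # 1347
```

    The gated count grows with K, so the saving is largest when few BBs pass the threshold.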
  • The BB may not include the object reliability. A method for reducing the number of times the feature map is read even in that case is described below. Specifically, when the object reliability is not included in the BB, the object detection device 10 exhaustively reads the feature map values of each channel of the class-by-class reliability group p[0] to p[C_num-1]. The object detection device 10 reads the feature map values of the grids for the remaining tx, ty, tw, and th channels only when any of the class-by-class reliabilities obtained from the class-by-class reliability group p[0] to p[C_num-1] is equal to or higher than the threshold.
  • FIG. 9 is a flowchart showing the flow of object detection processing by the object detection device 10.
  • the object detection process is performed by the CPU 11 reading the object detection program from the ROM 12 or the storage 14, expanding it to the RAM 13, and executing it.
  • the flowchart shown in FIG. 9 is a detection process for a CNN output feature map output by the CNN and stored, for example, in the RAM 13.
  • the CPU 11 initializes a variable n used in the Detection process to 0 (step S111).
  • Next, the CPU 11 determines whether the variable n is less than B_num (step S112). If the variable n is less than B_num (step S112; Yes), the CPU 11 initializes a variable m used in the Detection process to 0 (step S113).
  • Next, the CPU 11 determines whether the variable m is less than C_num (step S114). If the variable m is less than C_num (step S114; Yes), the CPU 11 converts all feature map values of the p[m] channel in B[n] into BB information (class-by-class reliability) (step S115).
  • Following step S115, the CPU 11 increments the variable m by one (step S116) and returns to the determination of step S114.
  • If, as a result of the determination in step S114, the variable m is greater than or equal to C_num (step S114; No), the CPU 11 extracts the grids for which any one of the class-by-class reliabilities p[0] to p[C_num-1] is equal to or greater than the threshold (step S117).
  • Following step S117, the CPU 11 reads the feature map values of the channels other than the class-by-class reliability group p[0] to p[C_num-1] (the tx, ty, tw, and th channels) at the extracted grid positions and converts them into BB information (step S118).
  • Following step S118, the CPU 11 increments the variable n by one (step S119) and returns to the determination of step S112.
  • If, as a result of the determination in step S112, the variable n is greater than or equal to B_num (step S112; No), the CPU 11 removes BBs whose object reliability obtained as a result of conversion to BB information is less than the threshold, and removes duplicate BBs (step S120).
  • the CPU 11 removes the BB using NMS (Non-Maximum-Suppression). NMS is a process in which when predicted BBs overlap, those with low scores are excluded.
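  • The flow of steps S111 to S119 can be sketched in Python in the same spirit. As before, this is an illustration under assumptions: the `read(n, channel, g)` callable standing in for the holding unit, the `("p", m)` channel keys, the function name `detect_class_gated`, and the logistic sigmoid used to convert raw values into class-by-class reliabilities are hypothetical. Coordinate conversion and the NMS of step S120 are omitted.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function (assumed conversion)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def detect_class_gated(read, b_num, c_num, num_grids, threshold):
    """Read all class channels, then box channels only where a class passes.

    read(n, channel, g) returns one feature map value for BB n at grid g.
    """
    box_channels = ("tx", "ty", "tw", "th")
    candidates = []
    for n in range(b_num):  # S111, S112, S119: loop over B[0]..B[B_num-1]
        cls = []
        for m in range(c_num):  # S113 to S116: loop over p[0]..p[C_num-1]
            # S115: read the p[m] channel exhaustively and convert it.
            cls.append([sigmoid(read(n, ("p", m), g)) for g in range(num_grids)])
        # S117: grids where any class-by-class reliability reaches the threshold.
        hits = [
            g for g in range(num_grids)
            if any(cls[m][g] >= threshold for m in range(c_num))
        ]
        # S118: read tx, ty, tw, th only at the extracted grids.
        for g in hits:
            raw = {ch: read(n, ch, g) for ch in box_channels}
            scores = [cls[m][g] for m in range(c_num)]
            candidates.append({"bb": n, "grid": g, "scores": scores, "raw": raw})
    return candidates
```

    In this variant the exhaustive part costs C_num reads per BB instead of one, so the saving comes only from skipping the four coordinate channels.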
  • In each of the above embodiments, the object detection processing that the CPU executes by reading software (a program) may be executed by various processors other than the CPU.
  • Examples of such processors include a PLD (Programmable Logic Device) whose circuit configuration can be changed after manufacturing, such as an FPGA (Field-Programmable Gate Array), and a dedicated electric circuit, which is a processor having a specially designed circuit configuration, such as an ASIC (Application Specific Integrated Circuit).
  • The object detection process may be executed by one of these various processors, or by a combination of two or more processors of the same type or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA).
  • the hardware structure of these various processors is, more specifically, an electric circuit that is a combination of circuit elements such as semiconductor elements.
  • The object detection program has been described as being stored (installed) in the storage 14 in advance, but the present invention is not limited to this.
  • The program may be provided in a form stored in a non-transitory storage medium such as a CD-ROM (Compact Disk Read Only Memory), a DVD-ROM (Digital Versatile Disk Read Only Memory), or a USB (Universal Serial Bus) memory. The program may also be downloaded from an external device via a network.
  • An object detection device in which the processor is configured to: obtain, from a convolutional neural network into which an image is input, metadata including at least the position and reliability of an object included in the image; retain a group of feature map values that are the output results of the convolutional neural network; and read, from the retained group of feature map values, feature map values related to the reliability and, only when the reliability thus obtained exceeds a predetermined threshold, read the feature map value related to the position of the corresponding object from the holding unit to obtain the position of the object.
  • A non-transitory storage medium storing a program executable by a computer to perform an object detection process, the object detection process including: obtaining, from a convolutional neural network into which an image is input, metadata including at least the position and reliability of an object included in the image; retaining a group of feature map values that are the output results of the convolutional neural network; and reading, from the retained group of feature map values, feature map values related to the reliability and, only when the reliability thus obtained exceeds a predetermined threshold, reading the feature map value related to the position of the corresponding object from the holding unit to obtain the position of the object.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
PCT/JP2022/021587 2022-05-26 2022-05-26 Object detection device, object detection method, and object detection program Ceased WO2023228364A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US18/868,738 US20250174017A1 (en) 2022-05-26 2022-05-26 Object detection device, object detection method, and object detection program
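JP2024522833A JP7794311B2 (ja) 2022-05-26 2022-05-26 Object detection device, object detection method, and object detection program
PCT/JP2022/021587 WO2023228364A1 (ja) 2022-05-26 2022-05-26 Object detection device, object detection method, and object detection program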

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/021587 WO2023228364A1 (ja) 2022-05-26 2022-05-26 Object detection device, object detection method, and object detection program

Publications (1)

Publication Number Publication Date
WO2023228364A1 (ja) 2023-11-30

Family

ID=88918796

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/021587 Ceased WO2023228364A1 (ja) 2022-05-26 2022-05-26 Object detection device, object detection method, and object detection program

Country Status (3)

Country Link
US (1) US20250174017A1
JP (1) JP7794311B2
WO (1) WO2023228364A1

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020071793A (ja) * 2018-11-02 2020-05-07 富士通株式会社 Target detection program, target detection device, and target detection method
JP2021033510A (ja) * 2019-08-21 2021-03-01 いすゞ自動車株式会社 Driving assistance device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
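JP2022034571A (ja) 2018-11-26 2022-03-04 住友電気工業株式会社 Traffic information processing server, traffic information processing method, and computer program
JP7179705B2 (ja) 2019-09-09 2022-11-29 ヤフー株式会社 Information processing device, information processing method, and information processing program
KR20250042199A (ko) 2020-06-30 2025-03-26 엔제루 구루푸 가부시키가이샤 Gaming activity monitoring system and method
CN113591703B (zh) 2021-07-30 2023-11-28 山东建筑大学 Method for locating persons in a classroom and integrated classroom management system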

Also Published As

Publication number Publication date
US20250174017A1 (en) 2025-05-29
JP7794311B2 (ja) 2026-01-06
JPWO2023228364A1 2023-11-30

Similar Documents

Publication Publication Date Title
US9697416B2 (en) Object detection using cascaded convolutional neural networks
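JP6471448B2 (ja) Noise identification method and noise identification device for parallax depth images
CN112801164A (zh) Training method, apparatus, device, and storage medium for a target detection model
CN111652217A (zh) Text detection method and apparatus, electronic device, and computer storage medium
CN113807184B (zh) Obstacle detection method and apparatus, electronic device, and autonomous vehicle
KR102585216B1 (ko) Image recognition method and apparatus therefor
WO2021052283A1 (zh) Method for processing three-dimensional point cloud data and computing device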
US10748023B2 (en) Region-of-interest detection apparatus, region-of-interest detection method, and recording medium
US9646220B2 (en) Methods and media for averaging contours of wafer feature edges
US11195083B2 (en) Object detection system and object detection method
US10719908B2 (en) Method and apparatus for storing image in texture memory
US20230093034A1 (en) Target area detection device, target area detection method, and target area detection program
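CN105308618A (zh) Face recognition by means of parallel detection and tracking and/or grouped feature motion shift tracking
US20220156533A1 (en) Object detection in vehicles using cross-modality sensors
CN109815789A (zh) Real-time multi-scale face detection method and system on a CPU, and related device
WO2024194951A1 (ja) Object detection device, method, and program
US12136252B2 (en) Label estimation device, label estimation method, and label estimation program
WO2023228364A1 (ja) Object detection device, object detection method, and object detection program
KR20210048837A (ko) Method and apparatus for object detection in high-resolution images using patch-level augmentation
WO2017131629A1 (en) Merging object detections using graphs
CN113508395B (zh) Method and device for detecting objects in an image composed of pixels
CN119323726A (zh) Method, apparatus, device, and medium for identifying Martian surface minerals from hyperspectral images
US11055852B2 (en) Fast automatic trimap generation and optimization for segmentation refinement
CN116894963B (zh) Target detection method and system based on context clustering and multimodal fusion
JP7599885B2 (ja) Image processing device, image processing method, and program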

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22943764; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 2024522833; Country of ref document: JP)
WWE WIPO information: entry into national phase (Ref document number: 18868738; Country of ref document: US)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 22943764; Country of ref document: EP; Kind code of ref document: A1)
WWP WIPO information: published in national office (Ref document number: 18868738; Country of ref document: US)