WO2023062760A1 - Area detection program, device, and method - Google Patents

Area detection program, device, and method

Info

Publication number
WO2023062760A1
Authority
WO
WIPO (PCT)
Prior art keywords
person
image
area
height
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2021/037958
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
帆 楊
成幸 小田嶋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to CN202180102809.3A (CN118043856A)
Priority to PCT/JP2021/037958 (WO2023062760A1)
Priority to EP21960618.3A (EP4418203A4)
Priority to JP2023553831A (JP7639931B2)
Publication of WO2023062760A1
Priority to US18/603,752 (US20240242464A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/593 Depth or shape recovery from multiple images from stereo images
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Definitions

  • The disclosed technology relates to an area detection program, an area detection device, and an area detection method.
  • A 3D model generation device has been proposed that generates a 3D model of a subject from images captured by multiple cameras.
  • This device acquires a silhouette image for each viewpoint from a multi-view video, and generates a low-resolution voxel model having a first voxel size from the plurality of silhouette images by the visual volume intersection method.
  • the apparatus also classifies the low-resolution voxel models based on their features and, for each low-resolution voxel model, determines a second size that is smaller than the first size based on the classification results.
  • this apparatus generates a high-resolution voxel model having a second voxel size for each 3D bounding box of the low-resolution voxel model, and outputs a 3DCG model of the object based on the high-resolution voxel model.
  • When a machine learning model is used to detect, from an image, a bounding box as the area indicating a target person, the bounding box may go undetected or may be detected incorrectly. In multi-viewpoint images used to obtain 3D information about a person, if the bounding box is undetected or erroneously detected in any one of the images, the 3D information cannot be obtained with high accuracy in subsequent processing.
  • the disclosed technology aims to appropriately interpolate undetected or erroneously detected bounding boxes in multi-view images.
  • the technology disclosed acquires images captured by each of a plurality of imaging devices that capture images of a person from different directions.
  • The disclosed technology inputs each acquired image to a machine learning model, generated in advance by machine learning to detect the area of a person included in an image, and detects an area indicating the person from each of the acquired images. Then, based on the area of the person detected from a first image among the acquired images and the parameters of each of the plurality of imaging devices, the disclosed technology interpolates the area indicating the person in a second image among the acquired images.
  • The disclosed technology has the effect of being able to appropriately interpolate undetected or erroneously detected bounding boxes in multi-view images.
  • FIG. 1 is a schematic diagram showing the connection between an area detection device and cameras.
  • FIG. 2 is a functional block diagram of the area detection device.
  • FIG. 3 is a diagram for explaining a two-dimensional bounding box.
  • FIG. 4 is a diagram for explaining the difference in width of a two-dimensional bounding box depending on the viewpoint.
  • FIG. 5 is a diagram for explaining identification of a person's center line in three dimensions.
  • FIG. 6 is a diagram for explaining interpolation of a two-dimensional bounding box.
  • FIG. 7 is a diagram for explaining the width and height of a three-dimensional bounding box.
  • FIG. 8 is a diagram for explaining statistical information about a three-dimensional bounding box.
  • FIG. 9 is a block diagram showing a schematic configuration of a computer functioning as the area detection device.
  • FIG. 10 is a flowchart showing an example of area detection processing.
  • FIG. 11 is a diagram showing an example of interpolation of a two-dimensional bounding box.
  • FIG. 12 is a diagram for explaining an example of a technology applied to multi-viewpoint images in which two-dimensional bounding boxes have been detected.
  • the area detection device 10 is connected to each of a plurality of cameras 30n that capture images of the gymnast 90 at viewpoints n from different directions.
  • In this example, n = 1, 2, and 3: a camera 301 that captures images from viewpoint 1, a camera 302 that captures images from viewpoint 2, and a camera 303 that captures images from viewpoint 3 are connected to the area detection device 10.
  • The number of cameras 30n connected to the area detection device 10 is not limited to the three in this example.
  • The cameras 30n are installed at different positions within substantially the same horizontal plane, at angles that allow the gymnast 90 to be captured within the shooting range. That is, the cameras 30n are arranged horizontally so as to surround the gymnast 90.
  • “substantially in the same horizontal plane” means that the height of the camera 30n from the floor can be regarded as substantially the same, and that the difference in height from the floor of the camera 30n is equal to or less than a predetermined value.
  • Video captured by each camera 30n is sequentially input to the area detection device 10.
  • Time information is associated with each frame included in the video captured by each camera 30n, and the videos captured by the cameras 30n can be synchronized based on this time information.
  • the region detection device 10 functionally includes an acquisition unit 12, a detection unit 14, and an interpolation unit 16.
  • A detection model 20 is stored in a predetermined storage area of the area detection device 10.
  • The acquisition unit 12 acquires, as multi-viewpoint images, the set of images indicated by the frames with matching time information in the videos input from the cameras 30n to the area detection device 10.
  • Among the images included in the multi-viewpoint images, the image captured by the camera 30n is referred to as an image 40n.
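The grouping of time-stamped frames into multi-viewpoint image sets can be sketched as follows. This is a minimal illustration, assuming each frame arrives as a hypothetical `(camera_id, timestamp, image)` tuple and that only time stamps covered by every camera form a usable set; neither the tuple layout nor the function name comes from the source.

```python
from collections import defaultdict


def group_multiview(frames, num_cameras):
    """Group frames by time stamp and keep only complete multi-view sets.

    frames: iterable of (camera_id, timestamp, image) tuples (assumed layout).
    Returns {timestamp: {camera_id: image}} for time stamps seen by all cameras.
    """
    by_time = defaultdict(dict)
    for camera_id, timestamp, image in frames:
        by_time[timestamp][camera_id] = image
    # Keep only the time stamps for which every camera contributed a frame.
    return {
        timestamp: views
        for timestamp, views in sorted(by_time.items())
        if len(views) == num_cameras
    }


frames = [(1, 0, "a"), (2, 0, "b"), (3, 0, "c"), (1, 1, "d")]
multiview_sets = group_multiview(frames, num_cameras=3)
```

In this toy input, time stamp 1 has a frame from only one camera, so it is left out of the result.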
  • The detection unit 14 inputs each image 40n included in the multi-view images acquired by the acquisition unit 12 to the detection model 20, and detects a two-dimensional bounding box (hereinafter referred to as "2D-BB") 42n from each image 40n.
  • the detection model 20 is generated in advance using, as training data, images obtained by giving correct 2D-BB to images of gymnasts in various postures.
  • the detection model 20 is an example of a “machine learning model” of technology disclosed herein. For example, as shown in FIG. 3, the detection unit 14 detects the circumscribing rectangle of the area showing the gymnast 90 in the image 40n as a 2D-BB42n.
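To make the "circumscribing rectangle" and the upper-left-corner convention concrete, the following sketch computes such a rectangle from a binary person mask. The mask input is purely an assumption for illustration; in the source the detection model 20 outputs the 2D-BB directly, and the function name is hypothetical.

```python
import numpy as np


def circumscribing_rectangle(mask):
    """Return [x, y, w, h] of the tightest axis-aligned box around nonzero pixels.

    (x, y) is the upper-left corner in image coordinates. Returns None when
    the mask is empty, which corresponds to an undetected person.
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    x0, y0 = int(xs.min()), int(ys.min())
    w = int(xs.max()) - x0 + 1
    h = int(ys.max()) - y0 + 1
    return [x0, y0, w, h]


mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:5, 3:8] = 1  # rows 2..4, columns 3..7
bbox = circumscribing_rectangle(mask)
```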
  • The interpolation unit 16 interpolates the 2D-BB 42n in a second image among the multi-view images, based on the 2D-BB 42n detected from a first image among the multi-view images and the parameters of each camera 30n.
  • In the following, it is assumed that a 2D-BB 42k is detected from the image 40k taken by the camera 30k at viewpoint k, and that a 2D-BB 42_miss is not detected from the image 40_miss taken by the camera 30_miss at viewpoint miss.
  • The image 40k is an example of the first image, and the image 40_miss is an example of the second image.
  • When the detection model 20, which is a machine learning model, is used, the 2D-BB 42n may go undetected or be erroneously detected in an image 40n. This can occur when part of the gymnast 90 in the image 40n is occluded by an obstacle, or when the posture of the target gymnast 90 is not similar to that of any gymnast in the images used for training the detection model 20.
  • Consider obtaining three-dimensional information about the gymnast 90, such as skeleton information, using a recognition model generated in advance by machine learning.
  • A three-dimensional bounding box (hereinafter referred to as "3D-BB") 44 is the smallest rectangular parallelepiped, made up of horizontal and vertical faces, that completely surrounds the gymnast 90 in three-dimensional space.
  • a 2D-BB 42n is obtained by projecting the 3D-BB 44 onto each image 40n at the viewpoint n of each camera 30n.
  • The height (length in the vertical direction) of the 2D-BB 42n in each image 40n is common regardless of the viewpoint of each camera 30n.
  • the heights of the 2D-BB42n and the 3D-BB44 are represented by lines connecting the asterisks.
  • the width (horizontal length) of the 2D-BB 42n in each image 40n differs depending on the viewpoint of each camera 30n.
  • The width of the 2D-BB 421 detected from the image 401 and the width of the 2D-BB 422 detected from the image 402 differ because of the difference in viewpoint. Therefore, if, for example, the 2D-BB 423 is not detected in the image 403, its width cannot be properly determined from the widths detected in the other images, and the 2D-BB 423 in the image 403 cannot be interpolated from them directly.
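The viewpoint dependence of the width can be illustrated with a toy setup: an axis-aligned 3D box is projected into two pinhole cameras placed at the same height and distance but 90 degrees apart. Everything here (the box size, the focal length, the `look_at_projection` helper, world Z pointing up) is an assumption chosen for illustration, not a configuration from the source.

```python
import itertools

import numpy as np


def look_at_projection(cam_pos, target, f=1000.0, cx=640.0, cy=360.0):
    """3x4 projection matrix of a pinhole camera at cam_pos looking at target."""
    cam_pos = np.asarray(cam_pos, dtype=float)
    forward = np.asarray(target, dtype=float) - cam_pos
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, [0.0, 0.0, 1.0])  # world Z is up
    right /= np.linalg.norm(right)
    down = np.cross(forward, right)
    rotation = np.stack([right, down, forward])  # camera x, y, z axes as rows
    translation = -rotation @ cam_pos
    intrinsics = np.array([[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]])
    return intrinsics @ np.hstack([rotation, translation[:, None]])


def project_box(proj_mat, box_min, box_max):
    """Project the 8 corners of a 3D box and return the enclosing [x, y, w, h]."""
    corners = np.array(list(itertools.product(*zip(box_min, box_max))))
    homogeneous = np.hstack([corners, np.ones((8, 1))])
    projected = (proj_mat @ homogeneous.T).T
    pixels = projected[:, :2] / projected[:, 2:3]  # divide by the scale term
    x0, y0 = pixels.min(axis=0)
    x1, y1 = pixels.max(axis=0)
    return [x0, y0, x1 - x0, y1 - y0]


# A box 0.4 m by 1.2 m in plan and 1.6 m tall, seen from two sides.
box_min, box_max = (-0.2, -0.6, 0.0), (0.2, 0.6, 1.6)
target = (0.0, 0.0, 0.8)
view_1 = project_box(look_at_projection((10.0, 0.0, 0.8), target), box_min, box_max)
view_2 = project_box(look_at_projection((0.0, 10.0, 0.8), target), box_min, box_max)
```

In this setup the two projected widths differ by a large factor, while the projected heights stay close to each other (close rather than identical, because of perspective), which is why the height is the quantity carried across views.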
  • The interpolation unit 16 determines the height of the 2D-BB 42_miss in the image 40_miss, in which no 2D-BB was detected, based on the height of the 2D-BB 42k in the image 40k, in which a 2D-BB was detected, and the parameters of the camera 30k. Then, the interpolation unit 16 estimates the width of the 2D-BB 42_miss in the image 40_miss based on statistical information regarding the posture of the gymnast 90 and the parameters of the camera 30_miss.
  • A specific description is given below with reference to FIGS. 5 and 6.
  • The 2D-BB 42k is specified by $[x_k, y_k, w_k, h_k]$, where $x_k$ and $y_k$ are the coordinates in the image 40k of the upper left corner point of the 2D-BB 42k, and $w_k$ and $h_k$ are the width and height of the 2D-BB 42k, respectively.
  • The interpolation unit 16 identifies the coordinates $[x_k + w_k/2, y_k]$ of the upper end point and the coordinates $[x_k + w_k/2, y_k + h_k]$ of the lower end point of the vertical center line of the 2D-BB 42k. Then, the interpolation unit 16 converts the coordinates of the upper end point and the lower end point into three-dimensional coordinates, using the parameters of the camera 30k that map three-dimensional coordinates to coordinates on the image plane captured by the camera 30k.
  • The interpolation unit 16 may convert the coordinates using cv::sfm::triangulatePoints defined in OpenCV (https://docs.opencv.org/3.4/d0/dbd/group__triangulation.html).
  • The interpolation unit 16 calculates the three-dimensional coordinates of the points $P3d_{top}$ and $P3d_{bot}$ in three-dimensional space corresponding to the upper end point and the lower end point, using the parameter matrix $ProjMat_{cam\_k}$ of the camera 30k, as shown in the following formulas (1) and (2).
  • The line connecting $P3d_{top}$ and $P3d_{bot}$ is called the human center line.
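The end points of the 2D center line and one possible 2D-to-3D conversion can be sketched as follows. Formulas (1) and (2) are not reproduced in this text, so the conversion shown is a generic linear (DLT) triangulation, not the source's formula; it assumes that the corresponding end point is available in at least two views with detections and that 3x4 projection matrices are known. Function names are hypothetical.

```python
import numpy as np


def center_line_endpoints(bbox):
    """Upper and lower end points of the vertical center line of [x, y, w, h]."""
    x, y, w, h = bbox
    center_x = x + w / 2
    return (center_x, y), (center_x, y + h)


def triangulate(points_2d, proj_mats):
    """Linear (DLT) triangulation of one 3D point from two or more views.

    points_2d: list of (u, v) pixel coordinates, one per view.
    proj_mats: list of 3x4 projection matrices, in the same order.
    """
    rows = []
    for (u, v), proj_mat in zip(points_2d, proj_mats):
        rows.append(u * proj_mat[2] - proj_mat[0])
        rows.append(v * proj_mat[2] - proj_mat[1])
    # The homogeneous solution is the right singular vector belonging to
    # the smallest singular value of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    homogeneous = vt[-1]
    return homogeneous[:3] / homogeneous[3]


top, bottom = center_line_endpoints([100, 50, 40, 200])

# Two toy cameras: identity pose, and the same camera shifted along x.
proj_1 = np.hstack([np.eye(3), np.zeros((3, 1))])
proj_2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
point_3d = triangulate([(0.125, 0.05), (-0.125, 0.05)], [proj_1, proj_2])
```

Applying `triangulate` once to the upper end points and once to the lower end points would give candidate positions for the two ends of the human center line, under the assumptions above.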
  • Next, the interpolation unit 16 converts the three-dimensional coordinates of $P3d_{top}$ and $P3d_{bot}$ into coordinates on the image 40_miss based on the parameters of the camera 30_miss, and thereby calculates the coordinates of the upper end point and the lower end point of the vertical center line of the 2D-BB 42_miss. For example, the interpolation unit 16 performs the coordinate transformation shown in the following formulas (3) and (4), using the parameter matrix $ProjMat_{cam\_miss}$ of the camera 30_miss and a parameter $s$ representing the scale ratio between the three-dimensional coordinates and the size of the image 40.
  • Based on this coordinate transformation, the interpolation unit 16 calculates $y_{miss}$, $h_{miss}$, and $x_{miss} + w_{miss}/2$, which specify the coordinates of the upper end point and the lower end point of the vertical center line of the 2D-BB 42_miss, as shown in formulas (5) to (7).
  • $y_{miss} = (s \times y_{miss}) / s$ (5)
  • $h_{miss} = (s \times (y_{miss} + h_{miss})) / s - y_{miss}$ (6)
  • $x_{miss} + w_{miss}/2 = (s \times (x_{miss} + w_{miss}/2)) / s$ (7)
  • the interpolation unit 16 identifies the line connecting the identified upper endpoint and lower endpoint as the vertical centerline of 2D-BB42_miss, and identifies the length of the centerline as the height of 2D-BB42_miss.
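The projection into the view with the missing detection can be sketched as below. It assumes a 3x4 projection matrix for the camera 30_miss and treats the third homogeneous component as the scale $s$ that formulas (5) to (7) divide out. Taking the center x coordinate from the upper end point is a choice made here for illustration; the function names are hypothetical.

```python
import numpy as np


def project_point(proj_mat, point_3d):
    """Project a 3D point with a 3x4 matrix and divide by the scale s."""
    scaled_u, scaled_v, s = proj_mat @ np.append(point_3d, 1.0)
    return scaled_u / s, scaled_v / s


def center_line_in_missing_view(proj_mat_miss, p3d_top, p3d_bot):
    """Return (center_x, y_miss, h_miss) of the projected center line."""
    top_u, top_v = project_point(proj_mat_miss, p3d_top)
    _, bot_v = project_point(proj_mat_miss, p3d_bot)
    y_miss = top_v              # counterpart of formula (5)
    h_miss = bot_v - top_v      # counterpart of formula (6)
    center_x = top_u            # counterpart of formula (7): x_miss + w_miss / 2
    return center_x, y_miss, h_miss


# Toy camera: focal length 1000 px, principal point (640, 360), identity pose.
intrinsics = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
proj_mat_miss = intrinsics @ np.hstack([np.eye(3), np.zeros((3, 1))])
center_x, y_miss, h_miss = center_line_in_missing_view(
    proj_mat_miss, p3d_top=[0.0, -0.8, 4.0], p3d_bot=[0.0, 0.8, 4.0]
)
```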
  • The interpolation unit 16 also estimates the width of the 2D-BB 42_miss based on the identified height of the 2D-BB 42_miss and the statistical information regarding the posture of the gymnast 90.
  • The statistical information may be, for example, the mean, over different poses of the gymnast, of the sum of the height and the width of the 3D-BB 44 surrounding the gymnast in each pose.
  • Let the height of the 3D-BB 44 be Height_3D, the larger of its two widths be Width_max_3D, and the smaller be Width_min_3D.
  • A full-size three-dimensional model of a gymnast is prepared for each of a plurality of postures (poses) obtained from motion capture, manual annotation, published data, or the like.
  • Here, three-dimensional models for M poses are prepared.
  • For each three-dimensional model, the 3D-BB 44 is identified, and Height_3D, Width_max_3D, and Width_min_3D are calculated. The mean Mean_3D shown in the following formula (8) is then calculated as the statistical information.
  • $\mathrm{Mean\_3D} = \frac{1}{M} \sum_{M} \left( \frac{\mathrm{Width\_max\_3D} + \mathrm{Width\_min\_3D}}{2} + \mathrm{Height\_3D} \right)$ (8)
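Formula (8) can be evaluated with a few lines of code. The sketch below assumes each pose model is given as an array of 3D points (vertices or joints) with Z up, and approximates the 3D-BB by an axis-aligned box, which is a simplification of the "smallest rectangular parallelepiped" described in the text. The function names are hypothetical.

```python
import numpy as np


def box_dimensions(points):
    """Return (width_max, width_min, height) of the axis-aligned box of points."""
    extent = points.max(axis=0) - points.min(axis=0)
    width_max = max(extent[0], extent[1])
    width_min = min(extent[0], extent[1])
    return width_max, width_min, extent[2]


def mean_3d(pose_models):
    """Mean over M poses of (Width_max_3D + Width_min_3D) / 2 + Height_3D."""
    total = 0.0
    for points in pose_models:
        width_max, width_min, height = box_dimensions(np.asarray(points, dtype=float))
        total += (width_max + width_min) / 2 + height
    return total / len(pose_models)


# Two toy "poses", each reduced to the two opposite corners of its box.
standing = [(0.0, 0.0, 0.0), (0.5, 0.3, 1.6)]
split = [(0.0, 0.0, 0.0), (0.4, 1.0, 1.0)]
statistic = mean_3d([standing, split])
```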
  • $w_{miss} = (\ldots) \times h_{miss}$ (10)
  • $x_{miss} = (x_{miss} + w_{miss}/2) - w_{miss}/2$ (11)
  • The interpolation unit 16 interpolates, in the image 40_miss, the 2D-BB 42_miss specified by $[x_{miss}, y_{miss}, w_{miss}, h_{miss}]$ calculated by formulas (5), (6), (10), and (11). Then, the interpolation unit 16 puts together the interpolated 2D-BB 42_miss and the 2D-BB 42k and outputs them as multi-viewpoint images in which the 2D-BBs have been detected.
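Assembling the interpolated box from the projected center line is a small step that mirrors formula (11). In this sketch the estimated width is simply passed in, because the width formula is not fully recoverable from this text; the function name and the example numbers are hypothetical.

```python
def assemble_bbox(center_x, y_miss, h_miss, w_miss):
    """Build [x_miss, y_miss, w_miss, h_miss] from the center line and a width.

    center_x is x_miss + w_miss / 2, the horizontal position of the center
    line; w_miss is an externally estimated width (assumed given here).
    """
    x_miss = center_x - w_miss / 2  # counterpart of formula (11)
    return [x_miss, y_miss, w_miss, h_miss]


interpolated = assemble_bbox(center_x=640.0, y_miss=160.0, h_miss=400.0, w_miss=100.0)
```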
  • the area detection device 10 may be realized by, for example, a computer 50 shown in FIG.
  • The computer 50 includes a CPU (Central Processing Unit) 51, a memory 52 as a temporary storage area, and a non-volatile storage unit 53.
  • The computer 50 also includes an input/output I/F (Interface) 54, to which external devices such as the cameras 30n, an input device, and a display device are connected, and an R/W (Read/Write) unit 55.
  • The computer 50 also has a communication I/F 56 connected to a network such as the Internet.
  • The CPU 51, memory 52, storage unit 53, input/output I/F 54, R/W unit 55, and communication I/F 56 are connected to each other via a bus 57.
  • The storage unit 53 may be implemented by an HDD (Hard Disk Drive), an SSD (Solid State Drive), flash memory, or the like.
  • An area detection program 60 for causing the computer 50 to function as the area detection device 10 is stored in the storage unit 53 as a storage medium.
  • The area detection program 60 has an acquisition process 62, a detection process 64, and an interpolation process 66.
  • the storage unit 53 also has an information storage area 70 in which information forming the detection model 20 is stored.
  • the CPU 51 reads out the area detection program 60 from the storage unit 53, develops it in the memory 52, and sequentially executes the processes of the area detection program 60.
  • the CPU 51 operates as the acquisition unit 12 shown in FIG. 2 by executing the acquisition process 62 . Further, the CPU 51 operates as the detection unit 14 shown in FIG. 2 by executing the detection process 64 . Also, the CPU 51 operates as the interpolation unit 16 shown in FIG. 2 by executing the interpolation process 66 .
  • The CPU 51 also reads information from the information storage area 70 and develops the detection model 20 in the memory 52. Thereby, the computer 50 executing the area detection program 60 functions as the area detection device 10. Note that the CPU 51 that executes the program is hardware.
  • the function realized by the area detection program 60 can also be realized by, for example, a semiconductor integrated circuit, more specifically a GPU (Graphics Processing Unit) or ASIC (Application Specific Integrated Circuit).
  • the area detection processing shown in FIG. 10 is executed in the area detection device 10.
  • Note that the area detection processing is an example of the area detection method of the technology disclosed herein.
  • In step S10, the acquisition unit 12 acquires the multi-viewpoint images input to the area detection device 10.
  • In step S12, the detection unit 14 inputs each image 40n included in the acquired multi-viewpoint images to the detection model 20 and detects the 2D-BB 42n from each of the images 40n.
  • In step S14, the detection unit 14 determines whether there is an image 40n in which the 2D-BB 42n was not detected among the images 40n included in the multi-viewpoint images. If there is such an image 40n, the process proceeds to step S16; otherwise, the process proceeds to step S24.
  • In step S16, the interpolation unit 16 identifies the coordinates $[x_k + w_k/2, y_k]$ of the upper end point and the coordinates $[x_k + w_k/2, y_k + h_k]$ of the lower end point of the vertical center line of the detected 2D-BB 42k.
  • In step S18, the interpolation unit 16 converts the coordinates of the upper end point and the lower end point into three-dimensional coordinates using the parameter matrix of the camera 30k (denoted as "OK camera" in FIG. 10) to identify the points $P3d_{top}$ and $P3d_{bot}$. Then, the interpolation unit 16 identifies the line connecting $P3d_{top}$ and $P3d_{bot}$ as the human center line.
  • In step S20, the interpolation unit 16 projects the three-dimensional coordinates of $P3d_{top}$ and $P3d_{bot}$ onto the image 40_miss (denoted as "miss image" in FIG. 10) based on the parameter matrix of the camera 30_miss (denoted as "miss camera" in FIG. 10). This identifies the vertical center line of the 2D-BB 42_miss, and the length of that center line is identified as the height of the 2D-BB 42_miss.
  • In step S22, the interpolation unit 16 estimates the width of the 2D-BB 42_miss based on the identified height of the 2D-BB 42_miss and the statistical information regarding the posture of the gymnast 90. The 2D-BB 42_miss specified by the vertical center line and height identified in step S20 and the width estimated in this step is then interpolated in the image 40_miss.
  • In step S24, the interpolation unit 16 outputs the multi-viewpoint images in which a 2D-BB 42n has been detected from each image 40n. The detected 2D-BBs 42n include the 2D-BB interpolated in step S22.
  • In step S26, the acquisition unit 12 determines whether the next multi-viewpoint images have been input to the area detection device 10. If they have, the process returns to step S10; if not, the area detection processing ends.
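The control flow of steps S12 to S24 for one set of multi-viewpoint images can be sketched as follows. The `detect` and `interpolate` callables are hypothetical stand-ins for the detection model 20 and the interpolation unit 16, and using the first view with a detection as the reference is an assumption made for this illustration.

```python
def detect_regions(images, detect, interpolate):
    """Run detection on every view and fill in views without a detection.

    images: {camera_id: image}
    detect(image) -> [x, y, w, h] or None                  (stand-in for S12)
    interpolate(cam_ok, bbox_ok, cam_miss) -> [x, y, w, h] (stand-in for S16-S22)
    """
    boxes = {cam: detect(image) for cam, image in images.items()}     # S12
    missing = [cam for cam, box in boxes.items() if box is None]      # S14
    detected = [cam for cam, box in boxes.items() if box is not None]
    if missing and detected:
        cam_ok = detected[0]  # reference view with a detection (assumed choice)
        for cam_miss in missing:
            boxes[cam_miss] = interpolate(cam_ok, boxes[cam_ok], cam_miss)
    return boxes                                                      # S24


# Toy stand-ins: camera 3 fails to detect, and the "interpolation" only
# carries the height over from the reference box.
images = {1: "a", 2: "b", 3: "c"}
result = detect_regions(
    images,
    detect=lambda image: None if image == "c" else [0, 0, 10, 20],
    interpolate=lambda cam_ok, bbox_ok, cam_miss: [1, 1, 5, bbox_ok[3]],
)
```

If no view has a detection, there is nothing to interpolate from, and the sketch returns the boxes unchanged.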
  • FIG. 11 shows an example of interpolating 2D-BB that was not detected in a multi-view image containing four images.
  • Frame: 852 is a frame number associated with each image, and corresponds to the time information in the above embodiment.
  • By applying this embodiment, the 2D-BB missing from cam_id:3 in the multi-view images of frame number 852 is interpolated as shown in the right diagram of FIG. 11.
  • the area detection device acquires a multi-viewpoint image, which is a set of images captured by each of a plurality of cameras that capture images of a person from different directions.
  • The area detection device inputs each of the images included in the acquired multi-view images to a detection model, generated in advance by machine learning to detect the bounding box indicating the area of a person included in an image, and detects a bounding box from each of the images. Then, the area detection device interpolates a bounding box in a second image among the acquired images, based on the bounding box detected from a first image among the acquired images and the parameters of each of the plurality of cameras.
  • the region detector projects the detected 2D-BB from 2D to 3D using the camera's intrinsic and extrinsic parameters to identify the vertical human centerline in 3D space.
  • The area detection device estimates the width of the 3D-BB based on the height of the 3D-BB, given by the length of the human center line, and on statistical information indicating the average height and width of the 3D-BB, calculated in advance from three-dimensional models of gymnasts in various postures.
  • The area detection device projects the 3D-BB, specified by the human center line and the height and width of the 3D-BB, from three dimensions to two dimensions using the camera's intrinsic and extrinsic parameters, thereby interpolating the 2D-BB in the image.
  • undetected bounding boxes in multi-viewpoint images can be interpolated appropriately.
  • the 2D-BB-detected multi-viewpoint images output from the region detection device according to the present embodiment are used, for example, for learning-type skeleton recognition of gymnasts, as shown in FIG.
  • In learning-type skeleton recognition, a skeleton recognition model is generated in advance by machine learning, using training data that pairs multi-view images with the correct three-dimensional coordinates of each joint of the gymnast shown in them (3D joint coordinates).
  • The skeleton recognition model is, for example, a neural network.
  • At recognition time, multi-view images are input to the machine-learned skeleton recognition model, which outputs 3D joint coordinates. The 3D joint coordinates output from the skeleton recognition model are used as the primary skeleton recognition result, and the 3D joint coordinates obtained by searching for each joint position under constraints such as the lengths and positional relationships between joints are output as the fitting result.
  • The above embodiment describes cameras arranged in substantially the same horizontal plane, but the present invention is not limited to this.
  • In some cases, multi-viewpoint images captured by a plurality of cameras arranged in substantially the same vertical plane may provide more accurate recognition results for skeleton recognition or the like.
  • In this case, the width of the 3D-BB is identified based on the width of the 2D-BB detected from the first image and the parameters of the camera that captured the first image, and the height of the 3D-BB can be estimated based on the width of the 3D-BB and the statistical information about the three-dimensional models of the gymnast.
  • The above embodiment describes interpolating a 2D-BB that was not detected, but the present invention is not limited to this.
  • For example, each image included in the multi-view images may be set in turn as the first image and the other images as the second images, and the 2D-BB detected in each second image may be corrected based on the 2D-BB interpolated in the same manner as in the above embodiment.
  • When the detection model outputs a detection reliability together with the detected bounding box, a detection whose reliability is equal to or less than a predetermined value may be handled in the same manner as the case in the above embodiment where the 2D-BB is not detected.
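This reliability check can be sketched as a small filter applied before the interpolation step. The `(bbox, score)` pair layout and the threshold value are assumptions for illustration; the source only states that detections at or below a predetermined value are treated as not detected.

```python
def apply_confidence_threshold(detections, threshold):
    """Treat low-reliability detections as undetected.

    detections: {camera_id: (bbox, score) or None} (assumed layout).
    Returns {camera_id: bbox or None}; a score equal to or less than the
    threshold is handled the same way as a missing detection.
    """
    result = {}
    for camera_id, detection in detections.items():
        if detection is None or detection[1] <= threshold:
            result[camera_id] = None
        else:
            result[camera_id] = detection[0]
    return result


filtered = apply_confidence_threshold(
    {1: ([0, 0, 10, 20], 0.9), 2: ([5, 5, 8, 16], 0.3), 3: None},
    threshold=0.5,
)
```

Views mapped to `None` here would then go through the same interpolation as views where nothing was detected.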
  • The above embodiment describes the area detection program stored in the storage unit, but the present invention is not limited to this.
  • The program according to the technology disclosed herein can also be provided in a form stored in a storage medium such as a CD-ROM, DVD-ROM, or USB memory.
  • Reference signs: 10 area detection device; 12 acquisition unit; 14 detection unit; 16 interpolation unit; 20 detection model; 301, 302, 303 cameras; 401, 402, 403 images; 421, 422 2D-BB; 50 computer; 51 CPU; 52 memory; 53 storage unit; 59 storage medium; 60 area detection program

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Geometry (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
PCT/JP2021/037958 2021-10-13 2021-10-13 Area detection program, device, and method Ceased WO2023062760A1 (ja)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN202180102809.3A CN118043856A (zh) 2021-10-13 2021-10-13 Area detection program, device, and method
PCT/JP2021/037958 WO2023062760A1 (ja) 2021-10-13 2021-10-13 Area detection program, device, and method
EP21960618.3A EP4418203A4 (en) 2021-10-13 2021-10-13 REGION DETECTION PROGRAM, DEVICE AND METHOD
JP2023553831A JP7639931B2 (ja) 2021-10-13 2021-10-13 Area detection program, device, and method
US18/603,752 US20240242464A1 (en) 2021-10-13 2024-03-13 Computer-readable recording medium storing region detection program, apparatus, and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/037958 WO2023062760A1 (ja) 2021-10-13 2021-10-13 Area detection program, device, and method

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/603,752 Continuation US20240242464A1 (en) 2021-10-13 2024-03-13 Computer-readable recording medium storing region detection program, apparatus, and method

Publications (1)

Publication Number Publication Date
WO2023062760A1 (ja) 2023-04-20

Family

ID=85987320

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/037958 Ceased WO2023062760A1 (ja) 2021-10-13 2021-10-13 Area detection program, device, and method

Country Status (5)

Country Link
US (1) US20240242464A1
EP (1) EP4418203A4
JP (1) JP7639931B2
CN (1) CN118043856A
WO (1) WO2023062760A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
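CN120558098A (zh) * 2025-05-30 2025-08-29 Inner Mongolia Normal University Sheep body size measurement method, device, medium, and program product based on three-dimensional reconstruction and point cloud segmentation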

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
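JP2002290962A (ja) * 2001-03-27 2002-10-04 Mitsubishi Electric Corp Automatic intruder tracking method and apparatus, and image processing apparatus
JP2009143722A (ja) * 2007-12-18 2009-07-02 Mitsubishi Electric Corp Person tracking device, person tracking method, and person tracking program
JP2021071749A (ja) 2019-10-29 2021-05-06 KDDI Corporation 3D model generation device and method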

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9692964B2 (en) * 2003-06-26 2017-06-27 Fotonation Limited Modification of post-viewing parameters for digital images using image region or feature information
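JP2009025874A (ja) 2007-07-17 2009-02-05 Nec Corp Face image registration device, face identification device, face image registration method, face identification method, and face image registration program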
US9877012B2 (en) * 2015-04-01 2018-01-23 Canon Kabushiki Kaisha Image processing apparatus for estimating three-dimensional position of object and method therefor
US10885396B2 (en) * 2017-05-24 2021-01-05 Amazon Technologies, Inc. Generating composite images using audio/video recording and communication devices
US11113887B2 (en) * 2018-01-08 2021-09-07 Verizon Patent And Licensing Inc Generating three-dimensional content from two-dimensional images
US20190356885A1 (en) * 2018-05-16 2019-11-21 360Ai Solutions Llc Camera System Securable Within a Motor Vehicle
WO2020145255A1 (ja) * 2019-01-11 2020-07-16 NEC Corporation Monitoring device, monitoring method, and recording medium
EP4064206B1 (en) * 2019-11-20 2026-02-25 Panasonic Intellectual Property Management Co., Ltd. Three-dimensional model generation method and three-dimensional model generation device
JP2021152724A (ja) 2020-03-24 2021-09-30 Canon Inc. Information processing device, information processing method, and program

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HIDEO SAITO, MAKOTO KIMURA, SATOSHI YAGUCHI, NAHO INAMOTO, VIEW INTERPOLATION OF MULTIPLE CAMERAS BASED ON PROJECTIVE GEOMETRY, 2002
See also references of EP4418203A4

Also Published As

Publication number Publication date
US20240242464A1 (en) 2024-07-18
JP7639931B2 (ja) 2025-03-05
EP4418203A1 (en) 2024-08-21
EP4418203A4 (en) 2024-11-27
CN118043856A (zh) 2024-05-14
JPWO2023062760A1 2023-04-20

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21960618

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023553831

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 202180102809.3

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 2021960618

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2021960618

Country of ref document: EP

Effective date: 20240513