WO2019188392A1 - Information processing device, information processing method, program, and moving body - Google Patents

Information processing device, information processing method, program, and moving body

Info

Publication number
WO2019188392A1
WO2019188392A1 (PCT/JP2019/010761)
Authority
WO
WIPO (PCT)
Prior art keywords
image
area
region
estimation
unit
Prior art date
Application number
PCT/JP2019/010761
Other languages
English (en)
Japanese (ja)
Inventor
真一郎 阿部
Original Assignee
ソニー株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ソニー株式会社
Publication of WO2019188392A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods

Definitions

  • The present technology relates to an information processing device, an information processing method, a program, and a moving body, and in particular to an information processing device, an information processing method, a program, and a moving body suitable for use when a detection result of an area for each object in an image is used.
  • For example, Non-Patent Document 1 discloses a technique for improving the accuracy of self-position estimation (SLAM: Simultaneous Localization and Mapping) of a vehicle by detecting moving objects around the vehicle and removing the influence of the detected moving objects.
  • For example, semantic segmentation is used to detect moving-object regions in an image of the surroundings of a moving body.
  • However, semantic segmentation involves a large processing load and a long processing time. Therefore, in processing that requires real-time performance, such as automated driving, the semantic segmentation result may be delayed and adversely affect other processing.
  • the present technology has been made in view of such a situation, and makes it possible to quickly recognize the position of a region for each object in an image.
  • An information processing apparatus according to a first aspect of the present technology includes: an area detection unit that detects an object area, which is an area for each object, in a plurality of images; a motion estimation unit that performs motion estimation between images; and an area estimation unit that estimates the position of the object area in a second image based on the detection result of the object area in a first image and the result of motion estimation between the first image and the second image, the second image being later than the first image.
  • In an information processing method according to the first aspect of the present technology, an information processing apparatus detects an object area, which is an area for each object, in a plurality of images, performs motion estimation between images, and estimates the position of the object area in a second image based on the detection result of the object area in a first image and the result of motion estimation between the first image and the second image, the second image being later than the first image.
  • A program according to the first aspect of the present technology causes a computer to execute a process of detecting an object area, which is an area for each object, in a plurality of images, performing motion estimation between images, and estimating the position of the object area in a second image based on the detection result of the object area in a first image and the result of motion estimation between the first image and the second image, the second image being later than the first image.
  • A moving body according to a second aspect of the present technology includes: an imaging unit that photographs the surroundings; an area detection unit that detects an object area, which is an area for each object, in a plurality of images among the images captured by the imaging unit; a motion estimation unit that performs motion estimation between images; an area estimation unit that estimates the position of the object area in a second image based on the detection result of the object area in a first image and the result of motion estimation between the first image and the second image, the second image being later than the first image; and an execution unit that executes predetermined processing based on the estimation result of the position of the object area.
  • In the first aspect of the present technology, an object area, which is an area for each object, is detected in a plurality of images, motion estimation between images is performed, and the position of the object area in a second image is estimated based on the detection result of the object area in a first image and the result of motion estimation between the first image and the second image, the second image being later than the first image.
  • In the second aspect of the present technology, the surroundings are photographed, an object area, which is an area for each object, is detected in a plurality of the photographed images, motion estimation between images is performed, the position of the object area in a second image is estimated based on the detection result of the object area in a first image and the result of motion estimation between the first image and the second image, the second image being later than the first image, and predetermined processing is executed based on the estimation result of the position of the object area.
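  • As an illustrative aid to the configuration just summarized, the following is a minimal Python sketch of the three units and their data flow; the class names (RegionDetector, MotionEstimator, RegionEstimator) and the nearest-neighbour warp used in the area estimation unit are assumptions of this sketch, not features recited by the present technology.

```python
import numpy as np

class RegionDetector:
    """Area detection unit: returns a per-pixel label map (the object areas)."""
    def detect(self, image: np.ndarray) -> np.ndarray:
        raise NotImplementedError  # e.g. semantic segmentation

class MotionEstimator:
    """Motion estimation unit: returns a dense pixel flow between two images."""
    def estimate(self, image_a: np.ndarray, image_b: np.ndarray) -> np.ndarray:
        raise NotImplementedError  # e.g. optical flow

class RegionEstimator:
    """Area estimation unit: shifts the object areas detected in the first image
    by the estimated motion to predict their positions in the second image."""
    def estimate(self, labels_first: np.ndarray, flow: np.ndarray) -> np.ndarray:
        h, w = labels_first.shape
        ys, xs = np.mgrid[0:h, 0:w]
        # Move every labelled pixel along its flow vector (nearest-neighbour warp).
        xs2 = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
        ys2 = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
        estimated = np.zeros_like(labels_first)
        estimated[ys2, xs2] = labels_first[ys, xs]
        return estimated
```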
  • According to the present technology, the position of the area for each object in an image can be quickly recognized.
  • In addition, processing using the recognized area for each object in the image can be performed quickly and appropriately.
  • FIG. 1 is a block diagram showing an embodiment of a vehicle to which the present technology is applied. FIG. 2 is a flowchart for explaining the self-position estimation process performed by the vehicle. FIG. 3 is a timing chart for explaining the self-position estimation process performed by the vehicle. FIG. 4 is a schematic diagram showing an example of a surrounding image. FIG. 5 is a schematic diagram showing an example of a region-divided image.
  • FIG. 1 shows a configuration example of a vehicle 10 according to an embodiment of the present technology.
  • the vehicle 10 includes a data acquisition unit 11, an information processing unit 12, and an operation control unit 13.
  • the data acquisition unit 11 acquires various data used for control of the vehicle 10.
  • the data acquisition unit 11 includes a photographing unit 21 and a sensor unit 22.
  • the photographing unit 21 includes a camera that photographs the surroundings of the vehicle 10.
  • the type of camera is not particularly limited, and any type of camera may be used depending on the application.
  • the photographing unit 21 includes one or more of a ToF (Time Of Flight) camera, a stereo camera, a monocular camera, an infrared camera, and the like.
  • the imaging unit 21 supplies an image of the surroundings of the vehicle 10 (hereinafter referred to as a surrounding image) to the information processing unit 12.
  • the sensor unit 22 includes a sensor that acquires various data used for controlling the vehicle 10 other than the image.
  • the sensor unit 22 includes a depth sensor, an inertial measurement device (IMU), an ultrasonic sensor, a radar, a LiDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging), a sonar, a GNSS (Global Navigation Satellite System) receiver, and the like.
  • the sensor unit 22 supplies the sensor data acquired by each sensor to the information processing unit 12.
  • the information processing unit 12 performs self-position estimation processing of the vehicle 10, recognition processing of objects around the vehicle 10, and the like.
  • The information processing unit 12 includes a self-position estimation unit 31, a depth estimation unit 32, a region detection unit 33, a storage unit 34, a motion estimation unit 35, a region estimation unit 36, a storage unit 37, a mask information generation unit 38, and an object recognition unit 39.
  • the storage unit 34 and the storage unit 37 can be combined into one.
  • the self-position estimation unit 31 performs a self-position estimation process of the vehicle 10 based on the surrounding image and the mask information generated by the mask information generation unit 38 and, if necessary, the sensor data.
  • The self-position estimation unit 31 supplies information indicating the estimation result of the self-position of the vehicle 10 to the motion estimation unit 35 and the operation control unit 13.
  • the depth estimation unit 32 performs a depth estimation process indicating a distance to an object around the vehicle 10 based on at least one of the surrounding image and the sensor data.
  • the depth estimation unit 32 supplies information indicating the depth estimation result to the motion estimation unit 35.
  • the region detection unit 33 detects a region for each object (hereinafter referred to as an object region) in the surrounding image, and generates an image indicating each object region (hereinafter referred to as a region divided image). In addition, the area detection unit 33 generates an area management table indicating information regarding each object area in the area divided image. The area detection unit 33 stores the area division image and the area management table in the storage unit 34.
  • the object includes all kinds of objects.
  • the object includes not only objects that can exist around the vehicle 10 such as other vehicles, pedestrians, obstacles, and road surfaces, but also objects that can be included in surrounding images such as the sky.
  • the motion estimation unit 35 performs motion estimation between surrounding images of different frames based on surrounding images, self-position estimation results, depth estimation results, an area management table, and an estimated area management table described later. More specifically, the motion estimation unit 35 performs a pixel flow estimation process indicating pixel motion between surrounding images of different frames. The motion estimation unit 35 supplies information indicating the result of motion estimation (pixel flow estimation result) to the region estimation unit 36.
  • The region estimation unit 36 estimates the object regions in the surrounding image based on the region management table or the estimated region management table and the estimated pixel flow, and generates an image indicating each estimated object region (hereinafter referred to as an estimated region-divided image).
  • the area estimation unit 36 generates an estimated area management table indicating information on each object area in the estimated area divided image.
  • the region estimation unit 36 causes the storage unit 37 to store the estimated region divided image and the estimated region management table.
  • The mask information generation unit 38 is an execution unit that, based on the estimated region management table, generates mask information indicating which regions of the surrounding image are used and which are not used in subsequent processing.
  • The mask information generation unit 38 supplies the generated mask information to the data acquisition unit 11, the self-position estimation unit 31, and the object recognition unit 39.
  • the object recognition unit 39 performs recognition processing of objects around the vehicle 10 based on the surrounding image, the mask information, and, if necessary, sensor data.
  • the object recognition unit 39 supplies information indicating the recognition result of objects around the vehicle 10 to the operation control unit 13.
  • the operation control unit 13 controls the operation of the vehicle 10 based on the estimation result of the self-position of the vehicle 10, the recognition result of objects around the vehicle 10, and the like. For example, the operation control unit 13 controls acceleration, deceleration, stopping, steering, automatic driving, and the like of the vehicle 10.
  • time t0 to time t11 indicate timings when surrounding images are taken.
  • Periods T0 to T11 indicate periods between shooting timings of surrounding images of adjacent frames.
  • the period T0 is a period between time t0 and time t1
  • the period T1 is a period between time t1 and time t2.
  • Hereinafter, the surrounding images photographed at time t0 to time t11 are referred to as surrounding image P(t0) to surrounding image P(t11), respectively.
  • This process is started, for example, when an operation for starting the vehicle 10 and starting driving is performed, for example, when an ignition switch, a power switch, a start switch, or the like of the vehicle 10 is turned on. This process ends, for example, when an operation for ending driving is performed.
  • In step S1, the photographing unit 21 acquires a surrounding image. That is, the photographing unit 21 photographs the surroundings of the vehicle 10 and supplies the resulting surrounding image to the information processing unit 12.
  • In step S2, the depth estimation unit 32 performs a depth estimation process. That is, the depth estimation unit 32 performs a process of estimating the depth, which is the distance to objects around the vehicle 10, based on at least one of the surrounding image and the sensor data.
  • An arbitrary method can be used for the depth estimation process.
  • the depth estimation unit 32 performs depth estimation processing by stereo matching.
  • the sensor unit 22 includes a depth sensor such as LiDAR, the depth estimation unit 32 performs a depth estimation process based on sensor data from the depth sensor.
  • the depth estimation unit 32 generates, for example, 3D information (for example, 3D point cloud) indicating the depth estimation result, and supplies the generated 3D information to the motion estimation unit 35.
  • In step S3, the self-position estimation unit 31 performs self-position estimation processing. That is, the self-position estimation unit 31 estimates the position and orientation of the vehicle 10.
  • the self-position estimation unit 31 performs self-position estimation by performing SLAM based on the surrounding image and sensor data of a depth sensor such as LiDAR.
  • the self-position estimation unit 31 may use, for example, a GNSS signal received by the GNSS receiver, sensor data detected by the IMU, sensor data detected by the radar, and the like.
  • the self-position estimation unit 31 estimates the absolute position and the absolute posture of the vehicle 10 in the world coordinate system by the first self-position estimation process. Thereafter, the self-position estimation unit 31 estimates the amount of change from the position and posture of the vehicle 10 estimated by the previous self-position estimation process, for example, by the second and subsequent self-position estimation processes. Then, the self-position estimation unit 31 estimates the absolute position and absolute posture of the vehicle 10 based on the estimated change amount of the position and posture of the vehicle 10.
  • the self-position estimation unit 31 supplies information indicating the estimation result of the self-position of the vehicle 10 to the motion estimation unit 35.
  • In step S4, the region detection unit 33 determines whether it is time to perform the region detection process. If it is determined that it is time to perform the region detection process, the process proceeds to step S5.
  • the region detection processing has a larger processing load and a longer required time than the depth estimation processing, self-position estimation processing, and pixel flow estimation processing described later. For this reason, the execution frequency of the region detection process is set lower than the execution frequencies of the depth estimation process, the self-position estimation process, and the pixel flow estimation process.
  • the depth estimation process, the self-position estimation process, and the pixel flow estimation process are executed for each frame of the surrounding image in synchronization with the shooting of the surrounding image.
  • For example, the region detection process is executed every five frames of the surrounding image, as shown in FIG. 3. Specifically, the region detection process is performed on the surrounding images photographed at time t0, time t5, and time t10. Therefore, it is determined that it is time to perform the region detection process in the period T0, the period T5, and the period T10, after the surrounding images photographed at time t0, time t5, and time t10 are acquired.
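  • This interleaving of a heavy, infrequent detection step with light per-frame estimation can be sketched as follows, reusing the hypothetical classes from the earlier sketch; the synchronous call to detect() is a simplification, since in the embodiment the detection runs over several frame periods.

```python
DETECTION_INTERVAL = 5  # run the heavy region detection once every five frames

def process_stream(frames, detector, motion_estimator, region_estimator):
    """Illustrative frame loop: per-frame pixel-flow estimation interleaved with
    region detection launched only every DETECTION_INTERVAL frames."""
    latest_labels = None
    previous_frame = None
    for index, frame in enumerate(frames):
        if previous_frame is not None:
            flow = motion_estimator.estimate(previous_frame, frame)   # every frame
            if latest_labels is not None:
                # Carry the latest (detected or estimated) regions forward.
                latest_labels = region_estimator.estimate(latest_labels, flow)
        if index % DETECTION_INTERVAL == 0:
            latest_labels = detector.detect(frame)
        previous_frame = frame
        yield latest_labels
```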
  • In step S5, the region detection unit 33 performs the region detection process.
  • For example, the region detection unit 33 performs semantic segmentation on the surrounding image using a region-divided-image generator obtained in advance by learning processing using a CNN (Convolutional Neural Network) or the like.
  • each pixel in the surrounding image is labeled to indicate the type of object to which each pixel belongs.
  • an object area corresponding to each object in the surrounding image is detected, and the surrounding image is divided into one or more object areas.
  • the region detection unit 33 generates an image (hereinafter referred to as a region divided image) indicating the label of each pixel of the surrounding image.
  • the label of each pixel indicates the position of each object region in the region divided image.
  • the region divided image becomes an image divided by each object region.
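  • As one possible concrete realization (an assumption of this sketch, since the embodiment only requires some learned generator of region-divided images), a pre-trained semantic segmentation model can be used to produce the per-pixel label map:

```python
import torch
import torchvision

# A pre-trained DeepLabV3 model stands in for the region-divided-image generator
# obtained by prior learning; its label set differs from the one in the embodiment.
model = torchvision.models.segmentation.deeplabv3_resnet50(weights="DEFAULT").eval()

def segment(image_tensor: torch.Tensor) -> torch.Tensor:
    """image_tensor: normalized (1, 3, H, W) float tensor of a surrounding image.
    Returns an (H, W) label map, i.e. a region-divided image."""
    with torch.no_grad():
        logits = model(image_tensor)["out"]    # (1, num_classes, H, W)
    return logits.argmax(dim=1)[0]             # per-pixel object label
```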
  • FIG. 4 schematically shows a specific example of the surrounding image.
  • the surrounding image P in FIG. 4 includes a vehicle 111, a road surface 112, trees 113-1 to 113-8, and the sky 114.
  • The region detection unit 33 performs semantic segmentation on the surrounding image P, thereby generating the region-divided image PS schematically shown in FIG. 5.
  • The region-divided image PS includes an object region 211, an object region 212, object regions 213-1 to 213-8, and an object region 214 corresponding to the vehicle 111, the road surface 112, the trees 113-1 to 113-8, and the sky 114 of the surrounding image P, respectively.
  • the coordinate system of the region-divided image PS is a coordinate system in which the pixel in the upper left corner is the origin, the horizontal direction is the u axis, and the vertical direction is the v axis.
  • the region detection unit 33 detects the contour of each object region in the region divided image.
  • the region detection unit 33 detects a contour image composed of pixels (hereinafter referred to as contour pixels) constituting the contour of the object region 213-7 in the frame A1.
  • the area detection unit 33 performs this process on all object areas in the area divided image PS.
  • As a result, the contour image 211A, the contour image 212A, the contour images 213A-1 to 213A-8, and the contour image 214A, respectively corresponding to the object region 211, the object region 212, the object regions 213-1 to 213-8, and the object region 214, are detected.
  • the area detection unit 33 generates an area management table indicating information on each object area.
  • FIG. 7 shows an example of the region management table generated based on the region-divided image PS of FIG. 5.
  • the area management table includes items of ID, label, moving object flag, and contour pixel.
  • ID indicates an identification number uniquely assigned to each object area.
  • the label indicates the type of object corresponding to each object area.
  • the moving object flag indicates whether or not the object corresponding to each object area is a moving object.
  • the moving object flag is set to 1 when the object is a moving object, and is set to 0 when the object is a stationary object.
  • the contour pixel indicates the coordinates of the contour pixel constituting the contour image of each object area.
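  • As a sketch only, the region management table described above could be held as the following data structure; the field names and the sample values are illustrative assumptions, not the format used by the embodiment.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class RegionEntry:
    """One row of the region management table (names are illustrative)."""
    region_id: int                    # ID uniquely assigned to the object region
    label: str                        # type of object ("vehicle", "road", "sky", ...)
    moving_object_flag: int           # 1 = moving object, 0 = stationary object
    contour_pixels: List[Tuple[int, int]] = field(default_factory=list)  # (u, v) coordinates

# Illustrative rows loosely modelled on the region-divided image PS
region_management_table = [
    RegionEntry(region_id=1, label="vehicle", moving_object_flag=1,
                contour_pixels=[(120, 80), (121, 80), (122, 81)]),
    RegionEntry(region_id=2, label="road", moving_object_flag=0,
                contour_pixels=[(0, 150), (1, 150), (2, 151)]),
]
```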
  • the area detection unit 33 stores the area division image and the area management table in the storage unit 34.
  • the area detection process is executed over a plurality of frame periods of the surrounding image. For example, the region detection process for the surrounding image P (t0) photographed at time t0 is performed over the period T0 to the period T4, and the detection result is acquired within the period T5.
  • On the other hand, if it is determined in step S4 that it is not time to perform the region detection process, the process of step S5 is skipped, and the process proceeds to step S6.
  • In step S6, the motion estimation unit 35 performs a process of estimating the pixel flow between frames.
  • For example, the motion estimation unit 35 estimates the pixel flow of all the pixels of the surrounding image one frame before. For example, in the periods T1 to T4 in FIG. 3, the region detection process has not yet been completed and the contour pixels of the surrounding image one frame before have not been detected or estimated, so the pixel flow of every pixel is estimated.
  • the motion estimation unit 35 estimates the pixel flow (for example, optical flow) of each pixel by image matching between the surrounding image one frame before and the current surrounding image.
  • For the image matching, any method can be used, such as SIFT (Scale-Invariant Feature Transform) feature matching, template matching, or the Lucas-Kanade tracker.
  • the motion estimation unit 35 supplies information indicating the estimation result of the pixel flow to the region estimation unit 36.
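  • A minimal sketch of this inter-frame pixel flow estimation, assuming OpenCV's Farneback dense optical flow as the concrete matching method (the embodiment leaves the choice of matching method open):

```python
import cv2

def estimate_pixel_flow(prev_image, curr_image):
    """Dense per-pixel flow between the surrounding image one frame before and
    the current surrounding image. Farneback optical flow is one concrete choice;
    any image-matching method (SIFT, template matching, the Lucas-Kanade tracker,
    and so on) could be substituted."""
    prev_gray = cv2.cvtColor(prev_image, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_image, cv2.COLOR_BGR2GRAY)
    # Returns an (H, W, 2) array of (du, dv) displacement vectors per pixel.
    return cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
```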
  • the motion estimation unit 35 estimates the pixel flow of each contour pixel. For example, after the period T5 in FIG. 3, since the contour pixels of the surrounding image one frame before are detected or estimated, the pixel flow of each contour pixel is estimated.
  • Specifically, the motion estimation unit 35 estimates the pixel flow of the contour pixels of the object regions of stationary objects based on the 3D information generated by the depth estimation unit 32 and the amount of change in the position and orientation of the vehicle 10, estimated by the self-position estimation unit 31, between the surrounding image of the previous frame and the surrounding image of the current frame.
  • A in FIG. 8 shows the contour images of the object regions of stationary objects among the object regions of the region-divided image PS of FIG. 5. That is, A in FIG. 8 shows the contour image 212A, the contour images 213A-1 to 213A-8, and the contour image 214A, respectively corresponding to the object region 212, the object regions 213-1 to 213-8, and the object region 214 of the stationary objects.
  • This requires less calculation than using image matching. In addition, the estimation accuracy is improved.
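  • The geometric estimation for stationary objects described above can be sketched as follows; the pinhole camera model, the intrinsic matrix K, and the relative pose (R, t) derived from the self-position change are assumptions of this sketch.

```python
import numpy as np

def stationary_contour_flow(contour_uv, depth, K, R, t):
    """Pixel flow of stationary-object contour pixels predicted from the depth
    estimate and the change (R, t) of the camera pose of the vehicle 10 between
    frames. contour_uv: (N, 2) pixel coordinates, depth: (H, W) metric depth,
    K: 3x3 camera intrinsics."""
    u = contour_uv[:, 0].astype(float)
    v = contour_uv[:, 1].astype(float)
    z = depth[v.astype(int), u.astype(int)]
    # Back-project the contour pixels to 3D points in the previous camera frame.
    rays = np.linalg.inv(K) @ np.stack([u, v, np.ones_like(u)])   # (3, N)
    points_prev = rays * z
    # The points are static, so only the camera moved: express them in the
    # current camera frame and re-project.
    points_curr = R @ points_prev + t[:, None]
    proj = K @ points_curr
    uv_curr = (proj[:2] / proj[2]).T                              # (N, 2)
    return uv_curr - contour_uv                                   # per-pixel flow
```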
  • the motion estimation unit 35 estimates the pixel flow of the contour pixels of the object area of the moving object using image matching.
  • A in FIG. 9 shows the contour image of the object region of a moving object among the object regions in the region-divided image PS in FIG. 5. That is, A in FIG. 9 shows the contour image 211A corresponding to the object region 211 of the moving object.
  • the motion estimation unit 35 integrates the estimation results of the pixel flow of the contour pixel of the object area of the stationary object and the pixel flow of the contour pixel of the object area of the moving object.
  • the motion estimation unit 35 supplies information indicating the estimation result of the pixel flow after integration to the region estimation unit 36.
  • In step S7, the motion estimation unit 35 determines whether or not a region detection result has been obtained. If it is determined that a region detection result has not been obtained, the process returns to step S1.
  • Thereafter, the processes of steps S1 to S7 are repeatedly executed until it is determined in step S7 that a region detection result has been obtained.
  • On the other hand, if it is determined in step S7 that a region detection result has been obtained, the process proceeds to step S8.
  • In step S8, the motion estimation unit 35 determines whether or not a new region detection result has been obtained. If it is determined that a new region detection result has been obtained, the process proceeds to step S9.
  • the area detection processing is completed before time t5 and before time t10, and the area detection result is obtained. Accordingly, it is determined that a new region detection result is obtained in the subsequent period T5 and period T10.
  • In step S9, the motion estimation unit 35 performs a pixel flow estimation process covering the period of the region detection process.
  • the motion estimation unit 35 performs pixel flow estimation processing between the surrounding image used in the region detection processing and the latest surrounding image among the surrounding images obtained during the region detection processing.
  • the surrounding image P (t0) is used for the region detection processing, and the surrounding image P (t1) to the surrounding image P (t4) are obtained during the region detection processing.
  • a pixel flow estimation process between the surrounding image P (t0) and the surrounding image P (t4) is performed.
  • the pixel flow of each pixel between the surrounding image P (t0) and the surrounding image P (t4) is estimated by adding the pixel flow estimation results obtained in the periods T1 to T4 for each pixel.
  • the surrounding image P (t5) is used for the area detection process, and the surrounding image P (t6) to the surrounding image P (t9) are obtained during the area detection process.
  • a pixel flow estimation process between the surrounding image P (t5) and the surrounding image P (t9) is performed. For example, by adding the pixel flow estimation results obtained in the periods T6 to T9 for each contour pixel, the pixel flow of each contour pixel between the surrounding image P (t5) and the surrounding image P (t9) Presumed.
  • the motion estimation unit 35 supplies information indicating the estimation result of the pixel flow during the region detection process to the region estimation unit 36.
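  • A sketch of how the per-period pixel flows can be combined into the flow over the whole detection interval (for example, from P(t0) to P(t4)); the nearest-neighbour chaining used here is one simple way to realize the per-pixel addition described above.

```python
import numpy as np

def accumulate_flow(flows):
    """Combine the per-frame-period flows (for example those obtained in the
    periods T1 to T4) into the flow from P(t0) to P(t4). Each element of `flows`
    is an (H, W, 2) flow field for one frame period; the fields are chained by
    sampling the next field at the displaced position (nearest neighbour)."""
    total = np.zeros_like(flows[0])
    h, w = total.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    for flow in flows:
        xi = np.clip(np.round(xs + total[..., 0]).astype(int), 0, w - 1)
        yi = np.clip(np.round(ys + total[..., 1]).astype(int), 0, h - 1)
        total = total + flow[yi, xi]
    return total
```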
  • In step S10, the region estimation unit 36 performs the region estimation process based on the region detection result and the estimation result of the pixel flow during the region detection process.
  • the region estimation unit 36 moves the contour pixel of each object region detected in the surrounding image using the pixel flow estimated in the process of step S9.
  • the contour pixel of each object area detected in the surrounding image P (t0) is moved using the pixel flow between the surrounding image P (t0) and the surrounding image P (t4). Thereby, the position of the contour pixel of each object area in the surrounding image P (t4) is estimated.
  • the contour pixel of each object region detected in the surrounding image P (t5) is moved using the pixel flow between the surrounding image P (t5) and the surrounding image P (t9). . Thereby, the position of the contour pixel of each object area in the surrounding image P (t9) is estimated.
  • the region estimation unit 36 estimates a contour image of each object region after movement by performing polygon approximation based on the contour pixels of each object region after movement.
  • the region estimation unit 36 determines the label of each pixel in the contour image by performing polygonal inside / outside determination for each estimated contour image.
  • For example, each contour pixel of the contour image 211A is moved, using the pixel flow, to the position indicated by a circle in the frame A2. In this example, the contour image 211A moves in a direction approaching the vehicle 10, so gaps are generated between the contour pixels after the movement.
  • the contour image 251A in FIG. 12 is estimated by performing polygon approximation on the contour pixels in the frame A2.
  • the inside / outside determination of the polygon is performed on the estimated contour image 251A, and the label of each pixel in the contour image 251A is determined. Thereby, as schematically shown in FIG. 13, the object area 261 to which the object area 211 is moved is estimated.
  • the filling process of the object area 211 before the movement is performed.
  • the label of each pixel in the object area 211 before movement is determined by a voting process using the labels of pixels around the object area 211.
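  • The movement of contour pixels, the polygon approximation, and the inside/outside labelling can be sketched as follows; OpenCV's approxPolyDP and fillPoly are used here as stand-ins, and the voting-based filling of the vacated area is omitted for brevity.

```python
import cv2
import numpy as np

def estimate_region(label_map: np.ndarray, contour_uv: np.ndarray,
                    flow: np.ndarray, region_label: int) -> np.ndarray:
    """Move the contour pixels of one object region by the estimated pixel flow,
    close the moved contour by polygon approximation, and label its interior
    (a rough stand-in for the polygon inside/outside determination).
    label_map: (H, W) uint8 label image; contour_uv: (N, 2) contour pixel
    coordinates; flow: (H, W, 2) pixel flow."""
    u = contour_uv[:, 0].astype(int)
    v = contour_uv[:, 1].astype(int)
    moved = contour_uv.astype(np.float32) + flow[v, u]        # displaced contour pixels
    polygon = cv2.approxPolyDP(moved.reshape(-1, 1, 2), epsilon=2.0, closed=True)
    estimated = label_map.copy()
    cv2.fillPoly(estimated, [polygon.astype(np.int32)], int(region_label))
    return estimated
```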
  • the position of each object area in the surrounding image P (t0) in the surrounding image P (t4) is estimated, and the filling process of each object area before the movement is performed.
  • an estimated area divided image obtained by dividing the surrounding image P (t4) by each object area is generated.
  • the position of each object area in the surrounding image P (t5) in the surrounding image P (t9) is estimated, and the filling process of each object area before the movement is performed.
  • an estimated area divided image obtained by dividing the surrounding image P (t9) by each object area is generated.
  • the area estimation unit 36 generates an estimated area management table indicating information on each object area of the generated estimated divided area image.
  • the estimated area management table has the same configuration as the area management table of FIG.
  • the region estimation unit 36 stores the estimated region divided image and the estimated region management table in the storage unit 37.
  • On the other hand, if it is determined in step S8 that a new region detection result has not been obtained, the process proceeds to step S11. For example, in the periods T6 to T9 and the period T11 in FIG. 3, the region detection process was not completed in the immediately preceding period, so it is determined that a new region detection result has not been obtained.
  • In step S11, the region estimation unit 36 performs the region estimation process based on the previous region estimation result and the estimation result of the pixel flow between frames.
  • the area estimation unit 36 reads the estimated area divided image generated by the area estimation process of the immediately preceding period T5 and the estimated area management table from the storage unit 37. Then, the region estimation unit 36 uses the pixel flow between the surrounding image P (t5) and the surrounding image P (t6) for each object region in the read estimated region divided image by the same process as in step S10. Move.
  • the position in the surrounding image P (t6) of each object region in the surrounding image P (t5) is estimated. Further, the filling process of each object area before the movement is performed. As a result, an estimated area divided image obtained by dividing the surrounding image P (t6) by each object area is generated.
  • the area estimation unit 36 reads the estimated area divided image and the estimated area management table generated by the area estimation process of the immediately preceding period T6 from the storage unit 37 in the period T7. Then, the region estimation unit 36 uses the pixel flow between the surrounding image P (t6) and the surrounding image P (t7) for each object region in the read estimated region divided image by the same process as in step S10. Move.
  • the position of each object region in the surrounding image P (t6) in the surrounding image P (t7) is estimated. Further, the area filling process corresponding to each object area before the movement is performed. As a result, an estimated area divided image is generated by dividing the surrounding image P (t7) by each object area whose position is estimated.
  • the area estimation unit 36 generates an estimated area management table indicating information on each object area of the generated estimated divided area image.
  • the region estimation unit 36 stores the estimated region divided image and the estimated region management table in the storage unit 37.
  • an estimated area divided image having higher real-time characteristics than the area divided image obtained by the area detection process is obtained.
  • an estimated region divided image in which the position of each object region is closer to the current position than the region divided image is obtained.
  • the position of each object area can be recognized more quickly and in detail than when only the area division process is performed.
  • For example, the region-divided image PS(t0) corresponding to the surrounding image P(t0) is obtained in the period T4, and the region-divided image corresponding to the surrounding image P(t5) is obtained in the period T9, whereas the estimated region-divided images PSe(t4) to PSe(t8) corresponding to the surrounding images P(t4) to P(t8) are obtained in the periods T5 to T9.
  • the positions of the object areas of the surrounding images P (t4) to P (t8) photographed from the time t4 to the time t8 are estimated in the periods T5 to T9 ( Interpolated). That is, the position of each object area in each surrounding image is recognized quickly and in detail.
  • In step S12, the mask information generation unit 38 generates mask information based on the region estimation result. Specifically, the mask information generation unit 38 reads the latest estimated region management table from the storage unit 37. Then, the mask information generation unit 38 generates mask information that masks the object regions for which the moving object flag is set to 1 in the estimated region management table (hereinafter referred to as moving object regions). In this mask information, the pixel value of each pixel within the contour image composed of the contour pixels of a moving object region is set to 0, and the pixel values of the other pixels are set to 1. The mask information generation unit 38 supplies the generated mask information to the self-position estimation unit 31.
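  • A sketch of this mask generation, assuming the estimated region management table is available as a list of rows shaped like the RegionEntry structure sketched earlier:

```python
import cv2
import numpy as np

def build_moving_object_mask(image_shape, estimated_table):
    """Mask information for the self-position estimation: pixels inside every
    region whose moving_object_flag is 1 are set to 0, all other pixels to 1."""
    mask = np.ones(image_shape[:2], dtype=np.uint8)
    for entry in estimated_table:
        if entry.moving_object_flag == 1 and entry.contour_pixels:
            contour = np.array(entry.contour_pixels, dtype=np.int32).reshape(-1, 1, 2)
            cv2.fillPoly(mask, [contour], 0)   # exclude the moving-object region
    return mask
```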
  • Thereafter, in the next self-position estimation process in step S3, the self-position estimation unit 31 performs the self-position estimation process after masking the surrounding image using the mask information. That is, the self-position estimation unit 31 performs the self-position estimation process without using the image within the moving object regions indicated by the mask information.
  • the self-position estimation process is performed without using the image of the vehicle 301 in the frame A11 of the surrounding image P11.
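  • For example, the masking can be realized by handing the mask to the feature extractor so that no keypoints are taken from moving-object regions; ORB is an illustrative choice of feature and is not prescribed by the embodiment.

```python
import cv2

def extract_static_features(surrounding_image, mask):
    """ORB keypoints are computed only where mask == 1, so features on moving
    objects (for example, the vehicle in the frame A11) are not handed to the
    self-position estimation."""
    orb = cv2.ORB_create(nfeatures=1000)
    keypoints, descriptors = orb.detectAndCompute(surrounding_image, mask)
    return keypoints, descriptors
```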
  • Thereafter, the process returns to step S1, and the processes after step S1 are executed.
  • This process is started, for example, when an operation for starting the vehicle 10 and starting driving is performed, for example, when an ignition switch, a power switch, a start switch, or the like of the vehicle 10 is turned on. This process ends, for example, when an operation for ending driving is performed.
  • In steps S101 to S111, processing similar to that in steps S1 to S11 in FIG. 2 is performed.
  • In step S112, the mask information generation unit 38 generates mask information based on the region estimation result. Specifically, the mask information generation unit 38 reads the latest estimated region management table from the storage unit 37. Then, the mask information generation unit 38 generates mask information that masks the areas other than the object regions for which the label of an object to be calculated is set in the estimated region management table (such an object region is hereinafter referred to as a calculation target area). In this mask information, the pixel value of each pixel within the contour image composed of the contour pixels of a calculation target area is set to 1, and the pixel values of the other pixels are set to 0. The mask information generation unit 38 supplies the generated mask information to the data acquisition unit 11 or the object recognition unit 39.
  • the type of object to be calculated may be set in advance or may be set by the user, for example.
  • In step S113, the vehicle 10 performs AE (automatic exposure) or object recognition processing based on the mask information.
  • the object recognizing unit 39 performs object recognition processing on only the image in the calculation target area indicated by the mask information among the surrounding images. For example, as schematically shown in FIG. 17, the object recognition process is performed only on the image of the sign 321 in the frame A12 of the surrounding image P12. Thereby, a desired object can be recognized more quickly and accurately.
  • Also, for example, the photographing unit 21 performs AE using only the image in the calculation target area indicated by the mask information among the surrounding images. For example, as schematically shown in FIG. 18, AE is performed excluding the whiteout region in the frame A13 of the surrounding image P13. As a result, AE can be performed quickly and appropriately. For example, it is possible to prevent a portion of a building that is estimated to have many feature points from being whited out.
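  • A sketch of an AE evaluation restricted by such mask information; the mean-luminance metering rule and the target value are assumptions of this sketch.

```python
import numpy as np

def exposure_error(surrounding_image: np.ndarray, calc_target_mask: np.ndarray,
                   target_mean: float = 118.0) -> float:
    """A simple AE metric evaluated only inside the calculation target area
    (mask value 1): the deviation of the mean luminance from a target value.
    The photographing unit 21 would map this error to shutter/gain adjustments."""
    gray = surrounding_image.mean(axis=2) if surrounding_image.ndim == 3 else surrounding_image
    region = gray[calc_target_mask == 1]
    if region.size == 0:
        return 0.0
    return float(region.mean() - target_mean)
```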
  • Thereafter, the process returns to step S101, and the processes after step S101 are executed.
  • For example, the depth estimation result and the estimation result of the self-position of the vehicle 10 do not necessarily have to be used for the pixel flow estimation process.
  • For example, the motion estimation unit 35 may estimate the motion in units of blocks, objects, or the like instead of in units of pixels.
  • the region detection unit 33 may detect an object region in the surrounding image by using a method other than semantic segmentation.
  • the type of vehicle to which the present technology can be applied is not particularly limited as long as the vehicle uses the detection result of the object area in the image.
  • the present technology can also be applied to various types of moving objects that use the detection result of each object region in the image.
  • the present technology can be applied to mobile bodies such as personal mobility, airplanes, ships, construction machines, and agricultural machines (tractors).
  • mobile bodies to which the present technology can be applied include, for example, mobile bodies that are operated (operated) remotely without a user such as a drone or a robot.
  • FIG. 19 is a block diagram showing an example of a hardware configuration of a computer that executes the above-described series of processing by a program.
  • In the computer 500, a CPU (Central Processing Unit) 501, a ROM (Read Only Memory) 502, and a RAM (Random Access Memory) 503 are connected to one another via a bus 504.
  • An input / output interface 505 is further connected to the bus 504.
  • An input unit 506, an output unit 507, a storage unit 508, a communication unit 509, and a drive 510 are connected to the input / output interface 505.
  • the input unit 506 includes an input switch, a button, a microphone, an image sensor, and the like.
  • the output unit 507 includes a display, a speaker, and the like.
  • the storage unit 508 includes a hard disk, a nonvolatile memory, and the like.
  • the communication unit 509 includes a network interface or the like.
  • the drive 510 drives a removable recording medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
  • In the computer 500 configured as described above, the CPU 501 loads the program recorded in the storage unit 508 into the RAM 503 via the input/output interface 505 and the bus 504 and executes it, whereby the above-described series of processing is performed.
  • the program executed by the computer 500 can be provided by being recorded in a removable recording medium 511 as a package medium or the like, for example.
  • the program can be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
  • the program can be installed in the storage unit 508 via the input / output interface 505 by mounting the removable recording medium 511 in the drive 510. Further, the program can be received by the communication unit 509 via a wired or wireless transmission medium and installed in the storage unit 508. In addition, the program can be installed in the ROM 502 or the storage unit 508 in advance.
  • The program executed by the computer may be a program in which processing is performed in time series in the order described in this specification, or may be a program in which processing is performed in parallel or at necessary timing, such as when a call is made.
  • In this specification, a system means a set of a plurality of components (devices, modules (parts), and the like), and it does not matter whether or not all the components are housed in the same housing. Accordingly, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.
  • the present technology can take a cloud computing configuration in which one function is shared by a plurality of devices via a network and is jointly processed.
  • each step described in the above flowchart can be executed by one device or can be shared by a plurality of devices.
  • the plurality of processes included in the one step can be executed by being shared by a plurality of apparatuses in addition to being executed by one apparatus.
  • (1) An information processing apparatus including: an area detection unit that detects an object area, which is an area for each object, in a plurality of images; a motion estimation unit that performs motion estimation between images; and an area estimation unit that estimates the position of the object area in a second image based on the detection result of the object area in a first image and the result of motion estimation between the first image and the second image, the second image being later than the first image.
  • (2) The information processing apparatus according to (1), wherein the area estimation unit estimates the position of the object area in a third image based on the estimation result of the position of the object area in the second image and the result of motion estimation between the second image and the third image, the third image being later than the second image.
  • (3) The information processing apparatus according to (1) or (2), wherein the second image is an image preceding a fourth image in which the area detection unit next detects the object area after the first image.
  • (4) The information processing apparatus according to (3), wherein the second image is an image immediately before the fourth image.
  • (5) The information processing apparatus according to any one of (1) to (4), wherein the motion estimation unit estimates the motion of contour pixels, which are pixels constituting the contour of the object area, and the area estimation unit estimates the position of the object area based on the estimation result of the motion of the contour pixels.
  • (6) The information processing apparatus according to any one of (1) to (5), further including an execution unit that executes predetermined processing based on the estimation result of the position of the object area.
  • (7) The information processing apparatus according to (6), wherein the first image and the second image are images obtained by photographing the surroundings of a moving body, and the execution unit generates mask information indicating, based on the estimation result of the position of the object area in the second image, an area used for self-position estimation of the moving body and an area not used.
  • (8) The information processing apparatus according to (6), wherein the first image and the second image are images obtained by photographing the surroundings of a moving body, and the execution unit generates mask information indicating, based on the estimation result of the position of the object area in the second image, an area used for object recognition around the moving body and an area not used.
  • (9) The information processing apparatus according to (6), wherein the execution unit generates mask information indicating an area used and an area not used for exposure control of the imaging unit that captured the first image and the second image.
  • A moving body including: an imaging unit that photographs the surroundings; an area detection unit that detects an object area, which is an area for each object, in a plurality of images among the images captured by the imaging unit; a motion estimation unit that performs motion estimation between images; an area estimation unit that estimates the position of the object area in a second image based on the detection result of the object area in a first image and the result of motion estimation between the first image and the second image, the second image being later than the first image; and an execution unit that executes predetermined processing based on the estimation result of the position of the object area.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The present invention relates to an information processing device, an information processing method, and a program, as well as a moving body, configured so as to enable quick recognition of the position of an area for each object within an image. The information processing device comprises: an area detection unit that detects a plurality of object areas, constituting the respective areas of each object within an image; a motion estimation unit that estimates motion between images; and an area estimation unit that estimates the position of the object areas in a second image on the basis of a detection result of the object areas in a first image and a result of estimating the motion between the first image and the second image, which is subsequent to the first image. The present technology can be applied, for example, to a vehicle.
PCT/JP2019/010761 2018-03-29 2019-03-15 Information processing device, information processing method, program, and moving body WO2019188392A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2018063527 2018-03-29
JP2018-063527 2018-03-29

Publications (1)

Publication Number Publication Date
WO2019188392A1 true WO2019188392A1 (fr) 2019-10-03

Family

ID=68059995

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/010761 WO2019188392A1 (fr) 2018-03-29 2019-03-15 Information processing device, information processing method, program, and moving body

Country Status (1)

Country Link
WO (1) WO2019188392A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022137290A1 (fr) * 2020-12-21 2022-06-30 日本電信電話株式会社 Dispositif de détermination de mouvement, procédé de détermination de mouvement et programme de détermination de mouvement

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010113466A (ja) * 2008-11-05 2010-05-20 Toshiba Corp 対象追跡装置、対象追跡プログラム及び方法
JP2010244462A (ja) * 2009-04-09 2010-10-28 Fujifilm Corp 人物追跡装置、人物追跡方法及びプログラム
JP2012073971A (ja) * 2010-09-30 2012-04-12 Fujifilm Corp 動画オブジェクト検出装置、方法、及びプログラム
JP2016213781A (ja) * 2015-05-13 2016-12-15 キヤノン株式会社 撮像装置、その制御方法、および制御プログラム
JP2018009918A (ja) * 2016-07-15 2018-01-18 株式会社リコー 自己位置検出装置、移動体装置及び自己位置検出方法
WO2018051459A1 (fr) * 2016-09-15 2018-03-22 三菱電機株式会社 Dispositif et procédé de détection d'objets

Similar Documents

Publication Publication Date Title
US11915502B2 (en) Systems and methods for depth map sampling
US10803600B2 (en) Information processing device, information processing method, and program
JP5782088B2 (ja) 歪みのあるカメラ画像を補正するシステム及び方法
CN110796692A (zh) 用于同时定位与建图的端到端深度生成模型
US20230110116A1 (en) Advanced driver assist system, method of calibrating the same, and method of detecting object in the same
US11042999B2 (en) Advanced driver assist systems and methods of detecting objects in the same
US10929986B2 (en) Techniques for using a simple neural network model and standard camera for image detection in autonomous driving
US11741720B2 (en) System and method for tracking objects using using expanded bounding box factors
JP2023530762A (ja) 3dバウンディングボックスからの単眼深度管理
CN113240813B (zh) 三维点云信息确定方法及装置
WO2019163576A1 (fr) Dispositif de traitement d'informations, procédé de traitement d'informations et programme
Srigrarom et al. Hybrid motion-based object detection for detecting and tracking of small and fast moving drones
US20200258240A1 (en) Method of detecting moving objects via a moving camera, and related processing system, device and computer-program product
WO2019188392A1 (fr) Dispositif de traitement d'informations, procédé de traitement d'informations, programme, et corps mobile
WO2020026798A1 (fr) Dispositif de commande, procédé de commande et programme
US20220277480A1 (en) Position estimation device, vehicle, position estimation method and position estimation program
US20230135230A1 (en) Electronic device and method for spatial synchronization of videos
US10832444B2 (en) System and method for estimating device pose in a space
Cimarelli et al. A case study on the impact of masking moving objects on the camera pose regression with CNNs
Paracchini et al. Accurate omnidirectional multi-camera embedded structure from motion
CN113850209A (zh) 一种动态物体检测方法、装置、交通工具及存储介质
JP2022175900A (ja) 情報処理装置、情報処理方法、およびプログラム
CN113661513A (zh) 一种图像处理方法、设备、图像处理系统及存储介质
CN111179312A (zh) 基于3d点云和2d彩色图像相结合的高精度目标跟踪方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19776726

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19776726

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP