WO2023085017A1 - Learning method, learning program, information processing device, information processing method, and information processing program - Google Patents

Learning method, learning program, information processing device, information processing method, and information processing program

Info

Publication number
WO2023085017A1
WO2023085017A1 (PCT/JP2022/038868; JP2022038868W)
Authority
WO
WIPO (PCT)
Prior art keywords: point cloud, cloud data, dimensional point, image, information processing
Prior art date
Application number: PCT/JP2022/038868
Other languages: English (en), Japanese (ja)
Inventor: 周平 花澤
Original Assignee: ソニーセミコンダクタソリューションズ株式会社 (Sony Semiconductor Solutions Corporation)
Priority date: 2021-11-09 (the priority date is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed)
Filing date: 2022-10-19
Publication date
Application filed by ソニーセミコンダクタソリューションズ株式会社
Publication of WO2023085017A1

Classifications

    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01S - RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00 - Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88 - Lidar systems specially adapted for specific applications
    • G01S17/89 - Lidar systems specially adapted for specific applications for mapping or imaging
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis

Definitions

  • the present disclosure relates to a learning method, a learning program, an information processing device, an information processing method, and an information processing program.
  • In a conventional method, a plurality of depth images generated from three-dimensional point cloud data scanned multiple times are added (synthesized), and a stereo image is used to remove incorrect point cloud data from the synthesized image, thereby generating a cleaned ground truth.
  • the present disclosure proposes a learning method, a learning program, an information processing device, an information processing method, and an information processing program that can reduce the amount of processing required to generate ground truth.
  • A learning method according to an embodiment of the present disclosure is a computer-executed learning method that includes: generating a depth image corresponding to an image based on a predetermined number of point cloud data thinned out from three-dimensional point cloud data acquired by LiDAR (Light Detection And Ranging) and the image corresponding to the three-dimensional point cloud data; and performing machine learning by adjusting the coefficients of a convolutional neural network so that, with the point cloud data remaining after the thinning used as the ground truth, the difference between the depth image and the ground truth becomes small.
  • FIG. 1 is a block diagram showing a configuration example of a vehicle control system according to the present disclosure.
  • FIG. 2 is an explanatory diagram of a learning method according to the present disclosure.
  • FIG. 3 is a diagram illustrating an example of an image according to the present disclosure.
  • FIG. 4 is an explanatory diagram of label data added to an image according to the present disclosure.
  • FIG. 5 is an explanatory diagram of processing executed by an information processing apparatus according to the present disclosure.
  • FIG. 1 is a block diagram showing a configuration example of a vehicle control system 11, which is an example of a mobile device control system to which the present technology is applied.
  • the vehicle control system 11 is provided in the vehicle 1 and performs processing related to driving support and automatic driving of the vehicle 1.
  • The vehicle control system 11 includes a vehicle control ECU (Electronic Control Unit) 21, a communication unit 22, a map information accumulation unit 23, a position information acquisition unit 24, an external recognition sensor 25, an in-vehicle sensor 26, a vehicle sensor 27, a storage unit 28, a driving support/automatic driving control unit 29, a DMS (Driver Monitoring System) 30, an HMI (Human Machine Interface) 31, and a vehicle control unit 32.
  • The vehicle control ECU 21, the communication unit 22, the map information accumulation unit 23, the position information acquisition unit 24, the external recognition sensor 25, the in-vehicle sensor 26, the vehicle sensor 27, the storage unit 28, the driving support/automatic driving control unit 29, the driver monitoring system (DMS) 30, the human machine interface (HMI) 31, and the vehicle control unit 32 are connected via a communication network 41 so as to be able to communicate with each other.
  • The communication network 41 is composed of, for example, an in-vehicle communication network, a bus, or the like conforming to digital two-way communication standards such as CAN (Controller Area Network), LIN (Local Interconnect Network), LAN (Local Area Network), FlexRay (registered trademark), and Ethernet (registered trademark).
  • The communication network 41 may be used selectively depending on the type of data to be transmitted; for example, CAN may be applied to data related to vehicle control, and Ethernet may be applied to large-capacity data.
  • In addition, each part of the vehicle control system 11 may be connected directly, without going through the communication network 41, using wireless communication intended for relatively short-range communication, such as NFC (Near Field Communication) or Bluetooth (registered trademark).
  • the communication unit 22 communicates with various devices inside and outside the vehicle, other vehicles, servers, base stations, etc., and transmits and receives various data.
  • The map information accumulation unit 23 accumulates one or both of a map obtained from the outside and a map created by the vehicle 1. For example, the map information accumulation unit 23 accumulates a three-dimensional high-precision map, a global map that covers a wide area but is lower in accuracy than the high-precision map, and the like.
  • the position information acquisition unit 24 receives GNSS signals from GNSS (Global Navigation Satellite System) satellites and acquires the position information of the vehicle 1 .
  • the acquired position information is supplied to the driving support/automatic driving control unit 29 .
  • the location information acquisition unit 24 is not limited to the method using GNSS signals, and may acquire location information using beacons, for example.
  • the external recognition sensor 25 includes various sensors used for recognizing situations outside the vehicle 1 and supplies sensor data from each sensor to each part of the vehicle control system 11 .
  • the type and number of sensors included in the external recognition sensor 25 are arbitrary.
  • the external recognition sensor 25 includes a camera 51, a radar 52, a LiDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging) 53, and an ultrasonic sensor 54.
  • the in-vehicle sensor 26 includes various sensors for detecting information inside the vehicle, and supplies sensor data from each sensor to each part of the vehicle control system 11 .
  • the types and number of various sensors included in the in-vehicle sensor 26 are not particularly limited as long as they are the types and number that can be realistically installed in the vehicle 1 .
  • in-vehicle sensors 26 may comprise one or more of cameras, radar, seat sensors, steering wheel sensors, microphones, biometric sensors.
  • the vehicle sensor 27 includes various sensors for detecting the state of the vehicle 1, and supplies sensor data from each sensor to each section of the vehicle control system 11.
  • the types and number of various sensors included in the vehicle sensor 27 are not particularly limited as long as the types and number are practically installable in the vehicle 1 .
  • the vehicle sensor 27 includes a velocity sensor, an acceleration sensor, an angular velocity sensor (gyro sensor), and an inertial measurement unit (IMU (Inertial Measurement Unit)) integrating them.
  • the storage unit 28 includes at least one of a nonvolatile storage medium and a volatile storage medium, and stores data and programs.
  • The storage unit 28 uses, for example, an EEPROM (Electrically Erasable Programmable Read Only Memory) and a RAM (Random Access Memory), and as storage media, magnetic storage devices such as an HDD (Hard Disc Drive), semiconductor storage devices, optical storage devices, and magneto-optical storage devices can be applied.
  • the storage unit 28 stores various programs and data used by each unit of the vehicle control system 11 .
  • the driving support/automatic driving control unit 29 controls driving support and automatic driving of the vehicle 1 .
  • the driving support/automatic driving control unit 29 includes an analysis unit 61 , an action planning unit 62 and an operation control unit 63 .
  • the analysis unit 61 analyzes the vehicle 1 and its surroundings.
  • the analysis unit 61 includes a self-position estimation unit 71 , a sensor fusion unit 72 and a recognition unit 73 .
  • the self-position estimation unit 71 estimates the self-position of the vehicle 1 based on the sensor data from the external recognition sensor 25 and the high-precision map accumulated in the map information accumulation unit 23.
  • The sensor fusion unit 72 performs sensor fusion processing that combines a plurality of different types of sensor data (for example, image data supplied from the camera 51 and sensor data supplied from the radar 52 and the LiDAR 53) to obtain new information. Methods for combining different types of sensor data include integration, fusion, federation, and the like.
  • the recognition unit 73 executes a detection process for detecting the situation outside the vehicle 1 and a recognition process for recognizing the situation outside the vehicle 1 .
  • The recognition unit 73 performs detection processing and recognition processing of the external situation of the vehicle 1 based on information from the external recognition sensor 25, information from the self-position estimation unit 71, information from the sensor fusion unit 72, and the like.
  • the recognition unit 73 performs detection processing and recognition processing of objects around the vehicle 1 .
  • Object detection processing is, for example, processing for detecting the presence or absence, size, shape, position, movement, and the like of an object.
  • Object recognition processing is, for example, processing for recognizing an attribute such as the type of an object or identifying a specific object.
  • detection processing and recognition processing are not always clearly separated, and may overlap.
  • For example, the recognition unit 73 detects objects around the vehicle 1 by clustering the point cloud based on sensor data from the radar 52, the LiDAR 53, or the like into clusters of point groups. As a result, the presence or absence, size, shape, and position of objects around the vehicle 1 are detected (an illustrative clustering sketch is given below).
  • the recognition unit 73 detects the movement of objects around the vehicle 1 by performing tracking that follows the movement of the masses of point groups classified by clustering. As a result, the speed and traveling direction (movement vector) of the object around the vehicle 1 are detected.
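For illustration only (this sketch is not part of the publication), detecting object candidates by clustering a LiDAR point cloud, as described above, could look like the following, here using DBSCAN from scikit-learn; the parameter values and function name are assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def detect_object_candidates(points_xyz: np.ndarray, eps: float = 0.7, min_samples: int = 10):
    """Cluster an (N, 3) LiDAR point cloud into object candidates and return, for each
    cluster, an axis-aligned bounding box as a stand-in for its size, shape, and position."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_xyz)
    boxes = []
    for cluster_id in set(labels) - {-1}:          # label -1 marks noise points
        cluster = points_xyz[labels == cluster_id]
        boxes.append((cluster.min(axis=0), cluster.max(axis=0)))
    return boxes
```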
  • the recognition unit 73 detects or recognizes vehicles, people, bicycles, obstacles, structures, roads, traffic lights, traffic signs, road markings, etc. based on image data supplied from the camera 51 . Further, the recognition unit 73 may recognize types of objects around the vehicle 1 by performing recognition processing such as semantic segmentation.
  • the action plan section 62 creates an action plan for the vehicle 1.
  • the action planning unit 62 creates an action plan by performing route planning and route following processing.
  • Global path planning is the process of planning a rough route from the start to the goal. Route planning, which is also called trajectory planning, further includes trajectory generation (local path planning) that generates, within the planned route, a trajectory allowing the vehicle 1 to proceed safely and smoothly in its vicinity in consideration of the motion characteristics of the vehicle 1.
  • the motion control unit 63 controls the motion of the vehicle 1 in order to implement the action plan created by the action planning unit 62.
  • the DMS 30 performs driver authentication processing, driver state recognition processing, etc., based on sensor data from the in-vehicle sensor 26 and input data input to the HMI 31, which will be described later.
  • the driver's state to be recognized includes, for example, physical condition, alertness, concentration, fatigue, gaze direction, drunkenness, driving operation, posture, and the like.
  • the HMI 31 inputs various data, instructions, etc., and presents various data to the driver or the like.
  • the vehicle control unit 32 controls each unit of the vehicle 1.
  • the vehicle control section 32 includes a steering control section 81 , a brake control section 82 , a drive control section 83 , a body system control section 84 , a light control section 85 and a horn control section 86 .
  • the steering control unit 81 detects and controls the state of the steering system of the vehicle 1 .
  • the steering system includes, for example, a steering mechanism including a steering wheel, an electric power steering, and the like.
  • the steering control unit 81 includes, for example, a steering ECU that controls the steering system, an actuator that drives the steering system, and the like.
  • the brake control unit 82 detects and controls the state of the brake system of the vehicle 1 .
  • the brake system includes, for example, a brake mechanism including a brake pedal, an ABS (Antilock Brake System), a regenerative brake mechanism, and the like.
  • the brake control unit 82 includes, for example, a brake ECU that controls the brake system, an actuator that drives the brake system, and the like.
  • the drive control unit 83 detects and controls the state of the drive system of the vehicle 1 .
  • the drive system includes, for example, an accelerator pedal, a driving force generator for generating driving force such as an internal combustion engine or a driving motor, and a driving force transmission mechanism for transmitting the driving force to the wheels.
  • the drive control unit 83 includes, for example, a drive ECU that controls the drive system, an actuator that drives the drive system, and the like.
  • the body system control unit 84 detects and controls the state of the body system of the vehicle 1 .
  • the body system includes, for example, a keyless entry system, smart key system, power window device, power seat, air conditioner, air bag, seat belt, shift lever, and the like.
  • the body system control unit 84 includes, for example, a body system ECU that controls the body system, an actuator that drives the body system, and the like.
  • the light control unit 85 detects and controls the states of various lights of the vehicle 1 .
  • Lights to be controlled include, for example, headlights, backlights, fog lights, turn signals, brake lights, projections, bumper displays, and the like.
  • the light control unit 85 includes a light ECU that controls the light, an actuator that drives the light, and the like.
  • the horn control unit 86 detects and controls the state of the car horn of the vehicle 1 .
  • the horn control unit 86 includes, for example, a horn ECU for controlling the car horn, an actuator for driving the car horn, and the like.
  • An example of a general object detection model is the SSD (Single Shot MultiBox Detector).
  • the SSD comprises a Convolutional Neural Network (CNN) that is machine-learned to detect objects from input images.
  • Machine learning of the CNN uses teacher data in which images are given the types (classes) of the objects contained in the images and the ground truth (GT) indicating the areas of the objects in the images.
  • In a general method of generating such teacher data, the object is scanned multiple times (for example, 11 scans) by a general LiDAR (Light Detection And Ranging) equipped with 60 vertical lasers, and a depth image is generated from the three-dimensional point cloud data of each scan.
  • In contrast, the information processing device included in the recognition unit 73 thins out a predetermined number of point cloud data D2 from the three-dimensional point cloud data D1 acquired by the LiDAR 53 having 128 vertical lasers, and generates a depth image Dm corresponding to the image Pc based on the predetermined number of point cloud data D2 and the image Pc corresponding to the three-dimensional point cloud data D1.
  • The information processing device then uses the point cloud data D3, which is left after the predetermined number of point cloud data D2 are thinned out from the three-dimensional point cloud data D1, as the ground truth, and performs machine learning by adjusting the coefficients of the CNN so that the difference between the depth image Dm and the ground truth becomes small.
  • In this way, the ground truth (the point cloud data D3 remaining after the thinning) can be generated only by thinning out a predetermined number of point cloud data D2 from the three-dimensional point cloud data D1 acquired by the LiDAR 53. Therefore, according to the learning method according to the present disclosure, it is possible to greatly reduce the amount of processing required to generate ground truth compared to the general method of generating teacher data described above.
  • The data amount (number of points) of the predetermined number of point cloud data D2 to be thinned out is at least 50% of the data amount (number of points) of the three-dimensional point cloud data D1.
  • Methods of thinning out the predetermined number of point cloud data D2 from the three-dimensional point cloud data D1 include, for example, a method of thinning out random data points from the three-dimensional point cloud data D1 arranged in a matrix, and a method of thinning out data points one column at a time at regular column intervals, as sketched below.
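For illustration only (not part of the publication), the following NumPy sketch shows the two thinning strategies just mentioned, assuming the three-dimensional point cloud data D1 has been organized as a matrix of range values (one row per laser, one column per azimuth step); the function names, the 50% rate, and the correspondence of the two returned arrays to D2 and D3 are assumptions based on the description above.

```python
import numpy as np

def thin_random(d1: np.ndarray, thin_rate: float = 0.5, seed: int = 0):
    """Randomly thin out `thin_rate` of the points of a LiDAR range image D1.

    Returns (d2, d3): `d2` holds the thinned-out points and `d3` the remaining points
    (the ground-truth side in the learning method described above); positions that do
    not belong to a subset are set to 0.
    """
    rng = np.random.default_rng(seed)
    thin_mask = rng.random(d1.shape) < thin_rate     # True = thin this point out
    d2 = np.where(thin_mask, d1, 0.0)
    d3 = np.where(thin_mask, 0.0, d1)
    return d2, d3

def thin_columns(d1: np.ndarray, every: int = 2):
    """Thin out data points one column at a time, removing every `every`-th column."""
    thin_mask = np.zeros(d1.shape, dtype=bool)
    thin_mask[:, ::every] = True                     # columns to thin out
    d2 = np.where(thin_mask, d1, 0.0)
    d3 = np.where(thin_mask, 0.0, d1)
    return d2, d3

# Example: a 128-laser scan with 1024 azimuth steps, thinned at 50%.
d1 = np.random.default_rng(1).uniform(1.0, 80.0, size=(128, 1024)).astype(np.float32)
d2, d3 = thin_random(d1, thin_rate=0.5)
```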
  • Some images Pc are given label data including data indicating the area of a subject in the image Pc and data indicating the type (class) of the subject. Therefore, when a label indicating the type of an object appearing in the image Pc and data indicating the area of the object in the image Pc are associated with the image, the information processing device changes, for each type of object, the thinning rate of the point cloud data thinned out from the three-dimensional point cloud data corresponding to the area of the object.
  • As a result, instead of uniformly thinning out the predetermined number of point cloud data D2 from the entire area of the three-dimensional point cloud data D1, the information processing apparatus can thin out an appropriate amount of point cloud data D2 for each region of the three-dimensional point cloud data D1 according to the characteristics of the object and the purpose of object detection.
  • the image Pc includes a vehicle Vc, a plurality of poles Po, and a background Bg.
  • The image Pc is given label data LVc indicating the area of the vehicle Vc and that the object in the area is the vehicle Vc.
  • The image Pc is also given label data LPo indicating the area of the poles Po and that the objects in the area are the poles Po, and label data LBg indicating the area of the background Bg and that the object in the area is the background Bg.
  • The information processing apparatus sets the thinning rate in the area of an object that is a main detection target lower than the thinning rate in the area of an object that is not a main detection target. In other words, the information processing apparatus makes the amount of point cloud data left in the area of the main detection target larger than the amount of point cloud data left in the area of an object that is not a main detection target. Thereby, the information processing apparatus can generate ground truth with higher reliability for the area of the main detection target.
  • For example, for the area of the vehicle Vc, the information processing device thins out a predetermined number of point cloud data D2 corresponding to 50% of the whole from the three-dimensional point cloud data D1, and leaves the remaining 50% of the point cloud data (the point cloud data D3 remaining after the thinning) as the ground truth.
  • On the other hand, for the area of the poles Po, the information processing device thins out a predetermined number of point cloud data D2 corresponding to 80% of the whole from the three-dimensional point cloud data D1, and leaves the remaining 20% of the point cloud data (the point cloud data D3 remaining after the thinning) as the ground truth.
  • The information processing apparatus mainly uses the data of the image Pc when detecting the poles Po; a rough illustration of this class-dependent thinning is sketched below.
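For illustration only (not part of the publication), class-dependent thinning driven by a pixel-wise label map could look like the following NumPy sketch; the class ids, the function name, and the mapping of the 50%/80% rates to concrete labels are assumptions based on the example above.

```python
import numpy as np

# Illustrative class ids for the label data LVc / LPo / LBg described above.
VEHICLE, POLE, BACKGROUND = 0, 1, 2

# Thinning rate per class: lower for the main detection target (vehicle),
# higher for objects detected mainly from the image Pc (poles) and the background.
THIN_RATE = {VEHICLE: 0.5, POLE: 0.8, BACKGROUND: 0.8}

def thin_by_class(d1: np.ndarray, labels: np.ndarray, seed: int = 0):
    """Thin a LiDAR range image D1 with a per-class rate, given a label map of the same shape.

    Returns (d2, d3), where `d3` (the points left after thinning) plays the role of the
    ground truth in the learning method described above.
    """
    rng = np.random.default_rng(seed)
    u = rng.random(d1.shape)
    thin_mask = np.zeros(d1.shape, dtype=bool)
    for cls, rate in THIN_RATE.items():
        thin_mask |= (labels == cls) & (u < rate)
    d2 = np.where(thin_mask, d1, 0.0)
    d3 = np.where(thin_mask, 0.0, d1)
    return d2, d3
```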
  • the information processing device executes the information processing program stored in the storage unit 28 to perform the above-described CNN machine learning and object detection processing.
  • FIG. 5 is an explanatory diagram of processing executed by the information processing apparatus according to the present disclosure.
  • The LiDAR data shown in FIG. 5 is data obtained by converting the three-dimensional point cloud data D1 acquired from the LiDAR 53 into an elevation image.
  • LiDAR' shown in FIG. 5 is a predetermined number of point cloud data D2 obtained by thinning out the point cloud data from the elevation image.
  • Frames t-1, t, and t+1 shown in FIG. 5 are RGB images captured three times in succession in time series, corresponding to the point cloud data D3 remaining after the thinning.
  • the camera parameter K shown in FIG. 5 is an internal parameter of the camera 51, and is a parameter used for converting from UV coordinates with the origin at the upper left of the image Pc to camera coordinates centered on the camera 51.
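As a reminder of what the conversion with the camera parameter K involves, here is a minimal pinhole-model unprojection sketch; the intrinsic values are assumptions, since the publication does not give them.

```python
import numpy as np

def uv_to_camera(u, v, depth, K):
    """Convert UV pixel coordinates (origin at the upper left of the image Pc) and a depth
    value into camera coordinates centered on the camera 51, assuming a pinhole model with
    intrinsic matrix K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth * np.ones_like(x)], axis=-1)

# Example with an assumed intrinsic matrix.
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
point_cam = uv_to_camera(np.array([700.0]), np.array([400.0]), np.array([10.0]), K)
```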
  • Velocity shown in FIG. 5 is the speed of the own vehicle at frame t, obtained from the communication network 41 (CAN).
  • The DepthEncoder shown in FIG. 5 is a network that extracts features from the three-dimensional point cloud data.
  • The RGBEncoder shown in FIG. 5 is a network that extracts features from RGB images.
  • The Decoder shown in FIG. 5 is a network that transforms the extracted features into a DepthMap.
  • Pose shown in FIG. 5 is a network that estimates the moving distance and direction of the own vehicle from time-series images.
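The publication does not describe the internal structure of these networks, so the following PyTorch skeleton is only a rough sketch of how DepthEncoder, RGBEncoder, Decoder, and Pose could be wired together; every layer choice and channel size here is an assumption.

```python
import torch
import torch.nn as nn

class DepthEncoder(nn.Module):
    """Extracts features from the thinned LiDAR data rendered as a 1-channel image."""
    def __init__(self, out_ch: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, out_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
    def forward(self, lidar_image):
        return self.net(lidar_image)

class RGBEncoder(nn.Module):
    """Extracts features from the RGB image of frame t."""
    def __init__(self, out_ch: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, out_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
    def forward(self, rgb_image):
        return self.net(rgb_image)

class Decoder(nn.Module):
    """Fuses the two feature maps and decodes them into a dense DepthMap."""
    def __init__(self, in_ch: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(in_ch, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Softplus(),  # positive depth
        )
    def forward(self, depth_features, rgb_features):
        return self.net(torch.cat([depth_features, rgb_features], dim=1))

class Pose(nn.Module):
    """Estimates the motion (translation and rotation) of the own vehicle between two frames."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 6),
        )
    def forward(self, image_a, image_b):
        return self.net(torch.cat([image_a, image_b], dim=1))  # [tx, ty, tz, rx, ry, rz]
```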
  • First, the information processing device thins out point cloud data from the three-dimensional point cloud data D1 (LiDAR shown in FIG. 5) to generate the predetermined number of point cloud data D2 (LiDAR' shown in FIG. 5) (step S1).
  • The information processing device then extracts features from the predetermined number of point cloud data D2 (LiDAR' shown in FIG. 5) with the DepthEncoder (step S2).
  • In parallel, the information processing device extracts features from the image of frame t corresponding to the three-dimensional point cloud data D1 (LiDAR shown in FIG. 5) with the RGBEncoder.
  • The information processing device converts the features extracted by the DepthEncoder and the RGBEncoder into a DepthMap with the Decoder (step S3). Then, the information processing device calculates a SmoothLoss that makes the DepthMap smooth (step S4). Further, the information processing device calculates a DepthLoss, which is the difference between the DepthMap and the ground truth serving as the LiDAR teacher shown in FIG. 5 (step S5).
  • the information processing device calculates DepthLoss by, for example, Equation (1) below.
  • the information processing device performs machine learning by adjusting CNN parameters so that SmoothLoss and DepthLoss are minimized.
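Since the exact form of Equation (1) is not shown here, the following sketch uses a common masked L1 formulation for the DepthLoss of step S5 and an edge-aware term for the SmoothLoss of step S4; treat both as plausible stand-ins rather than the publication's exact formulas.

```python
import torch
import torch.nn.functional as F

def depth_loss(depth_map, gt_points):
    """L1 difference between the predicted DepthMap and the held-out LiDAR points
    (the ground truth D3), evaluated only where a ground-truth point exists."""
    valid = gt_points > 0
    return F.l1_loss(depth_map[valid], gt_points[valid])

def smooth_loss(depth_map, image):
    """Edge-aware smoothness: penalize depth gradients except where the image itself has edges."""
    d_dx = torch.abs(depth_map[..., :, 1:] - depth_map[..., :, :-1])
    d_dy = torch.abs(depth_map[..., 1:, :] - depth_map[..., :-1, :])
    i_dx = torch.mean(torch.abs(image[..., :, 1:] - image[..., :, :-1]), dim=1, keepdim=True)
    i_dy = torch.mean(torch.abs(image[..., 1:, :] - image[..., :-1, :]), dim=1, keepdim=True)
    return (d_dx * torch.exp(-i_dx)).mean() + (d_dy * torch.exp(-i_dy)).mean()
```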
  • The information processing device estimates the moving distance and direction of the own vehicle from the time-series images of frames t-1, t, and t+1 using Pose (step S6).
  • the output of Pose is represented by the following formula (2).
  • The information processing device converts the estimated moving distance into a velocity (step S7) and calculates a VelocityLoss (step S8).
  • The VelocityLoss is the difference between the moving distance estimated by Pose and the moving distance derived from the velocity.
  • The VelocityLoss is calculated by the following formula (3).
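Formulas (2) and (3) are likewise not shown here; a plausible stand-in for the VelocityLoss of step S8, comparing the distance implied by the CAN speed with the distance estimated by Pose over one frame interval, could be:

```python
import torch

def velocity_loss(pose_translation, can_speed, frame_dt):
    """Difference between the moving distance estimated by Pose (translation vector per frame
    pair) and the moving distance implied by the CAN speed over the frame interval."""
    estimated_distance = torch.linalg.norm(pose_translation, dim=-1)  # metres between frames
    can_distance = can_speed * frame_dt                               # metres from CAN speed
    return torch.abs(estimated_distance - can_distance).mean()
```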
  • The information processing device generates the image of frame t from the Pose output, the DepthMap, and the camera parameters (step S9). After that, the information processing device generates a mask for removing the same object based on the time-series images of frames t-1, t, and t+1 and the image generated in step S9 (step S10).
  • The information processing device removes the same object from the image generated in step S9 using the mask to generate a composite image. Then, the information processing device calculates an ImageLoss, which is the difference between the synthesized image and the true image (the image of frame t). The information processing device performs machine learning by adjusting the CNN parameters so that the ImageLoss is minimized.
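The exact form of the image loss of steps S9 and S10 is not given either; as an illustrative stand-in, a masked photometric L1 between the synthesized image and the true image of frame t could be computed as follows (the masking convention is an assumption):

```python
import torch

def image_loss(synthesized_image, true_image, mask):
    """Masked photometric difference between the image synthesized in step S9 and the true
    image of frame t; `mask` is 1 where pixels should contribute and 0 where the mask of
    step S10 removes them."""
    diff = torch.abs(synthesized_image - true_image)
    return (diff * mask).sum() / mask.sum().clamp(min=1.0)
```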
  • As described above, the learning method according to the embodiment is a learning method executed by a computer. A depth image Dm corresponding to the image Pc is generated based on a predetermined number of point cloud data D2 thinned out from the three-dimensional point cloud data D1 acquired by the LiDAR 53 and the image Pc corresponding to the three-dimensional point cloud data D1, the point cloud data D3 remaining after the predetermined number of point cloud data D2 are thinned out from the three-dimensional point cloud data D1 is used as the ground truth, and machine learning is performed by adjusting the coefficients of the convolutional neural network so that the difference between the depth image Dm and the ground truth becomes small.
  • As a result, the ground truth can be generated using only the raw data acquired from the LiDAR 53, so the amount of processing required to generate the ground truth can be greatly reduced.
  • In the learning method according to the embodiment, when a label indicating the type of an object appearing in the image Pc and data indicating the area of the object in the image Pc are associated with the image, the thinning rate of the point cloud data thinned out from the three-dimensional point cloud data D1 corresponding to the area of the object is changed for each type of object.
  • As a result, in the learning method according to the embodiment, instead of uniformly thinning out the predetermined number of point cloud data D2 from the entire area of the three-dimensional point cloud data D1, an appropriate amount of point cloud data D2 can be thinned out for each region of the three-dimensional point cloud data D1.
  • the learning method according to the embodiment sets the thinning rate in areas of objects that are primary as detection targets to be lower than the thinning rate in areas of objects that are not primary as detection targets. According to the learning method according to the embodiment, it is possible to generate ground truth with higher reliability for regions of main objects as detection targets.
  • The learning program according to the embodiment causes a computer to execute a procedure of generating a depth image Dm corresponding to the image Pc based on a predetermined number of point cloud data D2 thinned out from the three-dimensional point cloud data D1 acquired by the LiDAR 53 and the image Pc corresponding to the three-dimensional point cloud data D1, and a procedure of performing machine learning by adjusting the coefficients of the convolutional neural network so that, with the point cloud data D3 remaining after the predetermined number of point cloud data D2 are thinned out from the three-dimensional point cloud data D1 used as the ground truth, the difference between the depth image Dm and the ground truth becomes small. As a result, the computer can generate the ground truth using only the raw data obtained from the LiDAR 53, so the amount of processing required to generate the ground truth can be greatly reduced.
  • The information processing apparatus according to the embodiment includes an information processing unit that generates a depth image Dm corresponding to the image Pc based on a predetermined number of point cloud data D2 thinned out from the three-dimensional point cloud data D1 acquired by the LiDAR 53 and the image Pc corresponding to the three-dimensional point cloud data D1, performs machine learning by adjusting the coefficients of the convolutional neural network so that, with the point cloud data D3 remaining after the predetermined number of point cloud data D2 are thinned out from the three-dimensional point cloud data D1 used as the ground truth, the difference between the depth image Dm and the ground truth becomes small, and detects an object from the three-dimensional point cloud data and images input to the convolutional neural network.
  • As a result, the information processing device can generate the ground truth using only the raw data acquired from the LiDAR 53, so the amount of processing required to generate the ground truth can be greatly reduced.
  • The information processing method according to the embodiment is an information processing method executed by a computer. A depth image Dm corresponding to the image Pc is generated based on a predetermined number of point cloud data D2 thinned out from the three-dimensional point cloud data D1 acquired by the LiDAR 53 and the image Pc corresponding to the three-dimensional point cloud data D1, the point cloud data D3 remaining after the predetermined number of point cloud data D2 are thinned out from the three-dimensional point cloud data D1 is used as the ground truth, machine learning is performed by adjusting the coefficients of the convolutional neural network so that the difference between the depth image Dm and the ground truth becomes small, and objects are detected from the three-dimensional point cloud data and images input to the convolutional neural network.
  • As a result, the computer can generate the ground truth using only the raw data obtained from the LiDAR 53, so the amount of processing required to generate the ground truth can be greatly reduced.
  • The information processing program according to the embodiment causes a computer to execute a procedure of generating a depth image Dm corresponding to the image Pc based on a predetermined number of point cloud data D2 thinned out from the three-dimensional point cloud data D1 acquired by the LiDAR 53 and the image Pc corresponding to the three-dimensional point cloud data D1, and a procedure of performing machine learning by adjusting the coefficients of the convolutional neural network so that, with the point cloud data D3 remaining after the predetermined number of point cloud data D2 are thinned out from the three-dimensional point cloud data D1 used as the ground truth, the difference between the depth image Dm and the ground truth becomes small. As a result, the computer can generate the ground truth using only the raw data obtained from the LiDAR 53, so the amount of processing required to generate the ground truth can be greatly reduced.
  • A learning method executed by a computer, comprising: generating a depth image corresponding to an image based on a predetermined number of point cloud data thinned out from three-dimensional point cloud data acquired by LiDAR (Light Detection And Ranging) and the image corresponding to the three-dimensional point cloud data; and performing machine learning by adjusting coefficients of a convolutional neural network so that, with the point cloud data remaining after the predetermined number of point cloud data has been thinned out from the three-dimensional point cloud data used as the ground truth, a difference between the depth image and the ground truth becomes small.
  • A learning program for causing a computer to execute: a procedure of generating a depth image corresponding to an image based on a predetermined number of point cloud data thinned out from three-dimensional point cloud data acquired by LiDAR and the image corresponding to the three-dimensional point cloud data; and a procedure of performing machine learning by adjusting coefficients of a convolutional neural network so that, with the point cloud data remaining after the predetermined number of point cloud data has been thinned out from the three-dimensional point cloud data used as the ground truth, a difference between the depth image and the ground truth becomes small.
  • An information processing device comprising an information processing unit that generates a depth image corresponding to an image based on a predetermined number of point cloud data thinned out from three-dimensional point cloud data acquired by LiDAR and the image corresponding to the three-dimensional point cloud data, performs machine learning by adjusting coefficients of a convolutional neural network so that, with the point cloud data remaining after the predetermined number of point cloud data has been thinned out from the three-dimensional point cloud data used as the ground truth, a difference between the depth image and the ground truth becomes small, and detects an object from three-dimensional point cloud data and images input to the convolutional neural network.
  • An information processing method executed by a computer, comprising: generating a depth image corresponding to an image based on a predetermined number of point cloud data thinned out from three-dimensional point cloud data acquired by LiDAR and the image corresponding to the three-dimensional point cloud data; performing machine learning by adjusting coefficients of a convolutional neural network so that, with the point cloud data remaining after the predetermined number of point cloud data has been thinned out from the three-dimensional point cloud data used as the ground truth, a difference between the depth image and the ground truth becomes small; and detecting an object from three-dimensional point cloud data and images input to the convolutional neural network.
  • An information processing program for causing a computer to execute: a procedure of generating a depth image corresponding to an image based on a predetermined number of point cloud data thinned out from three-dimensional point cloud data acquired by LiDAR and the image corresponding to the three-dimensional point cloud data; a procedure of performing machine learning by adjusting coefficients of a convolutional neural network so that, with the point cloud data remaining after the predetermined number of point cloud data has been thinned out from the three-dimensional point cloud data used as the ground truth, a difference between the depth image and the ground truth becomes small; and a procedure of detecting an object from three-dimensional point cloud data and images input to the convolutional neural network.

Abstract

A learning method according to the present invention is executed by a computer. The learning method includes: generating, on the basis of a predetermined number of pieces of point cloud data (D2) thinned out from three-dimensional point cloud data (D1) acquired by LiDAR and an image (Pc) corresponding to the three-dimensional point cloud data (D1), a depth image (Dm) corresponding to the image (Pc); and performing machine learning by adjusting a coefficient of a convolutional neural network such that, when point cloud data (D3) remaining after the predetermined number of pieces of point cloud data (D2) are thinned out from the three-dimensional point cloud data (D1) is defined as the ground truth, a difference between the depth image (Dm) and the ground truth becomes small.
PCT/JP2022/038868 2021-11-09 2022-10-19 Learning method, learning program, information processing device, information processing method, and information processing program WO2023085017A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-182831 2021-11-09
JP2021182831 2021-11-09

Publications (1)

Publication Number Publication Date
WO2023085017A1 true WO2023085017A1 (fr) 2023-05-19

Family

ID=86335676

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/038868 WO2023085017A1 (fr) 2021-11-09 2022-10-19 Learning method, learning program, information processing device, information processing method, and information processing program

Country Status (1)

Country Link
WO (1) WO2023085017A1 (fr)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000090272A (ja) * 1998-09-16 2000-03-31 Hitachi Zosen Corp Shoe selection method (靴の選定方法)
WO2020053611A1 (fr) * 2018-09-12 2020-03-19 Toyota Motor Europe Electronic device, system and method for determining a semantic grid of an environment of a vehicle
WO2020116195A1 (fr) * 2018-12-07 2020-06-11 ソニーセミコンダクタソリューションズ株式会社 Information processing device, information processing method, program, mobile body control device, and mobile body
JP2020146449A (ja) * 2019-03-06 2020-09-17 国立大学法人九州大学 Magnetic resonance image high-speed reconstruction method and magnetic resonance imaging apparatus (磁気共鳴画像高速再構成法及び磁気共鳴イメージング装置)


Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22892517

Country of ref document: EP

Kind code of ref document: A1

WWE WIPO information: entry into national phase

Ref document number: 2023559513

Country of ref document: JP