WO2019113749A1 - Systems and methods for identifying and positioning objects around a vehicle


Info

Publication number
WO2019113749A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
point cloud
objects
shape
lidar point
Prior art date
Application number
PCT/CN2017/115491
Other languages
French (fr)
Inventor
Jian Li
Zhenzhe Ying
Original Assignee
Beijing Didi Infinity Technology And Development Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology And Development Co., Ltd.
Priority to EP17916456.1A (EP3523753A4)
Priority to CN201780041308.2A (CN110168559A)
Priority to CA3028659A (CA3028659C)
Priority to AU2017421870A (AU2017421870B2)
Priority to PCT/CN2017/115491 (WO2019113749A1)
Priority to JP2018569058A (JP2020507137A)
Priority to TW107144499A (TW201937399A)
Priority to US16/234,701 (US20190180467A1)
Publication of WO2019113749A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/93Lidar systems specially adapted for specific applications for anti-collision purposes
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/93Lidar systems specially adapted for specific applications for anti-collision purposes
    • G01S17/931Lidar systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/251Fusion techniques of input or preprocessed data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/521Depth or shape recovery from laser ranging, e.g. using interferometry; from the projection of structured light
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/77Determining position or orientation of objects or cameras using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/255Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/803Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10048Infrared image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20076Probabilistic image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle
    • G06T2207/30261Obstacle

Definitions

  • the present disclosure generally relates to object identification, and in particular, to methods and systems for identifying and positioning objects around a vehicle during autonomous driving.
  • Autonomous driving technology has been developing rapidly in recent years. Vehicles using autonomous driving technology may sense their environment and navigate automatically. Some autonomous vehicles still require human input and work as a driving aid; others drive completely on their own. However, the ability to correctly identify and position objects around the vehicle is important for any type of autonomous vehicle.
  • the conventional method may include mounting a camera on the vehicle and analyzing the objects in images captured by the camera. However, the camera images are normally 2-dimensional (2D), and hence depth information of the objects cannot be obtained easily.
  • a radio detection and ranging (Radar) device and a Light Detection and Ranging (LiDAR) device may be employed to obtain 3-dimensional (3D) images around the vehicle, but the objects therein are generally mixed with noise and difficult to identify and position. Also, images generated by Radar and LiDAR devices are difficult for humans to understand.
  • a system for driving aid may include a control unit including one or more storage media including a set of instructions for identifying and positioning one or more objects around a vehicle, and one or more microchips electronically connected to the one or more storage media.
  • the one or more microchips may execute the set of instructions to obtain a first light detection and ranging (LiDAR) point cloud image around a detection base station;
  • the one or more microchips may further execute the set of instructions to identify one or more objects in the first LiDAR point cloud image and determine one or more locations of the one or more objects in the first LiDAR point cloud image.
  • the one or more microchips may further execute the set of instructions to generate a 3D shape for each of the one or more objects, and generate a second LiDAR point cloud image by marking the one or more objects in the first LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects.
  • the system may further include at least one LiDAR device in communication with the control unit to send the LiDAR point cloud image to the control unit, at least one camera in communication with the control unit to send a camera image to the control unit, and at least one radar device in communication with the control unit to send a radar image to the control unit.
  • the base station may be a vehicle, and the system may further include at least one LiDAR device mounted on a steering wheel, a cowl or reflector of the vehicle, wherein the mounting of the at least one LiDAR device may include at least one of an adhesive bonding, a bolt and nut connection, a bayonet fitting, or a vacuum fixation.
  • the one or more microchips may further obtain a first camera image including at least one of the one or more objects, identify at least one target object of the one or more objects in the first camera image and at least one target location of the at least one target object in the first camera image, and generate a second camera image by marking the at least one target object in the first camera image based on the at least one target location in the first camera image and the 3D shape of the at least one target object in the LiDAR point cloud image.
  • the one or more microchips may further obtain a 2D shape of the at least one target object in the first camera image, correlate the LiDAR point cloud image with the first camera image, generate a 3D shape of the at least one target object in the first camera image based on the 2D shape of the at least one target object and the correlation between the LiDAR point cloud image and the first camera image, and generate a second camera image by marking the at least one target object in the first camera image based on the identified location in the first camera image and the 3D shape of the at least one target object in the first camera image.
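  • By way of illustration, one common way to correlate a LiDAR point cloud image with a camera image is to project LiDAR-frame points through the camera's calibration. The minimal sketch below assumes known LiDAR-to-camera extrinsics (R, t) and camera intrinsics (K), which the disclosure does not specify; the numeric values and the box corners are illustrative only.
```python
import numpy as np

def project_to_image(points_3d, R, t, K):
    # Transform 3D points into the camera frame, then project onto the image plane.
    pts_cam = points_3d @ R.T + t
    pts_img = pts_cam @ K.T
    return pts_img[:, :2] / pts_img[:, 2:3]  # perspective division -> pixel coordinates

# Illustrative calibration: identity extrinsics and a generic pinhole intrinsic matrix.
R, t = np.eye(3), np.zeros(3)
K = np.array([[700.0, 0.0, 640.0],
              [0.0, 700.0, 360.0],
              [0.0, 0.0, 1.0]])

# Eight corners of a hypothetical object's 3D shape, 4-6 m in front of the camera.
corners = np.array([[x, y, z] for x in (-1.0, 1.0) for y in (-0.5, 1.1) for z in (4.0, 6.0)])
pixels = project_to_image(corners, R, t, K)  # 2D representation used to mark the camera image
```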
  • the one or more microchips may operate a you only look once (YOLO) network or a Tiny-YOLO network to identify the at least one target object in the first camera image and the location of the at least one target object in the first camera image.
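  • For illustration, a YOLO-style detector divides the camera image into an S × S grid, where each cell predicts B boxes (x, y, w, h, confidence) plus class scores. The sketch below decodes such an output tensor; the tensor layout follows the original YOLO formulation and is an assumption here, since the disclosure only names the YOLO and Tiny-YOLO networks.
```python
import numpy as np

def decode_yolo_output(output, conf_thresh=0.3, num_boxes=2):
    # output: S x S x (num_boxes * 5 + num_classes) prediction tensor.
    S, _, depth = output.shape
    detections = []
    for row in range(S):
        for col in range(S):
            cell = output[row, col]
            class_probs = cell[num_boxes * 5:]
            for b in range(num_boxes):
                x, y, w, h, conf = cell[b * 5:(b + 1) * 5]
                score = conf * class_probs.max()
                if score >= conf_thresh:
                    # Convert the cell-relative centre to image-relative coordinates.
                    cx, cy = (col + x) / S, (row + y) / S
                    detections.append((cx, cy, w, h, int(class_probs.argmax()), score))
    return detections

boxes = decode_yolo_output(np.random.rand(7, 7, 30))  # 7 x 7 grid, 2 boxes, 20 classes
```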
  • the one or more microchips may further obtain coordinates of a plurality of points in the first LiDAR point cloud image, wherein the plurality of points includes uninterested points and remaining points, remove the uninterested points from the plurality of points according to the coordinates, cluster the remaining points into one or more clusters based on a point cloud clustering algorithm, and select at least one of the one or more clusters as a target cluster, each of the target clusters corresponding to an object.
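  • By way of illustration, removing uninterested points and clustering the remaining points might look like the sketch below; the height thresholds and the choice of DBSCAN as the point cloud clustering algorithm are illustrative assumptions, not requirements of the disclosure.
```python
import numpy as np
from sklearn.cluster import DBSCAN  # one possible point cloud clustering algorithm

# points: N x 3 array of (x, y, z) coordinates from the first LiDAR point cloud image
# (random placeholder data here so that the sketch runs stand-alone).
points = np.random.uniform(-20.0, 20.0, size=(1000, 3))

# Remove "uninterested" points by height: e.g., below 0.2 m (ground) or above 3.0 m.
mask = (points[:, 2] > 0.2) & (points[:, 2] < 3.0)
remaining = points[mask]

# Cluster the remaining points: points closer than ~0.7 m end up in the same cluster.
labels = DBSCAN(eps=0.7, min_samples=5).fit_predict(remaining)

# Keep sufficiently large clusters as target clusters, each corresponding to an object.
clusters = [remaining[labels == k] for k in set(labels) if k != -1]
target_clusters = [c for c in clusters if len(c) >= 20]
```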
  • the one or more microchips may further determine a preliminary 3D shape of the object, adjust at least one of a height, a width, a length, a yaw, or an orientation of the preliminary 3D shape to generate a 3D shape proposal, calculate a score of the 3D shape proposal, and determine whether the score of the 3D shape proposal satisfies a preset condition.
  • the one or more microchips may further adjust the 3D shape proposal.
  • the one or more microchips may determine the 3D shape proposal or further adjusted 3D shape proposal as the 3D shape of the object.
  • the score of the 3D shape proposal is calculated based on at least one of a number of points of the first LiDAR point cloud image inside the 3D shape proposal, a number of points of the first LiDAR point cloud image outside the 3D shape proposal, or distances between points and the 3D shape.
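  • A minimal sketch of such a score is given below, assuming an axis-aligned box proposal (yaw omitted for brevity) and an illustrative weighting of the three quantities; the disclosure does not fix the exact formula.
```python
import numpy as np

def score_box_proposal(points, center, size):
    # Count points inside/outside the proposal and measure how far outside points lie.
    half = np.asarray(size) / 2.0
    offset = np.abs(points - np.asarray(center))
    inside = np.all(offset <= half, axis=1)

    n_inside = int(inside.sum())
    n_outside = len(points) - n_inside
    dist_outside = np.linalg.norm(np.maximum(offset - half, 0.0), axis=1)[~inside]

    # Illustrative weighting: reward enclosed points, penalize stray points and distance.
    return n_inside - 0.1 * n_outside - dist_outside.sum()

# A synthetic cluster of points around (10, 2, 0.8) scored against a car-sized proposal.
cluster = np.random.normal(loc=[10.0, 2.0, 0.8], scale=0.5, size=(200, 3))
print(score_box_proposal(cluster, center=[10.0, 2.0, 0.8], size=[4.2, 1.8, 1.5]))
```
  • A proposal whose score satisfies the preset condition (e.g., exceeds a threshold) may be kept as the 3D shape of the object; otherwise the height, width, length, yaw, or orientation may be adjusted and the proposal re-scored.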
  • the one or more microchips may further obtain a first radio detection and ranging (Radar) image around the detection base station, identify the one or more objects in the first Radar image, determine one or more locations of the one or more objects in the first Radar image, generate a 3D shape for each of the one or more objects in the first Radar image, generate a second Radar image by marking the one or more objects in the first Radar image based on the locations and the 3D shapes of the one or more objects in the first Radar image, and fuse the second Radar image and the second LiDAR point cloud image to generate a compensated image.
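  • For illustration, one simple fusion strategy, motivated by the later description of the radar covering objects beyond the LiDAR's detection range, is a range-based hand-off between the two marked images; the rule and the 35 m hand-off value below are assumptions, not the disclosed fusion method.
```python
import numpy as np

def fuse_detections(lidar_objects, radar_objects, handoff_range=35.0):
    # Keep LiDAR-marked objects near the vehicle and radar-marked objects beyond the
    # hand-off range, yielding one "compensated" object list.
    near = [o for o in lidar_objects if np.linalg.norm(o["position"]) <= handoff_range]
    far = [o for o in radar_objects if np.linalg.norm(o["position"]) > handoff_range]
    return near + far

lidar_objs = [{"id": "car_1", "position": [12.0, 3.0, 0.0]}]
radar_objs = [{"id": "truck_2", "position": [60.0, -4.0, 0.0]}]
print(fuse_detections(lidar_objs, radar_objs))  # both objects survive the fusion
```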
  • the one or more microchips may further obtain two first LiDAR point cloud images around the base station at two different time frames, generate two second LiDAR point cloud images at the two different time frames based on the two first LiDAR point cloud images, and generate a third LiDAR point cloud image at a third time frame based on the two second LiDAR point cloud images by an interpolation method.
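  • As an illustration of the interpolation method, the marked 3D shape of an object in two second LiDAR point cloud images can be interpolated to an intermediate (third) time frame; linear interpolation of the box parameters is one possible choice and is assumed here.
```python
import numpy as np

def interpolate_box(box_t1, box_t2, t1, t2, t3):
    # Linearly blend each box parameter between the two observed time frames.
    alpha = (t3 - t1) / (t2 - t1)
    return {key: (1 - alpha) * np.asarray(box_t1[key]) + alpha * np.asarray(box_t2[key])
            for key in box_t1}

# A box observed at t = 0.0 s and t = 0.1 s, interpolated at t = 0.05 s.
b1 = {"center": [10.0, 2.0, 0.8], "size": [4.2, 1.8, 1.5], "yaw": 0.00}
b2 = {"center": [10.5, 2.1, 0.8], "size": [4.2, 1.8, 1.5], "yaw": 0.05}
print(interpolate_box(b1, b2, 0.0, 0.1, 0.05))
```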
  • the one or more microchips may further obtain a plurality of first LiDAR point cloud images around the base station at a plurality of different time frames; generate a plurality of second LiDAR point cloud images at the plurality of different time frames based on the plurality of first LiDAR point cloud images; and generate a video based on the plurality of second LiDAR point cloud images.
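  • A minimal sketch of assembling the plurality of second LiDAR point cloud images (rendered here as 2D frames) into a video, using OpenCV's VideoWriter; the frame size, frame rate, codec, and the random placeholder frames are illustrative assumptions.
```python
import cv2
import numpy as np

# Placeholder frames standing in for rendered second LiDAR point cloud images.
frames = [np.random.randint(0, 255, (360, 640, 3), dtype=np.uint8) for _ in range(30)]

writer = cv2.VideoWriter("marked_objects.mp4", cv2.VideoWriter_fourcc(*"mp4v"),
                         10.0, (640, 360))  # 10 frames per second (illustrative)
for frame in frames:
    writer.write(frame)
writer.release()
```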
  • a method may be implemented on a computing device having one or more storage media storing instructions for identifying and positioning one or more objects around a vehicle, and one or more microchips electronically connected to the one or more storage media.
  • the method may include obtaining a first light detection and ranging (LiDAR) point cloud image around a detection base station.
  • the method may further include identifying one or more objects in the first LiDAR point cloud image, and determining one or more locations of the one or more objects in the first LiDAR point cloud image.
  • the method may further include generating a 3D shape for each of the one or more objects, and generating a second LiDAR point cloud image by marking the one or more objects in the first LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects.
  • a non-transitory computer readable medium may include at least one set of instructions for identifying and positioning one or more objects around a vehicle.
  • the at least one set of instructions may direct the microchips to perform acts of obtaining a first light detection and ranging (LiDAR) point cloud image around a detection base station.
  • the at least one set of instructions may further direct the microchips to perform acts of identifying one or more objects in the first LiDAR point cloud image, and determining one or more locations of the one or more objects in the first LiDAR point cloud image.
  • the at least one set of instructions may further direct the microchips to perform acts of generating a 3D shape for each of the one or more objects, and generating a second LiDAR point cloud image by marking the one or more objects in the first LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects.
  • FIG. 1 is a schematic diagram illustrating an exemplary scenario for autonomous vehicle according to some embodiments of the present disclosure
  • Fig. 2 is a block diagram of an exemplary vehicle with an autonomous driving capability according to some embodiments of the present disclosure
  • FIG. 3 is a schematic diagram illustrating exemplary hardware components of a computing device 300
  • FIG. 4 is a block diagram illustrating an exemplary sensing module according to some embodiments of the present disclosure
  • FIG. 5 is a flowchart illustrating an exemplary process for generating a LiDAR point cloud image on which 3D shapes of objects are marked according to some embodiments of the present disclosure
  • FIGs. 6A-6C are a series of schematic diagrams of generating and marking a 3D shape of an object in LiDAR point cloud image according to some embodiments of the present disclosure
  • FIG. 7 is a flowchart illustrating an exemplary process for generating a marked camera image according to some embodiments of the present disclosure
  • FIG. 8 is a flowchart illustrating an exemplary process for generating 2D representations of 3D shapes of the one or more objects in the camera image according to some embodiments of the present disclosure
  • FIGs. 9A and 9B are schematic diagrams of the same 2D camera image of a car according to some embodiments of the present disclosure.
  • FIG. 10 is a schematic diagram of a you only look once (YOLO) network according to some embodiments of the present disclosure.
  • FIG. 11 is a flowchart illustrating an exemplary process for identifying the objects in a LiDAR point cloud image according to some embodiments of the present disclosure
  • FIGs. 12A-12E are a series of schematic diagrams of identifying an object in a LiDAR point cloud image according to some embodiments of the present disclosure.
  • FIG. 13 is a flowchart illustrating an exemplary process for generating a 3D shape of an object in a LiDAR point cloud image according to some embodiments of the present disclosure
  • FIGs. 14A-14D are a series of schematic diagrams of generating a 3D shape of an object in a LiDAR point cloud image according to some embodiments of the present disclosure
  • FIG. 15 is a flow chart illustrating an exemplary process for generating a compensated image according to some embodiments of the present disclosure
  • FIG. 16 is a schematic diagram of a synchronization between camera, LiDAR device, and/or radar device according to some embodiments of the present disclosure
  • FIG. 17 is a flow chart illustrating an exemplary process for generating a LiDAR point cloud image or a video based on existing LiDAR point cloud images according to some embodiments of the present disclosure
  • FIG. 18 is a schematic diagram of validating and interpolating frames of images according to some embodiments of the present disclosure.
  • The term “autonomous vehicle” may refer to a vehicle capable of sensing its environment and navigating without human (e.g., a driver, a pilot, etc.) input.
  • The terms “autonomous vehicle” and “vehicle” may be used interchangeably.
  • The term “autonomous driving” may refer to the ability of navigating without human (e.g., a driver, a pilot, etc.) input.
  • the flowcharts used in the present disclosure illustrate operations that systems implement according to some embodiments in the present disclosure. It is to be expressly understood that the operations of the flowcharts may not be implemented in order. Conversely, the operations may be implemented in inverted order or simultaneously. Moreover, one or more other operations may be added to the flowcharts, and one or more operations may be removed from the flowcharts.
  • the positioning technology used in the present disclosure may be based on a global positioning system (GPS) , a global navigation satellite system (GLONASS) , a compass navigation system (COMPASS) , a Galileo positioning system, a quasi-zenith satellite system (QZSS) , a wireless fidelity (WiFi) positioning technology, or the like, or any combination thereof.
  • although the systems and methods disclosed in the present disclosure are described primarily regarding a driving aid for identifying and positioning objects around a vehicle, it should be understood that this is only one exemplary embodiment.
  • the system or method of the present disclosure may be applied to any other kind of navigation system.
  • the system or method of the present disclosure may be applied to transportation systems of different environments including land, ocean, aerospace, or the like, or any combination thereof.
  • the autonomous vehicle of the transportation systems may include a taxi, a private car, a hitch, a bus, a train, a bullet train, a high-speed rail, a subway, a vessel, an aircraft, a spaceship, a hot-air balloon, a driverless vehicle, or the like, or any combination thereof.
  • the system or method may find applications in, e.g., logistics warehousing and military affairs.
  • An aspect of the present disclosure relates to a driving aid for identifying and positioning objects around a vehicle during autonomous driving.
  • a camera, a LiDAR device, and a Radar device may be mounted on the roof of an autonomous car.
  • the camera, the LiDAR device and the Radar device may obtain a camera image, a LiDAR point cloud image, and a Radar image around the car respectively.
  • the LiDAR point cloud image may include a plurality of points.
  • a control unit may cluster the plurality of points into multiple clusters, wherein each cluster may correspond to an object.
  • the control unit may determine a 3D shape for each object and mark the 3D shape on the LiDAR point cloud image.
  • the control unit may also correlate the LiDAR point cloud image with the camera image to generate and mark a 2D representation of 3D shape of the objects on the camera image.
  • the marked LiDAR point cloud image and camera image make it easier to understand the locations and movements of the objects.
  • the control unit may further generate a video of the movement of the objects based on marked camera images.
  • the vehicle or a driver therein may adjust the speed and movement direction of the vehicle based on the generated video or images to avoid colliding with the objects.
  • FIG. 1 is a schematic diagram illustrating an exemplary scenario for autonomous vehicle according to some embodiments of the present disclosure.
  • an autonomous vehicle 130 may travel along a road 121, without human input, following a path autonomously determined by the autonomous vehicle 130.
  • the road 121 may be a space prepared for a vehicle to travel along.
  • the road 121 may be a road for vehicles with wheels (e.g., a car, a train, a bicycle, a tricycle, etc.) or without wheels (e.g., a hovercraft), an air lane for an airplane or other aircraft, a water lane for a ship or submarine, or an orbit for a satellite.
  • Travel of the autonomous vehicle 130 may not break the traffic rules of the road 121 regulated by law or regulation. For example, the speed of the autonomous vehicle 130 may not exceed the speed limit of the road 121.
  • the autonomous vehicle 130 may avoid colliding with an obstacle 110 by travelling along a path 120 determined by the autonomous vehicle 130.
  • the obstacle 110 may be a static obstacle or a dynamic obstacle.
  • the static obstacle may include a building, tree, roadblock, or the like, or any combination thereof.
  • the dynamic obstacle may include moving vehicles, pedestrians, and/or animals, or the like, or any combination thereof.
  • the autonomous vehicle 130 may include conventional structures of a non-autonomous vehicle, such as an engine, four wheels, a steering wheel, etc.
  • the autonomous vehicle 130 may further include a sensing system 140, including a plurality of sensors (e.g., a sensor 142, a sensor 144, a sensor 146) and a control unit 150.
  • the plurality of sensors may be configured to provide information that is used to control the vehicle.
  • the sensors may sense status of the vehicle.
  • the status of the vehicle may include dynamic situation of the vehicle, environmental information around the vehicle, or the like, or any combination thereof.
  • the plurality of sensors may be configured to sense dynamic situation of the autonomous vehicle 130.
  • the plurality of sensors may include a distance sensor, a velocity sensor, an acceleration sensor, a steering angle sensor, a traction-related sensor, a camera, and/or any other sensor.
  • the distance sensor may determine a distance between a vehicle (e.g., the autonomous vehicle 130) and other objects (e.g., the obstacle 110) .
  • the distance sensor may also determine a distance between a vehicle (e.g., the autonomous vehicle 130) and one or more obstacles (e.g., static obstacles, dynamic obstacles) .
  • the velocity sensor (e.g., a Hall sensor) may determine a velocity of a vehicle (e.g., the autonomous vehicle 130) .
  • the acceleration sensor (e.g., an accelerometer) may determine an acceleration of a vehicle (e.g., the autonomous vehicle 130) .
  • the steering angle sensor (e.g., a tilt sensor) may determine a steering angle of a vehicle (e.g., the autonomous vehicle 130) .
  • the traction-related sensor (e.g., a force sensor) may determine a traction of a vehicle (e.g., the autonomous vehicle 130) .
  • the plurality of sensors may sense environment around the autonomous vehicle 130.
  • one or more sensors may detect a road geometry and obstacles (e.g., static obstacles, dynamic obstacles) .
  • the road geometry may include a road width, road length, road type (e.g., ring road, straight road, one-way road, two-way road) .
  • the static obstacles may include a building, tree, roadblock, or the like, or any combination thereof.
  • the dynamic obstacles may include moving vehicles, pedestrians, and/or animals, or the like, or any combination thereof.
  • the plurality of sensors may include one or more video cameras, laser-sensing systems, infrared-sensing systems, acoustic-sensing systems, thermal-sensing systems, or the like, or any combination thereof.
  • the control unit 150 may be configured to control the autonomous vehicle 130.
  • the control unit 150 may control the autonomous vehicle 130 to drive along a path 120.
  • the control unit 150 may calculate the path 120 based on the status information from the plurality of sensors.
  • the path 120 may be configured to avoid collisions between the vehicle and one or more obstacles (e.g., the obstacle 110) .
  • the path 120 may include one or more path samples.
  • Each of the one or more path samples may include a plurality of path sample features.
  • the plurality of path sample features may include a path velocity, a path acceleration, a path location, or the like, or a combination thereof.
  • the autonomous vehicle 130 may drive along the path 120 to avoid a collision with an obstacle.
  • the autonomous vehicle 130 may pass each path location at a corresponding path velocity and a corresponding path acceleration.
  • the autonomous vehicle 130 may also include a positioning system to obtain and/or determine the position of the autonomous vehicle 130.
  • the positioning system may also be connected to another party, such as a base station, another vehicle, or another person, to obtain the position of the party.
  • the positioning system may be able to establish a communication with a positioning system of another vehicle, and may receive the position of the other vehicle and determine the relative positions between the two vehicles.
  • Fig. 2 is a block diagram of an exemplary vehicle with an autonomous driving capability according to some embodiments of the present disclosure.
  • the vehicle with an autonomous driving capability may include a control system, including but not limited to a control unit 150, a plurality of sensors 142, 144, 146, a storage 220, a network 230, a gateway module 240, a Controller Area Network (CAN) 250, an Engine Management System (EMS) 260, an Electric Stability Control (ESC) 270, an Electric Power System (EPS) 280, a Steering Column Module (SCM) 290, a throttling system 265, a braking system 275 and a steering system 295.
  • the control unit 150 may process information and/or data relating to vehicle driving (e.g., autonomous driving) to perform one or more functions described in the present disclosure.
  • the control unit 150 may be configured to drive a vehicle autonomously.
  • the control unit 150 may output a plurality of control signals.
  • the plurality of control signals may be configured to be received by a plurality of electronic control units (ECUs) to control the drive of a vehicle.
  • the control unit 150 may determine a reference path and one or more candidate paths based on environment information of the vehicle.
  • the control unit 150 may include one or more processing engines (e.g., single-core processing engine (s) or multi-core processor (s) ) .
  • control unit 150 may include a central processing unit (CPU) , an application-specific integrated circuit (ASIC) , an application-specific instruction-set processor (ASIP) , a graphics processing unit (GPU) , a physics processing unit (PPU) , a digital signal processor (DSP) , a field programmable gate array (FPGA) , a programmable logic device (PLD) , a controller, a microcontroller unit, a reduced instruction-set computer (RISC) , a microprocessor, or the like, or any combination thereof.
  • the storage 220 may store data and/or instructions. In some embodiments, the storage 220 may store data obtained from the autonomous vehicle 130. In some embodiments, the storage 220 may store data and/or instructions that the control unit 150 may execute or use to perform exemplary methods described in the present disclosure. In some embodiments, the storage 220 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM) , or the like, or any combination thereof. Exemplary mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc. Exemplary removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc.
  • Exemplary volatile read-and-write memory may include a random access memory (RAM) .
  • RAM may include a dynamic RAM (DRAM) , a double data rate synchronous dynamic RAM (DDR SDRAM) , a static RAM (SRAM) , a thyristor RAM (T-RAM) , and a zero-capacitor RAM (Z-RAM) , etc.
  • Exemplary ROM may include a mask ROM (MROM) , a programmable ROM (PROM) , an erasable programmable ROM (EPROM) , an electrically-erasable programmable ROM (EEPROM) , a compact disk ROM (CD-ROM) , and a digital versatile disk ROM, etc.
  • the storage may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
  • the storage 220 may be connected to the network 230 to communicate with one or more components of the autonomous vehicle 130 (e.g., the control unit 150, the sensor 142) .
  • One or more components in the autonomous vehicle 130 may access the data or instructions stored in the storage 220 via the network 230.
  • the storage 220 may be directly connected to or communicate with one or more components in the autonomous vehicle 130 (e.g., the control unit 150, the sensor 142) .
  • the storage 220 may be part of the autonomous vehicle 130.
  • the network 230 may facilitate exchange of information and/or data.
  • the control unit 150 may obtain/acquire dynamic situation of the vehicle and/or environment information around the vehicle via the network 230.
  • the network 230 may be any type of wired or wireless network, or combination thereof.
  • the network 230 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, an Internet, a local area network (LAN) , a wide area network (WAN) , a wireless local area network (WLAN) , a metropolitan area network (MAN) , a public telephone switched network (PSTN) , a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof.
  • the network 230 may include one or more network access points.
  • the network 230 may include wired or wireless network access points such as base stations and/or internet exchange points 230-1, ..., through which one or more components of the autonomous vehicle 130 may be connected to the network 230 to exchange data and/or information.
  • the gateway module 240 may determine a command source for the plurality of ECUs (e.g., the EMS 260, the EPS 280, the ESC 270, the SCM 290) based on a current driving status of the vehicle.
  • the command source may be from a human driver, from the control unit 150, or the like, or any combination thereof.
  • the gateway module 240 may determine the current driving status of the vehicle.
  • the driving status of the vehicle may include a manual driving status, a semi-autonomous driving status, an autonomous driving status, an error status, or the like, or any combination thereof.
  • the gateway module 240 may determine the current driving status of the vehicle to be a manual driving status based on an input from a human driver.
  • the gateway module 240 may determine the current driving status of the vehicle to be a semi-autonomous driving status when the current road condition is complex.
  • the gateway module 240 may determine the current driving status of the vehicle to be an error status when abnormalities (e.g., a signal interruption, a processor crash) happen.
  • the gateway module 240 may transmit operations of the human driver to the plurality of ECUs in response to a determination that the current driving status of the vehicle is a manual driving status. For example, the gateway module 240 may transmit a press on the accelerator done by the human driver to the EMS 260 in response to a determination that the current driving status of the vehicle is a manual driving status. The gateway module 240 may transmit the control signals of the control unit 150 to the plurality of ECUs in response to a determination that the current driving status of the vehicle is an autonomous driving status. For example, the gateway module 240 may transmit a control signal associated with steering to the SCM 290 in response to a determination that the current driving status of the vehicle is an autonomous driving status.
  • the gateway module 240 may transmit the operations of the human driver and the control signals of the control unit 150 to the plurality of ECUs in response to a determination that the current driving status of the vehicle is a semi-autonomous driving status.
  • the gateway module 240 may transmit an error signal to the plurality of ECUs in response to a determination that the current driving status of the vehicle is an error status.
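  • By way of illustration, the routing behaviour of the gateway module 240 described above may be sketched as follows; representing command sources as simple strings is an assumption made only for this sketch.
```python
from enum import Enum, auto

class DrivingStatus(Enum):
    MANUAL = auto()
    SEMI_AUTONOMOUS = auto()
    AUTONOMOUS = auto()
    ERROR = auto()

def select_command_sources(status):
    # Decide whose commands are forwarded to the ECUs for the current driving status.
    if status is DrivingStatus.MANUAL:
        return ["human_driver"]
    if status is DrivingStatus.AUTONOMOUS:
        return ["control_unit_150"]
    if status is DrivingStatus.SEMI_AUTONOMOUS:
        return ["human_driver", "control_unit_150"]
    return ["error_signal"]  # ERROR status

print(select_command_sources(DrivingStatus.SEMI_AUTONOMOUS))
```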
  • a Controller Area Network is a robust vehicle bus standard (e.g., a message-based protocol) allowing microcontrollers (e.g., the control unit 150) and devices (e.g., the EMS 260, the EPS 280, the ESC 270, and/or the SCM 290, etc. ) to communicate with each other in applications without a host computer.
  • the CAN 250 may be configured to connect the control unit 150 with the plurality of ECUs (e.g., the EMS 260, the EPS 280, the ESC 270, the SCM 290) .
  • the EMS 260 may be configured to determine an engine performance of the autonomous vehicle 130. In some embodiments, the EMS 260 may determine the engine performance of the autonomous vehicle 130 based on the control signals from the control unit 150. For example, the EMS 260 may determine the engine performance of the autonomous vehicle 130 based on a control signal associated with an acceleration from the control unit 150 when the current driving status is an autonomous driving status. In some embodiments, the EMS 260 may determine the engine performance of the autonomous vehicle 130 based on operations of a human driver. For example, the EMS 260 may determine the engine performance of the autonomous vehicle 130 based on a press on the accelerator done by the human driver when the current driving status is a manual driving status.
  • the EMS 260 may include a plurality of sensors and a micro-processor.
  • the plurality of sensors may be configured to detect one or more physical signals and convert the one or more physical signals to electrical signals for processing.
  • the plurality of sensors may include a variety of temperature sensors, an air flow sensor, a throttle position sensor, a pump pressure sensor, a speed sensor, an oxygen sensor, a load sensor, a knock sensor, or the like, or any combination thereof.
  • the one or more physical signals may include an engine temperature, an engine intake air volume, a cooling water temperature, an engine speed, or the like, or any combination thereof.
  • the micro-processor may determine the engine performance based on a plurality of engine control parameters.
  • the micro-processor may determine the plurality of engine control parameters based on the plurality of electrical signals.
  • the plurality of engine control parameters may be determined to optimize the engine performance.
  • the plurality of engine control parameters may include an ignition timing, a fuel delivery, an idle air flow, or the like, or any combination thereof.
  • the throttling system 265 may be configured to change motions of the autonomous vehicle 130. For example, the throttling system 265 may determine a velocity of the autonomous vehicle 130 based on an engine output. For another example, the throttling system 265 may cause an acceleration of the autonomous vehicle 130 based on the engine output.
  • the throttling system 265 may include fuel injectors, a fuel pressure regulator, an auxiliary air valve, a temperature switch, a throttle, an idling speed motor, a fault indicator, ignition coils, relays, or the like, or any combination thereof.
  • the throttling system 265 may be an external executor of the EMS 260.
  • the throttling system 265 may be configured to control the engine output based on the plurality of engine control parameters determined by the EMS 260.
  • the ESC 270 may be configured to improve the stability of the vehicle.
  • the ESC 270 may improve the stability of the vehicle by detecting and reducing loss of traction.
  • the ESC 270 may control operations of the braking system 275 to help steer the vehicle in response to a determination that a loss of steering control is detected by the ESC 270.
  • the ESC 270 may improve the stability of the vehicle when the vehicle starts on an uphill slope by braking.
  • the ESC 270 may further control the engine performance to improve the stability of the vehicle.
  • the ESC 270 may reduce an engine power when a probable loss of steering control happens. The loss of steering control may happen when the vehicle skids during emergency evasive swerves, when the vehicle understeers or oversteers during poorly judged turns on slippery roads, etc.
  • the braking system 275 may be configured to control a motion state of the autonomous vehicle 130. For example, the braking system 275 may decelerate the autonomous vehicle 130. For another example, the braking system 275 may stop the autonomous vehicle 130 in one or more road conditions (e.g., a downhill slope) . As still another example, the braking system 275 may keep the autonomous vehicle 130 at a constant velocity when driving on a downhill slope.
  • the braking system 275 may include a mechanical control component, a hydraulic unit, a power unit (e.g., a vacuum pump) , an executing unit, or the like, or any combination thereof.
  • the mechanical control component may include a pedal, a handbrake, etc.
  • the hydraulic unit may include a hydraulic oil, a hydraulic hose, a brake pump, etc.
  • the executing unit may include a brake caliper, a brake pad, a brake disc, etc.
  • the EPS 280 may be configured to control electric power supply of the autonomous vehicle 130.
  • the EPS 280 may supply, transfer, and/or store electric power for the autonomous vehicle 130.
  • the EPS 280 may control power supply to the steering system 295.
  • the EPS 280 may supply a large electric power to the steering system 295 to create a large steering torque for the autonomous vehicle 130, in response to a determination that a steering wheel is turned to a limit (e.g., a left turn limit, a right turn limit) .
  • the SCM 290 may be configured to control the steering wheel of the vehicle.
  • the SCM 290 may lock/unlock the steering wheel of the vehicle.
  • the SCM 290 may lock/unlock the steering wheel of the vehicle based on the current driving status of the vehicle.
  • the SCM 290 may lock the steering wheel of the vehicle in response to a determination that the current driving status is an autonomous driving status.
  • the SCM 290 may further retract a steering column shaft in response to a determination that the current driving status is an autonomous driving status.
  • the SCM 290 may unlock the steering wheel of the vehicle in response to a determination that the current driving status is a semi-autonomous driving status, a manual driving status, and/or an error status.
  • the SCM 290 may control the steering of the autonomous vehicle 130 based on the control signals of the control unit 150.
  • the control signals may include information related to a turning direction, a turning location, a turning angle, or the like, or any combination thereof.
  • the steering system 295 may be configured to steer the autonomous vehicle 130.
  • the steering system 295 may steer the autonomous vehicle 130 based on signals transmitted from the SCM 290.
  • the steering system 295 may steer the autonomous vehicle 130 based on the control signals of the control unit 150 transmitted from the SCM 290 in response to a determination that the current driving status is an autonomous driving status.
  • the steering system 295 may steer the autonomous vehicle 130 based on operations of a human driver. For example, the steering system 295 may turn the autonomous vehicle 130 to a left direction when the human driver turns the steering wheel to a left direction in response to a determination that the current driving status is a manual driving status.
  • FIG. 3 is a schematic diagram illustrating exemplary hardware components of a computing device 300.
  • the computing device 300 may be a special purpose computing device for autonomous driving, such as a single-board computing device including one or more microchips. Further, the control unit 150 may include one or more of the computing device 300. The computing device 300 may be used to implement the method and/or system described in the present disclosure via its hardware, software program, firmware, or a combination thereof.
  • the computing device 300 may include COM ports 350 connected to and from a network connected thereto to facilitate data communications.
  • the computing device 300 may also include a processor 320, in the form of one or more processors, for executing computer instructions.
  • the computer instructions may include, for example, routines, programs, objects, components, data structures, procedures, modules, and functions, which perform particular functions described herein.
  • the processor 320 may access instructions for operating the autonomous vehicle 130 and execute the instructions to determine a driving path for the autonomous vehicle.
  • the processor 320 may include one or more hardware processors built in one or more microchips, such as a microcontroller, a microprocessor, a reduced instruction set computer (RISC) , an application-specific integrated circuit (ASIC) , an application-specific instruction-set processor (ASIP) , a central processing unit (CPU) , a graphics processing unit (GPU) , a physics processing unit (PPU) , a microcontroller unit, a digital signal processor (DSP) , a field programmable gate array (FPGA) , an advanced RISC machine (ARM) , a programmable logic device (PLD) , any circuit or processor capable of executing one or more functions, or the like, or any combinations thereof.
  • the exemplary computer device 300 may include an internal communication bus 310, program storage and data storage of different forms, for example, a disk 270, and a read only memory (ROM) 330, or a random access memory (RAM) 340, for various data files to be processed and/or transmitted by the computer.
  • the exemplary computer device 300 may also include program instructions stored in the ROM 330, RAM 340, and/or other type of non-transitory storage medium to be executed by the processor 320.
  • the methods and/or processes of the present disclosure may be implemented as the program instructions.
  • the computing device 300 also includes an I/O component 360, supporting input/output between the computer and other components (e.g., user interface elements) .
  • the computing device 300 may also receive programming and data via network communications.
  • the computing device 300 in the present disclosure may also include multiple processors, thus operations and/or method steps that are performed by one processor as described in the present disclosure may also be jointly or separately performed by the multiple processors.
  • for example, if the processor 320 of the computing device 300 executes both step A and step B, it should be understood that step A and step B may also be performed by two different processors jointly or separately in the computing device 300 (e.g., the first processor executes step A and the second processor executes step B, or the first and second processors jointly execute steps A and B) .
  • when an element in the control system in FIG. 2 performs, the element may perform through electrical signals and/or electromagnetic signals.
  • when a sensor 142, 144, or 146 sends out detected information, such as a digital photo or a LiDAR point cloud image, the information may be transmitted to a receiver in the form of electronic signals.
  • the control unit 150 may receive the electronic signals of the detected information and may operate logic circuits in its processor to process such information.
  • a processor of the control unit 150 may generate electrical signals encoding the command and then send the electrical signals to an output port. Further, when the processor retrieves data from a storage medium, it may send out electrical signals to a read device of the storage medium, which may read structured data in the storage medium. The structured data may be transmitted to the processor in the form of electrical signals via a bus of the control unit 150.
  • an electrical signal may refer to one electrical signal, a series of electrical signals, and/or a plurality of discrete electrical signals.
  • FIG. 4 is a block diagram illustrating an exemplary sensing system according to some embodiments of the present disclosure.
  • the sensing system 140 may be in communication with a control unit 150 to send raw sensing data (e.g., images) or preprocessed sensing data to the control unit 150.
  • the sensing system 140 may include at least one camera 410, at least one LiDAR detector 420, at least one radar detector 430, and a processing unit 440.
  • the camera 410, the LiDAR detector 420, and the radar detector 430 may correspond to the sensors 142, 144, and 146, respectively.
  • the camera 410 may be configured to capture camera image (s) of environmental data around a vehicle.
  • the camera 410 may include an unchangeable lens camera, a compact camera, a 3D camera, a panoramic camera, an audio camera, an infrared camera, a digital camera, or the like, or any combination thereof.
  • multiple cameras of the same or different types may be mounted on a vehicle.
  • an infrared camera may be mounted on a back hood of the vehicle to capture infrared images of objects behind the vehicle, especially, when the vehicle is backing up at night.
  • an audio camera may be mounted on a reflector of the vehicle to capture images of objects at a side of the vehicle. The audio camera may mark a sound level of different sections or objects on the images obtained.
  • the images captured by the multiple cameras 410 mounted on the vehicle may collectively cover a whole region around the vehicle.
  • the multiple cameras 410 may be mounted on different parts of the vehicle, including but not limited to a window, a car body, a rear-view mirror, a handle, a light, a sunroof and a license plate.
  • the window may include a front window, a back window, a side window, etc.
  • the car body may include a front hood, a back hood, a roof, a chassis, a side, etc.
  • the multiple cameras 410 may be attached to or mounted on accessories in the compartment of the vehicle (e.g., a steering wheel, a cowl, a reflector) .
  • the method of mounting may include adhesive bonding, bolt and nut connection, bayonet fitting, vacuum fixation, or the like, or any combination thereof.
  • the LiDAR device 420 may be configured to obtain high resolution images within a certain range of the vehicle.
  • the LiDAR device 420 may be configured to detect objects within 35 meters of the vehicle.
  • the LiDAR device 420 may be configured to generate LiDAR point cloud images of the surrounding environment of the vehicle to which the LiDAR device 420 is mounted.
  • the LiDAR device 420 may include a laser generator and a sensor.
  • the laser beam may include an ultraviolet light, a visible light, a near infrared light, etc.
  • the laser generator may illuminate the objects with a pulsed laser beam at a fixed predetermined frequency or predetermined varying frequencies.
  • the laser beam may reflect back after contacting the surface of the objects and the sensor may receive the reflected laser beam. Through the reflected laser beam, the LiDAR device 420 may measure the distance between the surface of the objects and the LiDAR device 420.
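  • The distance measurement follows the standard time-of-flight relation: the range is half the pulse's round-trip time multiplied by the speed of light. A minimal sketch:
```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def lidar_range(round_trip_time_s):
    # Distance to the reflecting surface from the pulse's round-trip (time-of-flight) time.
    return SPEED_OF_LIGHT * round_trip_time_s / 2.0

print(lidar_range(233e-9))  # a pulse returning after ~233 ns -> a surface about 35 m away
```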
  • the LiDAR device 420 may rotate and use the laser beam to scan the surrounding environment of the vehicle, thereby generating a LiDAR point cloud image according to the reflected laser beam. Since the LiDAR device 420 rotates and scans along limited heights of the vehicle’s surrounding environment, the LiDAR point cloud image measures the 360° environment surrounding the vehicle between the predetermined heights of the vehicle.
  • the LiDAR point cloud image may be a static or dynamic image. Further, since each point in the LiDAR point cloud image measures the distance between the LiDAR device and a surface of an object from which the laser beam is reflected, the LiDAR point cloud image is a 3D image.
  • the LiDAR point cloud image may be a real-time image illustrating a real-time propagation of the laser beam.
  • the LiDAR device 420 may be mounted on the roof or front window of the vehicle, however, it should be noted that the LiDAR device 420 may also be installed on other parts of the vehicle, including but not limited to a window, a car body, a rear-view mirror, a handle, a light, a sunroof and a license plate.
  • the radar device 430 may be configured to generate a radar image by measuring distances to objects around a vehicle via radio waves. Compared with the LiDAR device 420, the radar device 430 may be less precise (with lower resolution) but may have a wider detection range. Accordingly, the radar device 430 may be used to measure objects farther than the detection range of the LiDAR device 420. For example, the radar device 430 may be configured to measure objects between 35 meters and 100 meters from the vehicle.
  • the radar device 430 may include a transmitter for producing electromagnetic waves in the radio or microwave domain, a transmitting antenna for transmitting or broadcasting the radio waves, a receiving antenna for receiving the radio waves, and a processor for generating a radar image.
  • the radar device 430 may be mounted on the roof or front window of the vehicle, however, it should be noted that the radar device 430 may also be installed on other parts of the vehicle, including but not limited to a window, a car body, a rear-view mirror, a handle, a light, a sunroof and a license plate.
  • the LiDAR image and the radar image may be fused to generate a compensated image.
  • Detailed methods regarding the fusion of the LiDAR image and the radar image may be found elsewhere in present disclosure (See, e.g., FIG. 15 and the descriptions thereof) .
  • the camera 410, the LiDAR device 420 and the radar device 430 may work concurrently or individually. In the case that they work individually at different frame rates, a synchronization method may be employed. Detailed methods regarding the synchronization of the frames of the camera 410, the LiDAR device 420 and/or the radar device 430 may be found elsewhere in the present disclosure (see, e.g., FIG. 16 and the descriptions thereof) .
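  • For illustration, one simple synchronization approach when the sensors run at different frame rates is to pair each frame with the other sensor's nearest-in-time frame; this nearest-timestamp rule is an assumption, as the disclosure defers the details to FIG. 16.
```python
import bisect

def nearest_frame(timestamps, query_time):
    # Return the timestamp (from a sorted list) closest to the query time.
    i = bisect.bisect_left(timestamps, query_time)
    candidates = timestamps[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda t: abs(t - query_time))

camera_times = [0.000, 0.033, 0.066, 0.100]     # ~30 Hz camera frames (illustrative)
lidar_time = 0.050                              # a LiDAR frame timestamp (illustrative)
print(nearest_frame(camera_times, lidar_time))  # pairs the LiDAR frame with t = 0.066
```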
  • the sensing system 140 may further include a processing unit 440 configured to pre-process the generated images (e.g., camera image, LiDAR image, and radar image) .
  • the pre-processing of the images may include smoothing, filtering, denoising, reconstructing, or the like, or any combination thereof.
  • FIG. 5 is a flowchart illustrating an exemplary process for generating a LiDAR point cloud image on which 3D shapes of objects are marked according to some embodiments of the present disclosure.
  • the process 500 may be implemented in the autonomous vehicle as illustrated in FIG. 1.
  • the process 500 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) .
  • the present disclosure takes the control unit 150 as an example to execute the instruction.
  • control unit 150 may obtain a LiDAR point cloud image (also referred to as a first LiDAR point cloud image) around a base station.
  • the base station may be any device that the LiDAR device, the radar, and the camera are mounted on.
  • the base station may be a movable platform, such as a vehicle (e.g., a car, an aircraft, a ship etc. ) .
  • the base station may also be a stationary platform, such as a detection station or an airport control tower.
  • the present disclosure takes a vehicle or a device (e.g., a rack) mounted on the vehicle as an example of the base station.
  • the first LiDAR point cloud image may be generated by the LiDAR device 420.
  • the first LiDAR point cloud image may be a 3D point cloud image including voxels corresponding to one or more objects around the base station.
  • the first LiDAR point cloud image may correspond to a first time frame (also referred to as a first time point) .
  • control unit 150 may identify one or more objects in the first LiDAR point cloud image.
  • the one or more objects may include pedestrians, vehicles, obstacles, buildings, signs, traffic lights, animals, or the like, or any combination thereof.
  • the control unit 150 may identify the regions and types of the one or more objects in 520. In some embodiments, the control unit 150 may only identify the regions. For example, the control unit 150 may identify a first region of the LiDAR point cloud image as a first object, a second region of the LiDAR point cloud image as a second object and remaining regions as ground (or air) . As another example, the control unit 150 may identify the first region as a pedestrian and the second region as a vehicle.
  • the control unit 150 may first determine the height of the points (or voxels) around the vehicle-mounted base station (e.g., relative to the ground, based on the height of the vehicle plus the height of the vehicle-mounted device) .
  • the points that are too low (ground) , or too high (e.g., at a height that is unlikely to be an object to avoid or to consider during driving) may be removed by the control unit 150 before identifying the one or more objects.
  • the remaining points may be clustered into a plurality of clusters.
  • the remaining points may be clustered based on their 3D coordinates (e.g., cartesian coordinates) in the 3D point cloud image (e.g., points whose pairwise distances are less than a threshold are clustered into a same cluster) .
  • the remaining points may be swing scanned before being clustered into the plurality of clusters.
  • the swing scanning may include converting the remaining points in the 3D point cloud image from a 3D cartesian coordinate system to a polar coordinate system.
  • the polar coordinate system may include an origin or a reference point.
  • the polar coordinate of each of the remaining points may be expressed as a straight-line distance from the origin, and an angle from the origin to the point.
  • a graph may be generated based on the polar coordinates of the remaining points (e.g., angle from the origin as x-axis or horizontal axis and distance from the origin as y-axis or vertical axis) .
  • the points in the graph may be connected to generate a curve that includes sections with large curvatures and sections with small curvatures. Points on a section with a small curvature are likely the points on a same object and may be clustered into a same cluster. Points on a section with a large curvature are likely the points on different objects and may be clustered into different clusters. Each cluster may correspond to an object. The method of identifying the one or more objects may be found in FIG. 11.
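  • As a hedged illustration of the swing-scan idea above, the sketch below sweeps the points in angular order and starts a new cluster wherever the range jumps sharply between neighboring angles — a simplification standing in for the curvature criterion; the function name and threshold are assumptions, not values from the disclosure.

```python
import numpy as np

def swing_scan_cluster(points_xyz, jump_threshold=0.8):
    """Cluster LiDAR points by sweeping them in angular order and
    starting a new cluster wherever the range changes abruptly."""
    x, y = points_xyz[:, 0], points_xyz[:, 1]
    angle = np.arctan2(y, x)      # angle from the origin
    dist = np.hypot(x, y)         # straight-line distance from the origin
    order = np.argsort(angle)     # sweep the scene once, like the rotating sensor

    labels = np.empty(len(points_xyz), dtype=int)
    current = 0
    labels[order[0]] = current
    for prev, cur in zip(order[:-1], order[1:]):
        # A sharp change in range between neighboring angles suggests a new object.
        if abs(dist[cur] - dist[prev]) > jump_threshold:
            current += 1
        labels[cur] = current
    return labels
```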
  • control unit 150 may obtain a camera image that is taken at a same (or substantially the same or similar) time and angle as the first LiDAR point cloud image.
  • the control unit 150 may identify the one or more objects in the camera image and directly treat them as the one or more objects in the LiDAR point cloud image.
  • the control unit 150 may determine one or more locations of the one or more objects in the first LiDAR point image. The control unit 150 may consider each identified object separately and perform operation 530 for each of the one or more objects individually.
  • the locations of the one or more objects may be a geometric center or center of gravity of the clustered region of the one or more objects.
  • the locations of the one or more objects may be preliminary locations that are adjusted or re-determined after the 3D shapes of the one or more objects are generated in 540.
  • the operations 520 and 530 may be performed in any order, or combined as one operation.
  • the control unit 150 may determine locations of points corresponding to one or more unknown objects, cluster the points into a plurality of clusters and then identify the clusters as objects.
  • the control unit 150 may obtain a camera image.
  • the camera image may be taken by the camera at the same (or substantially the same, or similar) time and angle as the LiDAR point cloud image.
  • the control unit 150 may determine locations of the objects in the camera image based on a neural network (e.g., a tiny yolo network as described in FIG. 10) .
  • the control unit 150 may determine the locations of the one or more objects in the LiDAR point cloud image by mapping locations in the camera image to the LiDAR point cloud image.
  • the mapping of locations from a 2D camera image to a 3D LiDAR point cloud image may include a conic projection, etc.
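  • One way such a mapping can be realized (offered only as a sketch under assumed camera intrinsics K and a LiDAR-to-camera transform T_cam_from_lidar, neither of which is specified in the disclosure) is to project every LiDAR point into the image with a pinhole model and keep the points that fall inside the detected 2D box:

```python
import numpy as np

def points_in_2d_box(points_xyz, K, T_cam_from_lidar, box_xyxy):
    """Return a mask of LiDAR points whose image projection lies inside
    a 2D bounding box (x_min, y_min, x_max, y_max) in pixel coordinates."""
    # Transform LiDAR points into the camera frame (homogeneous coordinates).
    ones = np.ones((points_xyz.shape[0], 1))
    pts_cam = (T_cam_from_lidar @ np.hstack([points_xyz, ones]).T).T[:, :3]

    z = pts_cam[:, 2]
    in_front = z > 1e-6                       # keep points in front of the camera
    u = np.full(len(z), np.inf)
    v = np.full(len(z), np.inf)
    u[in_front] = K[0, 0] * pts_cam[in_front, 0] / z[in_front] + K[0, 2]
    v[in_front] = K[1, 1] * pts_cam[in_front, 1] / z[in_front] + K[1, 2]

    x_min, y_min, x_max, y_max = box_xyxy
    inside = (u >= x_min) & (u <= x_max) & (v >= y_min) & (v <= y_max)
    return in_front & inside
```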
  • the operations 520 and 530 for identifying the objects and determining the locations of the objects may be referred to as a coarse detection.
  • control unit 150 may generate a 3D shape (e.g., a 3D box) for each of the one or more objects.
  • the operation 540 for generating a 3D shape for the objects may be referred to as a fine detection.
  • the control unit 150 may generate a second LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects. For example, the control unit 150 may mark the first LiDAR point cloud image by the 3D shapes of the one or more objects at their corresponding locations to generate the second LiDAR point cloud image.
  • FIGs. 6A-6C are a series of schematic diagrams of generating and marking a 3D shape of an object in LiDAR point cloud image according to some embodiments of the present disclosure
  • as shown in FIG. 6A, an object 620 may be located near a base station (e.g., a rack of the LiDAR device or the vehicle itself) .
  • the control unit 150 may identify and position the object 620 by a method disclosed in process 500.
  • the control unit 150 may mark the object 620 after identifying and positioning it as shown in FIG. 6B.
  • the control unit 150 may further determine a 3D shape of the object 620 and mark the object 620 in the 3D shape as shown in FIG. 6C.
  • FIG. 7 is a flowchart illustrating an exemplary process for generating a marked camera image according to some embodiments of the present disclosure.
  • the process 700 may be implemented in the autonomous vehicle as illustrated in FIG. 1.
  • the process 700 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) .
  • the present disclosure takes the control unit 150 as an example to execute the instruction.
  • the control unit 150 may obtain a first camera image.
  • the camera image may be obtained by the camera 410.
  • the camera image may be a 2D image, including one or more objects around a vehicle.
  • the control unit 150 may identify the one or more objects and the locations of the one or more objects.
  • the identification may be performed based on a neural network.
  • the neural network may include an artificial neural network, a convolutional neural network, a you only look once network, a tiny yolo network, or the like, or any combination thereof.
  • the neural network may be trained by a plurality of camera image samples in which the objects are identified manually or artificially.
  • the control unit 150 may input the first camera image into the trained neural network and the trained neural network may output the identifications and locations of the one or more objects.
  • control unit 150 may generate and mark 2D representations of 3D shapes of the one or more objects in the camera image.
  • the 2D representations of 3D shapes of the one or more objects may be generated by mapping 3D shapes of the one or more objects in LiDAR point cloud image to the camera image at the corresponding locations of the one or more objects. Detailed methods regarding the generation of the 2D representations of 3D shapes of the one or more objects in the camera image may be found in FIG. 8.
  • FIG. 8 is a flowchart illustrating an exemplary process for generating 2D representations of 3D shapes of the one or more objects in the camera image according to some embodiments of the present disclosure.
  • the process 800 may be implemented in the autonomous vehicle as illustrated in FIG. 1.
  • the process 800 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) .
  • the present disclosure takes the control unit 150 as an example to execute the instruction.
  • control unit 150 may obtain a 2D shape of the one or more target objects in the first camera image.
  • the first camera image may include only some of the objects present in the first LiDAR point cloud image.
  • objects that occur in both the first camera image and the first LiDAR point cloud image may be referred to as target objects in the present application.
  • a 2D shape described in the present disclosure may include but is not limited to a triangle, a rectangle (also referred to as a 2D box) , a square, a circle, an oval, and a polygon.
  • a 3D shape described in the present disclosure may include but is not limited to a cuboid (also referred to as a 3D box) , a cube, a sphere, a polyhedron, and a cone.
  • the 2D representation of a 3D shape may be a 2D shape drawn so as to convey the appearance of a 3D shape (e.g., a perspective drawing of a 3D box) .
  • the 2D shape of the one or more target objects may be generated by executing a neural network.
  • the neural network may include an artificial neural network, a convolutional neural network, a you only look once (yolo) network, a tiny yolo network, or the like, or any combination thereof.
  • the neural network may be trained by a plurality of camera image samples in which 2D shapes, locations, and types of the objects are identified manually or artificially.
  • the control unit 150 may input the first camera image into the trained neural network and the trained neural network may output the types, locations and 2D shapes of the one or more target objects.
  • the neural network may generate a camera image in which the one or more objects are marked with 2D shapes (e.g., 2D boxes) based on the first camera image.
  • control unit 150 may correlate the first camera image with the first LiDAR point cloud image.
  • a distance between each of the one or more target objects and the base station (e.g., the vehicle or the rack of the LiDAR device and camera on the vehicle) in the first camera image and the first LiDAR point cloud image may be measured and correlated.
  • the control unit 150 may correlate the distance between a target object and the base station in the first camera image with that in the first LiDAR point cloud image.
  • the size of 2D or 3D shape of the target object in the first camera image may be correlated with that in the first LiDAR point cloud image by the control unit 150.
  • the size of the target object and the distance between the target object and the base station in the first camera image may be proportional to that in the first LiDAR point cloud image.
  • the correlation between the first camera image and the first LiDAR point cloud image may include a mapping relationship or a conversion of coordinates between them.
  • the correlation may include a conversion from a 3D cartesian coordinate to a 2D plane of a 3D spherical coordinate centered at the base station.
  • control unit 150 may generate 2D representations of 3D shapes of the target objects based on the 2D shapes of the target objects and the correlation between the LiDAR point cloud image and the first camera image.
  • control unit 150 may first perform a registration between the 2D shapes of the target objects in the camera image and the 3D shapes of the target objects in the LiDAR point cloud image. The control unit 150 may then generate the 2D representations of 3D shapes of the target objects based on the 3D shapes of the target objects in the LiDAR point cloud image and the correlation. For example, the control unit 150 may perform a simulated conic projection from a center at the base station, and generate 2D representations of 3D shapes of the target objects at the plane of the 2D camera image based on the correlation between the LiDAR point cloud image and the first camera image.
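  • A hedged sketch of one common way to obtain such a 2D representation: compute the eight corners of the 3D box in the LiDAR frame, project them into the image plane, and draw the twelve box edges between the projected corners. The corner ordering, intrinsics K, and transform T_cam_from_lidar are illustrative assumptions.

```python
import numpy as np

def box_corners_3d(center, size, yaw):
    """Eight corners of a yawed 3D box given its center, (l, w, h) size,
    and yaw angle in radians."""
    l, w, h = size
    x = np.array([ 1,  1, -1, -1,  1,  1, -1, -1]) * l / 2
    y = np.array([ 1, -1, -1,  1,  1, -1, -1,  1]) * w / 2
    z = np.array([-1, -1, -1, -1,  1,  1,  1,  1]) * h / 2
    rot = np.array([[np.cos(yaw), -np.sin(yaw), 0],
                    [np.sin(yaw),  np.cos(yaw), 0],
                    [0,            0,           1]])
    return (rot @ np.vstack([x, y, z])).T + np.asarray(center)

def project_to_image(corners_xyz, K, T_cam_from_lidar):
    """Project 3D corners into pixel coordinates (assumes corners lie in
    front of the camera); returns an (8, 2) array."""
    ones = np.ones((corners_xyz.shape[0], 1))
    cam = (T_cam_from_lidar @ np.hstack([corners_xyz, ones]).T).T[:, :3]
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]

# The 2D representation of the 3D box is drawn by connecting projected corners:
# 4 bottom edges, 4 top edges, 4 vertical edges.
BOX_EDGES = [(0, 1), (1, 2), (2, 3), (3, 0),
             (4, 5), (5, 6), (6, 7), (7, 4),
             (0, 4), (1, 5), (2, 6), (3, 7)]
```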
  • control unit 150 may generate a second camera image by marking the one or more target objects in the first camera image based on their 2D representations of 3D shapes and the identified location in the first camera image.
  • FIGs. 9A and 9B are schematic diagrams of the same 2D camera image of a car according to some embodiments of the present disclosure.
  • in FIG. 9A, a vehicle 910 is identified and positioned, and a 2D box is marked on it.
  • the control unit 150 may perform a method disclosed in present application (e.g., process 800) to generate a 2D representation of a 3D box of the car.
  • the 2D representation of the 3D box of the car is marked on the car as shown in FIG. 9B.
  • FIG. 9B indicates not only the size of the car but also the depth of the car along an axis perpendicular to the plane of the camera image, and thus is better for understanding the location of the car.
  • FIG. 10 is a schematic diagram of a you only look once (yolo) network according to some embodiments of the present disclosure.
  • a yolo network may be a neural network that divides a camera image into regions and predicts bounding boxes and probabilities for each region.
  • the yolo network may be a multilayer neural network (e.g., including multiple layers) .
  • the multiple layers may include at least one convolutional layer (CONV) , at least one pooling layer (POOL) , and at least one fully connected layer (FC) .
  • the multiple layers of the yolo network may correspond to neurons arranged in multiple dimensions, including but not limited to width, height, center coordinate, confidence, and classification.
  • the CONV layer may compute the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the region it is connected to.
  • the POOL layer may perform a down sampling operation along the spatial dimensions (width, height) resulting in a reduced volume.
  • the function of the POOL layer may include progressively reducing the spatial size of the representation to reduce the number of parameters and computation in the network, and hence to also control overfitting.
  • the POOL Layer operates independently on every depth slice of the input and resizes it spatially, using the MAX operation.
  • each neuron in the FC layer may be connected to all the values in the previous volume and the FC layer may compute the classification scores.
  • 1010 may be an initial image in a volume of e.g., [448*448*3] , wherein “448” relates to a resolution (or number of pixels) and “3” relates to channels (RGB 3 channels) .
  • Images 1020-1070 may be intermediate images generated by multiple CONV layers and POOL layers. It may be noticed that the size of the image reduces and the dimension increases from image 1010 to 1070.
  • the volume of image 1070 may be [7*7*1024] , and the size of the image 1070 may not be reduced any more by extra CONV layers. Two fully connected layers may be arranged after 1070 to generate images 1080 and 1090.
  • Image 1090 may divide the original image into 49 regions, each region containing 30 dimensions and responsible for predicting a bounding box.
  • the 30 dimensions may include x, y, width, height for the bounding box’s rectangle, a confidence score, and a probability distribution over 20 classes. If a region is responsible for predicting a number of bounding boxes, the dimension may be multiplied by the corresponding number. For example, if a region is responsible for predicting 5 bounding boxes, the dimension of 1090 may be 150.
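  • For reference, in the original yolo formulation the per-region depth is B×5 + C (B boxes with x, y, width, height and a confidence score each, plus C class probabilities), so S = 7, B = 2, C = 20 gives the 7*7*30 volume of image 1090; the disclosure’s example instead multiplies the full 30 dimensions by the number of boxes per region. The helper below follows the original convention and is only an illustrative assumption.

```python
def yolo_output_shape(grid_size=7, boxes_per_cell=2, num_classes=20):
    """Output volume of a yolo-style detector: each of the grid_size*grid_size
    regions predicts boxes_per_cell boxes (x, y, w, h, confidence) plus one
    class probability distribution."""
    depth = boxes_per_cell * 5 + num_classes
    return (grid_size, grid_size, depth)

# (7, 7, 30): 49 regions, 30 values per region.
print(yolo_output_shape())
```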
  • a tiny yolo network may be a network with similar structure but fewer layers than a yolo network, e.g., fewer convolutional layers and fewer pooling layers.
  • the tiny yolo network may be based on the Darknet reference network and may be much faster but less accurate than a normal yolo network.
  • FIG. 11 is a flowchart illustrating an exemplary process for identifying the objects in a LiDAR point cloud image according to some embodiments of the present disclosure.
  • the process 1100 may be implemented in the autonomous vehicle as illustrated in FIG. 1.
  • the process 1100 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) .
  • the present disclosure takes the control unit 150 as an example to execute the instruction.
  • control unit 150 may obtain coordinates of a plurality of points (or voxels) in the LiDAR point cloud image (e.g., the first LiDAR point image) .
  • the coordinate of each of the plurality of points may be a relative coordinate corresponding to an origin (e.g., the base station or the source of the laser beam) .
  • the control unit 150 may remove uninterested points from the plurality of points according to their coordinates.
  • the uninterested points may be points that are too low (e.g., on the ground) or too high (e.g., at a height that cannot be an object to avoid or to consider when driving) in the LiDAR point cloud image.
  • the control unit 150 may cluster the remaining points in the plurality of points in the LiDAR point cloud image into one or more clusters based on a point cloud clustering algorithm.
  • a spatial distance (or a Euclidean distance) between any two of the remaining points in a 3D cartesian coordinate system may be measured and compared with a threshold. If the spatial distance between two points is less than or equal to the threshold, the two points are considered from a same object and clustered into a same cluster.
  • the threshold may vary dynamically based on the distances between remaining points.
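  • A minimal sketch of such threshold-based clustering (a breadth-first grouping in which points within the threshold of any point already in a cluster join that cluster; the fixed threshold stands in for the dynamically varying one described above, and all names are assumptions):

```python
import numpy as np
from collections import deque

def euclidean_cluster(points_xyz, threshold=0.5):
    """Group points whose Euclidean distance is within `threshold` of a point
    already in the cluster; returns one integer label per point."""
    n = len(points_xyz)
    labels = np.full(n, -1, dtype=int)
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            # Unlabeled points close enough to point i join cluster `current`.
            d = np.linalg.norm(points_xyz - points_xyz[i], axis=1)
            neighbors = np.where((d <= threshold) & (labels == -1))[0]
            labels[neighbors] = current
            queue.extend(neighbors.tolist())
        current += 1
    return labels
```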
  • the remaining points may be swing scanned before being clustered into the plurality of clusters.
  • the swing scanning may include converting the remaining points in the 3D point cloud image from a 3D cartesian coordinate system to a polar coordinate system.
  • the polar coordinate system may include an origin or a reference point.
  • the polar coordinate of each of the remaining points may be expressed as a straight-line distance from the origin, and an angle from the origin to the point.
  • a graph may be generated based on the polar coordinates of the remaining points (e.g., angle from the origin as x-axis or horizontal axis and distance from the origin as y-axis or vertical axis) .
  • the points in the graph may be connected to generate a curve that includes sections with large curvatures and sections with small curvatures.
  • Points on a section of the curve with a small curvature are likely the points on a same object and may be clustered into a same cluster. Points on a section of the curve with a large curvature are likely the points on different objects and may be clustered into different clusters.
  • the point cloud clustering algorithm may include employing a pre-trained clustering model.
  • the clustering model may include a plurality of classifiers with pre-trained parameters. The clustering model may be further updated when clustering the remaining points.
  • the control unit 150 may select at least one of the one or more clusters as a target cluster. For example, some of the one or more clusters may not be at a size of any meaningful object, such as the size of a leaf, a plastic bag, or a water bottle, and may be removed. In some embodiments, only the cluster that satisfies a predetermined size of the objects may be selected as the target cluster.
  • FIGs. 12A-12E are a series of schematic diagrams of identifying an object in a LiDAR point cloud image according to some embodiments of the present disclosure.
  • FIG. 12A is a schematic LiDAR point cloud image around a vehicle 1210.
  • the control unit 150 may obtain the coordinates of the points in FIG. 12A and may remove points that are too low or too high to generate FIG. 12B. Then the control unit 150 may swing scan the points in FIG. 12B and measure a distance and angle of each of the points in FIG. 12B from a reference point or origin as shown in FIG. 12C. The control unit 150 may further cluster the points into one or more clusters based on the distances and angles as shown in FIG. 12D.
  • the control unit 150 may extract each of the one or more clusters individually as shown in FIG. 12E and generate a 3D shape of the object in the extracted cluster. Detailed methods regarding the generation of the 3D shape of the object in the extracted cluster may be found elsewhere in the present disclosure (See, e.g., FIG. 13 and the descriptions thereof) .
  • FIG. 13 is a flowchart illustrating an exemplary process for generating a 3D shape of an object in a LiDAR point cloud image according to some embodiments of the present disclosure.
  • the process 1300 may be implemented in the autonomous vehicle as illustrated in FIG. 1.
  • the process 1300 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) .
  • the present disclosure takes the control unit 150 as an example to execute the instruction.
  • control unit 150 may determine a preliminary 3D shape of the object.
  • the preliminary 3D shape may be a voxel, a cuboid (also referred to as 3D box) , a cube, etc.
  • the control unit 150 may determine a center point of the object. The center point of the object may be determined based on the coordinates of the points in the object. For example, the control unit 150 may determine the center point as the average value of the coordinates of the points in the object. Then the control unit 150 may place the preliminary 3D shape at the center point of the object (e.g., of the clustered and extracted LiDAR point cloud image of the object) . For example, a cuboid of a preset size may be placed on the center point of the object by the control unit 150.
  • since the LiDAR point cloud image only includes points on the surfaces of objects that reflect the laser beam, the points only reflect the surface shape of the objects.
  • ideally, the points of an object would lie tightly along a contour of the shape of the object, with no points inside the contour and no points outside the contour. In reality, however, because of measurement errors, the points are scattered around the contour. Therefore, a shape proposal may be needed to identify a rough shape of the object for the purpose of autonomous driving.
  • the control unit 150 may tune up the 3D shape to obtain an ideal size, shape, orientation, and position and use the 3D shape to serve as the shape proposal.
  • the control unit 150 may adjust at least one of parameters including a height, a width, a length, a yaw, or an orientation of the preliminary 3D shape to generate a 3D shape proposal.
  • the operations 1320 (and operations 1330 and 1340) may be performed iteratively. In each iteration, one or more of the parameters may be adjusted. For example, the height of the 3D shape is adjusted in the first iteration, and the length of the 3D shape is adjusted in the second iteration. As another example, both the height and length of the 3D shape are adjusted in the first iteration, and the height and width of the 3D shape are adjusted in the second iteration.
  • the adjustment of the parameters may be an increment or a decrement. Also, the adjustment of the parameters in each iteration may be the same or different. In some embodiments, the adjustment of height, width, length and yaw may be employed based on a grid searching method; a sketch is given below.
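  • A hedged sketch of the grid-searching idea (the candidate ranges, the placement of the preliminary shape at the cluster centroid, and the `score` callable that stands in for the loss function described below are all illustrative assumptions):

```python
import itertools
import numpy as np

def fit_box_by_grid_search(points_xyz, score, heights, widths, lengths, yaws):
    """Place a 3D box proposal at the cluster's center point and keep the
    parameter combination with the lowest loss score."""
    center = points_xyz.mean(axis=0)          # preliminary placement at the centroid
    best = None
    for h, w, l, yaw in itertools.product(heights, widths, lengths, yaws):
        proposal = {"center": center, "size": (l, w, h), "yaw": yaw}
        s = score(points_xyz, proposal)       # loss value for this proposal
        if best is None or s < best[0]:
            best = (s, proposal)
    return best[1]

# Example candidate grids (assumed values, in meters and radians):
# heights = np.arange(1.2, 2.2, 0.2); widths = np.arange(1.4, 2.2, 0.2)
# lengths = np.arange(3.0, 5.5, 0.5); yaws = np.arange(0, np.pi, np.pi / 8)
```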
  • An ideal shape proposal should serve as a reliable reference shape for the autonomous vehicle to plan its driving path.
  • the driving path should allow the vehicle to safely drive around the object while operating a minimum degree of turning to the left or right, so that the driving is as smooth as possible.
  • the shape proposal may not be required to precisely describe the shape of the object, but must be big enough to cover the object so that the autonomous vehicle may reliably rely on the shape proposal to determine a driving path without colliding and/or crashing into the object.
  • the shape proposal should not be unnecessarily big either, which would reduce the efficiency of the driving path in passing around the object.
  • the control unit 150 may evaluate a loss function, which serves as a measure of how good the shape proposal is in describing the object for the purpose of autonomous driving path planning. The lesser the score or value of the loss function, the better the shape proposal describes the object.
  • the control unit 150 may calculate a score (or a value) of the loss function of the 3D shape proposal.
  • the loss function may include three parts: L_inbox, L_suf and L_other.
  • the loss function of the 3D shape proposal may be expressed as follows:
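  • The concrete form of equation (1) is not reproduced in this text. A generic reconstruction consistent with the component definitions below — offered only as an assumption, not as the disclosure’s actual formula — is a sum of the three parts, with the constants m, n, a, b and c weighting the individual terms in a way the extracted text does not preserve:

```latex
% Illustrative reconstruction only -- not the disclosure's actual equation (1):
L \;=\; L_{\mathrm{inbox}} + L_{\mathrm{suf}} + L_{\mathrm{other}}, \qquad
L_{\mathrm{suf}} \;\propto\; \sum_{p \,\in\, P_{\mathrm{all}}} \mathrm{dis}(p), \qquad
L_{\mathrm{other}} \;=\; f(N) + L_{\mathrm{min}}(V)
```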
  • L may denote an overall score of the 3D shape proposal
  • L_inbox may denote a score of the 3D shape proposal relating to the number of points of the object inside the 3D shape proposal
  • L_suf may denote a score describing how close the 3D shape proposal is to the true shape of the object, measured by distances of the points to the surface of the shape proposal.
  • a smaller score of L_suf means the 3D shape proposal is closer to the surface shape or contour of the object.
  • L_suf (car) may denote a score of the 3D shape proposal relating to distances between points of a car and the surface of the 3D shape proposal
  • L_suf (ped) may denote a score of the 3D shape proposal relating to distances between points of a pedestrian and the surface of the 3D shape proposal
  • L_other may denote a score of the 3D shape proposal due to other bonuses or penalties.
  • N may denote number of points
  • P_all may denote all the points of the object
  • P_out may denote points outside the 3D shape proposal
  • P_in may denote points inside the 3D shape proposal
  • P_behind may denote points behind the 3D shape proposal (e.g., points on the back side of the 3D shape proposal)
  • dis may denote distance from the points of the object to the surface of the 3D shape proposal.
  • m, n, a, b and c are constants; for example, m may be 2.0, n may be 1.5, a may be 2.0, b may be 0.6, and c may be 1.2.
  • L_inbox may be configured to minimize the number of points inside the 3D shape proposal. Therefore, the fewer the points inside, the smaller the score of L_inbox.
  • L_suf may be configured to encourage a shape and orientation of the 3D shape proposal such that as many points as possible are close to the surface of the 3D shape proposal. Accordingly, the smaller the accumulated distances of the points to the surface of the 3D shape proposal, the smaller the score of L_suf.
  • L_other is configured to encourage a dense cluster of points, i.e., a larger number of points in the cluster and a smaller volume of the 3D shape proposal.
  • f (N) is defined as a function of the total number of points in the 3D shape proposal, i.e., the more points in the 3D shape proposal, the better the loss function, and thereby the smaller the score of f (N) ;
  • L_min (V) is defined as a restraint on the volume of the 3D shape proposal, which tries to minimize the volume of the 3D shape proposal, i.e., the smaller the volume of the 3D shape proposal, the smaller the score of L_min (V) .
  • the loss function L in equation (1) incorporates balanced consideration of different factors that encourage the 3D shape proposal to be close to the contour of the object without being unnecessarily big.
  • the control unit 150 may determine whether the score of the 3D shape proposal satisfies a preset condition.
  • the preset condition may include that the score is less than or equal to a threshold, the score doesn’t change over a number of iterations, a certain number of iterations is performed, etc.
  • in response to a determination that the score of the 3D shape proposal does not satisfy the preset condition, the process 1300 may proceed back to 1320; otherwise, the process 1300 may proceed to 1360.
  • the control unit 150 may further adjust the 3D shape proposal.
  • the parameters that are adjusted in subsequent iterations may be different from the current iteration.
  • for example, the control unit 150 may perform a first set of adjustments on the height of the 3D shape proposal in the first five iterations and find that the score of the 3D shape proposal cannot be reduced below the threshold by adjusting the height alone.
  • the control unit 150 may then perform a second set of adjustments on the width, the length, and the yaw of the 3D shape proposal in the next 10 iterations.
  • if the score of the 3D shape proposal is still higher than the threshold after the second set of adjustments, the control unit 150 may perform a third set of adjustments on the orientation (e.g., the location or center point) of the 3D shape proposal.
  • the adjustments of parameters may be performed in any order and the number and type of parameters in each adjustment may be same or different.
  • control unit 150 may determine the 3D shape proposal as the 3D shape of the object (or nominal 3D shape of the object) .
  • FIGs. 14A-14D are a series of schematic diagrams of generating a 3D shape of an object in a LiDAR point cloud image according to some embodiments of the present disclosure.
  • FIG. 14A is a clustered and extracted LiDAR point cloud image of an object.
  • the control unit 150 may generate a preliminary 3D shape and may adjust a height, a width, a length, and a yaw of the preliminary 3D shape to generate a 3D shape proposal as shown in FIG. 14B. After the adjustment of the height, width, length, and yaw, the control unit 150 may further adjust the orientation of the 3D shape proposal as shown in FIG. 14C. Finally, a 3D shape proposal that satisfies a preset condition as described in the process 1300 may be determined as the 3D shape of the object and may be marked on the object as shown in FIG. 14D.
  • FIG. 15 is a flow chart illustrating an exemplary process for generating a compensated image according to some embodiments of the present disclosure.
  • the process 1500 may be implemented in the autonomous vehicle as illustrated in FIG. 1.
  • the process 1500 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) .
  • the present disclosure takes the control unit 150 as an example to execute the instruction.
  • the control unit 150 may obtain a first radar image around a base station.
  • the first radar image may be generated by the radar device 430.
  • the radar device 430 may be less precise (with lower resolution) but may have a wider detection range. For example, a LiDAR device 420 may only receive a reflected laser beam of reasonable quality from an object within 35 meters, whereas the radar device 430 may receive reflected radio waves from an object hundreds of meters away.
  • control unit 150 may identify the one or more objects in the first radar image.
  • the method of identifying the one or more objects in the first radar image may be similar to that of the first LiDAR point cloud image, and is not repeated herein.
  • control unit 150 may determine one or more locations of the one or more objects in the first radar image.
  • the method of determining the one or more locations of the one or more objects in the first radar image may be similar to that in the first LiDAR point cloud image, and is not repeated herein.
  • the control unit 150 may generate a 3D shape for each of the one or more objects in the first radar image.
  • the method of generating the 3D shape for each of the one or more objects in the first radar image may be similar to that in the first LiDAR point cloud image.
  • the control unit 150 may obtain the dimensions and center point of a front surface of each of the one or more objects.
  • the 3D shape of an object may be generated simply by extending the front surface in a direction of the body of the object.
  • control unit 150 may mark the one or more objects in the first Radar image based on the locations and the 3D shapes of the one or more objects in the first Radar image to generate a second Radar image.
  • the control unit 150 may fuse the second Radar image and the second LiDAR point cloud image to generate a compensated image.
  • the LiDAR point cloud image may have higher resolution and reliability near the base station than the radar image, and the radar image may have higher resolution and reliability away from the base station than the LiDAR point cloud image.
  • the control unit 150 may divide the second radar image and second LiDAR point cloud image into 3 sections, 0 to 30 meters, 30 to 50 meters, and greater than 50 meters from the base station.
  • the second radar image and second LiDAR point cloud image may be fused in a manner that only the LiDAR point cloud image is retained from 0 to 30 meters, and only the radar image is retained greater than 50 meters.
  • the greyscale value of voxels from 30 to 50 meters of the second radar image and the second LiDAR point cloud image may be averaged.
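  • A hedged sketch of this range-sectioned fusion (the 30 m and 50 m boundaries come from the example above; the per-voxel data layout and the function name are assumptions):

```python
import numpy as np

def fuse_by_range(lidar_vals, radar_vals, dist_from_base, near=30.0, far=50.0):
    """Fuse co-registered LiDAR and radar voxel values section by section:
    keep LiDAR below `near`, keep radar beyond `far`, average in between."""
    fused = np.empty_like(lidar_vals, dtype=float)
    near_mask = dist_from_base <= near
    far_mask = dist_from_base > far
    mid_mask = ~near_mask & ~far_mask

    fused[near_mask] = lidar_vals[near_mask]                              # 0-30 m: LiDAR only
    fused[far_mask] = radar_vals[far_mask]                                # >50 m: radar only
    fused[mid_mask] = 0.5 * (lidar_vals[mid_mask] + radar_vals[mid_mask]) # 30-50 m: averaged
    return fused
```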
  • FIG. 16 is a schematic diagram of a synchronization between camera, LiDAR device, and/or radar device according to some embodiments of the present disclosure.
  • the frame rates of a camera (e.g., camera 410) , a LiDAR device (e.g., LiDAR device 420) and a radar device (e.g., radar device 430) are different.
  • a camera image, a LiDAR point cloud image, and a radar image may be generated roughly at the same time (e.g., synchronized) .
  • the subsequent images are not synchronized due to the different frame rates.
  • the device with the slowest frame rate among the camera, the LiDAR device, and the radar device may be determined (in the example of FIG. 16, it is the camera) .
  • the control unit 150 may record the time frame of each camera image captured by the camera and may search for LiDAR images and radar images that are close in time to each of the time frames of the camera images.
  • a corresponding LiDAR image and a corresponding radar image may be obtained.
  • a camera image 1610 is obtained at T2
  • the control unit 150 may search for a LiDAR image and a radar image that are closest to T2 (e.g., the LiDAR image 1620 and radar image 1630) .
  • the camera image and the corresponding LiDAR image and radar image are extracted as a set.
  • the three images in a set are assumed to be obtained at the same time and are treated as synchronized.
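  • A hedged sketch of this nearest-timestamp matching (the per-sensor timestamp lists and the function names are assumptions):

```python
import bisect

def nearest_frame(timestamps, t):
    """Index of the timestamp closest to t (timestamps must be sorted)."""
    i = bisect.bisect_left(timestamps, t)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(timestamps)]
    return min(candidates, key=lambda j: abs(timestamps[j] - t))

def synchronize(camera_ts, lidar_ts, radar_ts):
    """For each camera frame (the slowest sensor in the example of FIG. 16),
    pick the LiDAR and radar frames captured closest in time."""
    sets = []
    for k, t in enumerate(camera_ts):
        sets.append((k, nearest_frame(lidar_ts, t), nearest_frame(radar_ts, t)))
    return sets

# Example: synchronize([0.00, 0.10, 0.20], [0.00, 0.05, 0.11, 0.19], [0.01, 0.09, 0.21])
```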
  • FIG. 17 is a flow chart illustrating an exemplary process for generating a LiDAR point cloud image or a video based on existing LiDAR point cloud images according to some embodiments of the present disclosure.
  • the process 1700 may be implemented in the autonomous vehicle as illustrated in FIG. 1.
  • the process 1700 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) .
  • the present disclosure takes the control unit 150 as an example to execute the instruction.
  • control unit 150 may obtain two first LiDAR point cloud images around a base station at two different time frames.
  • the two first LiDAR point cloud images may be taken successively at the two different time frames by a same LiDAR device.
  • control unit 150 may generate two second LiDAR point cloud images based on the two first LiDAR point cloud images.
  • the method of generating the two second LiDAR point cloud images from the two first LiDAR point cloud images may be found in process 500.
  • control unit 150 may generate a third LiDAR point cloud image at a third time frame based on the two second LiDAR point cloud images by an interpolation method.
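  • A hedged sketch of one simple interpolation strategy — linearly interpolating each marked object’s box center between the two second LiDAR point cloud images to synthesize the third frame; the matching of objects by an id and the data layout are assumptions:

```python
def interpolate_boxes(boxes_t1, boxes_t2, t1, t2, t3):
    """Linearly interpolate matched 3D box centers between two time frames.
    boxes_t1 / boxes_t2: dicts mapping an object id to its (x, y, z) center."""
    alpha = (t3 - t1) / (t2 - t1)
    interpolated = {}
    for obj_id in boxes_t1.keys() & boxes_t2.keys():   # objects present in both frames
        c1, c2 = boxes_t1[obj_id], boxes_t2[obj_id]
        interpolated[obj_id] = tuple(a + alpha * (b - a) for a, b in zip(c1, c2))
    return interpolated

# Example: a box at (10, 0, 0) at t1 = 0.0 and (12, 0, 0) at t2 = 0.1
# is placed at (11, 0, 0) for an interpolated frame at t3 = 0.05.
```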
  • FIG. 18 is a schematic diagram of validating and interpolating frames of images according to some embodiments of the present disclosure.
  • the radar images, the camera images, and the LiDAR images are synchronized (e.g., by a method disclosed in FIG. 16) .
  • Additional camera images are generated between existing camera images by an interpolation method.
  • the control unit 150 may generate a video based on the camera images.
  • the control unit 150 may validate and modify each frame of the camera images, LiDAR images and/or radar images based on historical information.
  • the historical information may include the same or different type of images in the preceding frame or previous frames. For example, a car is not properly identified and positioned in a particular frame of a camera image. However, all of the previous 5 frames correctly identified and positioned the car.
  • the control unit 150 may modify the camera image at the incorrect frame based on the camera images at previous frames and LiDAR images and/or radar images at the incorrect frame and previous frames.
  • aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc. ) or in an implementation combining software and hardware that may all generally be referred to herein as a “unit, ” “module, ” or “system. ” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
  • a non-transitory computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof.
  • a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
  • Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB. NET, Python or the like, conventional procedural programming languages, such as the "C" programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN) , or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a Software as a Service (SaaS) .
  • the numbers expressing quantities, properties, and so forth, used to describe and claim certain embodiments of the application are to be understood as being modified in some instances by the term “about, ” “approximate, ” or “substantially. ”
  • “about, ” “approximate, ” or “substantially” may indicate a ±20% variation of the value it describes, unless otherwise stated.
  • the numerical parameters set forth in the written description and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by a particular embodiment.
  • the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the application are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable.

Abstract

Systems and methods for identifying and positioning one or more objects around a vehicle are provided. The method may include obtaining a first light detection and ranging (LiDAR) point cloud image around a detection base station. The method may further include identifying one or more objects in the first LiDAR point cloud image and determining one or more locations of the one or more objects in the first LiDAR point image. The method may further include generating a 3D shape for each of the one or more objects; and generating a second LiDAR point cloud image by marking the one or more objects in the first LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects.

Description

SYSTEMS AND METHODS FOR IDENTIFYING AND POSITIONING OBJECTS AROUND A VEHICLE TECHNICAL FIELD
The present disclosure generally relates to object identification, and in particular, to methods and systems for identifying and positioning objects around a vehicle during autonomous driving.
BACKGROUND
Autonomous driving technology has been developing rapidly in recent years. Vehicles using autonomous driving technology may sense their environment and navigate automatically. Some autonomous vehicles still require human input and work as a driving aid, while others drive completely on their own. However, the ability to correctly identify and position objects around the vehicle is important for any type of autonomous vehicle. A conventional method may include mounting a camera on the vehicle and analyzing the objects in images captured by the camera. However, camera images are normally 2-dimensional (2D) , and hence depth information of objects cannot be obtained easily. A radio detection and ranging (Radar) device and a light detection and ranging (LiDAR) device may be employed to obtain 3-dimensional (3D) images around the vehicle, but the objects therein are generally mixed with noise and are difficult to identify and position. Also, images generated by Radar and LiDAR devices are difficult for humans to understand.
SUMMARY
In one aspect of the present disclosure, a system for driving aid is provided. The system may include a control unit including one or more  storage media including a set of instructions for identifying and positioning one or more objects around a vehicle, and one or more microchips electronically connected to the one or more storage media. During operation of the system, the one or more microchips may execute the set of instructions to obtain a first light detection and ranging (LiDAR) point cloud image around a detection base station; The one or more microchips may further execute the set of instructions to identify one or more objects in the first LiDAR point cloud image and determine one or more locations of the one or more objects in the first LiDAR point image. The one or more microchips may further execute the set of instructions to generate a 3D shape for each of the one or more objects, and generate a second LiDAR point cloud image by marking the one or more objects in the first LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects.
In some embodiments, the system may further include at least one LiDAR device in communication with the control unit to send the LiDAR point cloud image to the control unit, at least one camera in communication with the control unit to send a camera image to the control unit, and at least one radar device in communication with the control unit to send a radar image to the control unit.
In some embodiments, the base station may be a vehicle, and the system may further include at least one LiDAR device mounted on a steering wheel, a cowl or reflector of the vehicle, wherein the mounting of the at least one LiDAR device may include at least one of an adhesive bonding, a bolt and nut connection, a bayonet fitting, or a vacuum fixation.
In some embodiments, the one or more microchips may further obtain a first camera image including at least one of the one or more objects, identify at least one target object of the one or more objects in the first camera image and at least one target location of the at least one target object in the first camera image, and generate a second camera image by marking the at least  one target object in the first camera image based on the at least one target location in the first camera image and the 3D shape of the at least one target object in the LiDAR point cloud image.
In some embodiments, in marking the at least one target object in the first camera image, the one or more microchips may further obtain a 2D shape of the at least one target object in the first camera image, correlate the LiDAR point cloud image with the first camera image, generate a 3D shape of the at least one target object in the first camera image based on the 2D shape of the at least one target object and the correlation between the LiDAR point cloud image and the first camera image, and generate a second camera image by marking the at least one target object in the first camera image based on the identified location in the first camera image and the 3D shape of the at least one target object in the first camera image.
In some embodiments, to identify the at least one target object in the first camera image and the location of the at least one target object in the first camera image, the one or more microchips may operate a you only look once (YOLO) network or a Tiny-YOLO network to identify the at least one target object in the first camera image and the location of the at least one target object in the first camera image.
In some embodiments, to identify the one or more objects in the first LiDAR point cloud image, the one or more microchips may further obtain coordinates of a plurality of points in the first LiDAR point cloud image, wherein the plurality of points includes uninterested points and remaining points, remove the uninterested points from the plurality of points according to the coordinates, cluster the remaining points into one or more clusters based on a point cloud clustering algorithm, and select at least one of the one or more clusters as a target cluster, each of the target cluster corresponding to an object.
In some embodiments, to generate a 3D shape for each of the one or  more objects, the one or more microchips may further determine a preliminary 3D shape of the object, adjust at least one of a height, a width, a length, a yaw, or an orientation of the preliminary 3D shape to generate a 3D shape proposal, calculate a score of the 3D shape proposal, and determine whether the score of the 3D shape proposal satisfies a preset condition. In response to the determination that the score of the 3D shape proposal does not satisfy a preset condition, the one or more microchips may further adjust the 3D shape proposal. In response to the determination that the score of the 3D shape proposal or further adjusted 3D shape proposal satisfies the preset condition, the one or more microchips may determine the 3D shape proposal or further adjusted 3D shape proposal as the 3D shape of the object.
In some embodiments, the score of the 3D shape proposal is calculated based on at least one of a number of points of the first LiDAR point cloud image inside the 3D shape proposal, a number of points of the first LiDAR point cloud image outside the 3D shape proposal, or distances between points and the 3D shape.
In some embodiments, the one or more microchips may further obtain a first radio detection and ranging (Radar) image around the detection base station, identify the one or more objects in the first Radar image, determine one or more locations of the one or more objects in the first Radar image, generate a 3D shape for each of the one or more objects in the first Radar image, generate a second Radar image by marking the one or more objects in the first Radar image based on the locations and the 3D shapes of the one or more objects in the first Radar image, and fuse the second Radar image and the second LiDAR point cloud image to generate a compensated image.
In some embodiments, the one or more microchips may further obtain two first LiDAR point cloud images around the base station at two different time frames, generate two second LiDAR point cloud images at the two different time frames based on the two first LiDAR point cloud images, and  generate a third LiDAR point cloud image at a third time frame based on the two second LiDAR point cloud images by an interpolation method.
In some embodiments, the one or more microchips may further obtain a plurality of LiDAR point cloud images around the base station at a plurality of different time frames; generate a plurality of second LiDAR point cloud images at the plurality of different time frames based on the plurality of first LiDAR point cloud images; and generate a video based on the plurality of second LiDAR point cloud images.
In another aspect of the present disclosure, a method is provided. The method may be implemented on a computing device having one or more storage media storing instructions for identifying and positioning one or more objects around a vehicle, and one or more microchips electronically connected to the one or more storage media. The method may include obtaining a first light detection and ranging (LiDAR) point cloud image around a detection base station. The method may further include identifying one or more objects in the first LiDAR point cloud image, and determining one or more locations of the one or more objects in the first LiDAR point image. The method may further include generating a 3D shape for each of the one or more objects, and generating a second LiDAR point cloud image by marking the one or more objects in the first LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects.
In another aspect of the present disclosure, a non-transitory computer readable medium is provided. The non-transitory computer readable medium may include at least one set of instructions for identifying and positioning one or more objects around a vehicle. When executed by microchips of an electronic terminal, the at least one set of instructions may direct the microchips to perform acts of obtaining a first light detection and ranging (LiDAR) point cloud image around a detection base station. The at least one set of instructions may further direct the microchips to perform acts  of identifying one or more objects in the first LiDAR point cloud image, and determining one or more locations of the one or more objects in the first LiDAR point image. The at least one set of instructions may further direct the microchips to perform acts of generating a 3D shape for each of the one or more objects, and generating a second LiDAR point cloud image by marking the one or more objects in the first LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects.
Additional features will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The features of the present disclosure may be realized and attained by practice or use of various aspects of the methodologies, instrumentalities and combinations set forth in the detailed examples discussed below.
BRIEF DESCRIPTION OF THE DRAWINGS
The present disclosure is further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. The drawings are not to scale. These embodiments are non-limiting exemplary embodiments, in which like reference numerals represent similar structures throughout the several views of the drawings, and wherein:
FIG. 1 is a schematic diagram illustrating an exemplary scenario for autonomous vehicle according to some embodiments of the present disclosure;
FIG. 2 is a block diagram of an exemplary vehicle with an autonomous driving capability according to some embodiments of the present disclosure;
FIG. 3 is a schematic diagram illustrating exemplary hardware components of a computing device 300;
FIG. 4 is a block diagram illustrating an exemplary sensing module according to some embodiments of the present disclosure;
FIG. 5 is a flowchart illustrating an exemplary process for generating a LiDAR point cloud image on which 3D shapes of objects are marked according to some embodiments of the present disclosure;
FIGs. 6A-6C are a series of schematic diagrams of generating and marking a 3D shape of an object in a LiDAR point cloud image according to some embodiments of the present disclosure;
FIG. 7 is a flowchart illustrating an exemplary process for generating a marked camera image according to some embodiments of the present disclosure;
FIG. 8 is a flowchart illustrating an exemplary process for generating 2D representations of 3D shapes of the one or more objects in the camera image according to some embodiments of the present disclosure;
FIGs. 9A and 9B are schematic diagrams of the same 2D camera image of a car according to some embodiments of the present disclosure;
FIG. 10 is a schematic diagram of a you only look once (yolo) network according to some embodiments of the present disclosure;
FIG. 11 is a flowchart illustrating an exemplary process for identifying the objects in a LiDAR point cloud image according to some embodiments of the present disclosure;
FIGs. 12A-12E are a series of schematic diagrams of identifying an object in a LiDAR point cloud image according to some embodiments of the present disclosure;
FIG. 13 is a flowchart illustrating an exemplary process for generating a 3D shape of an object in a LiDAR point cloud image according to some embodiments of the present disclosure;
FIGs. 14A-14D are a series of schematic diagrams of generating a 3D shape of an object in a LiDAR point cloud image according to some  embodiments of the present disclosure;
FIG. 15 is a flow chart illustrating an exemplary process for generating a compensated image according to some embodiments of the present disclosure;
FIG. 16 is a schematic diagram of a synchronization between a camera, a LiDAR device, and/or a radar device according to some embodiments of the present disclosure;
FIG. 17 is a flow chart illustrating an exemplary process for generating a LiDAR point cloud image or a video based on existing LiDAR point cloud images according to some embodiments of the present disclosure;
FIG. 18 is a schematic diagram of validating and interpolating frames of images according to some embodiments of the present disclosure.
DETAILED DESCRIPTION
The following description is presented to enable any person skilled in the art to make and use the present disclosure, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the claims.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms “a, ” “an, ” and “the” may be intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprise, ” “comprises, ” and/or “comprising, ” “include, ” “includes, ” and/or “including, ” when used in this specification, specify the presence of stated features, integers, steps,  operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
In the present disclosure, the term “autonomous vehicle” may refer to a vehicle capable of sensing its environment and navigating without human (e.g., a driver, a pilot, etc. ) input. The term “autonomous vehicle” and “vehicle” may be used interchangeably. The term “autonomous driving” may refer to ability of navigating without human (e.g., a driver, a pilot, etc. ) input.
These and other features, and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, may become more apparent upon consideration of the following description with reference to the accompanying drawings, all of which form a part of this disclosure. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended to limit the scope of the present disclosure. It is understood that the drawings are not to scale.
The flowcharts used in the present disclosure illustrate operations that systems implement according to some embodiments in the present disclosure. It is to be expressly understood that the operations of the flowcharts may be implemented out of the order shown. Conversely, the operations may be implemented in reverse order, or simultaneously. Moreover, one or more other operations may be added to the flowcharts. One or more operations may be removed from the flowcharts.
The positioning technology used in the present disclosure may be based on a global positioning system (GPS) , a global navigation satellite system (GLONASS) , a compass navigation system (COMPASS) , a Galileo positioning system, a quasi-zenith satellite system (QZSS) , a wireless fidelity (WiFi) positioning technology, or the like, or any combination thereof. One or  more of the above positioning systems may be used interchangeably in the present disclosure.
Moreover, while the systems and methods disclosed in the present disclosure are described primarily regarding a driving aid for identifying and positioning objects around a vehicle, it should be understood that this is only one exemplary embodiment. The system or method of the present disclosure may be applied to any other kind of navigation system. For example, the system or method of the present disclosure may be applied to transportation systems of different environments including land, ocean, aerospace, or the like, or any combination thereof. The autonomous vehicle of the transportation systems may include a taxi, a private car, a hitch, a bus, a train, a bullet train, a high-speed rail, a subway, a vessel, an aircraft, a spaceship, a hot-air balloon, a driverless vehicle, or the like, or any combination thereof. In some embodiments, the system or method may find applications in, for example, logistics warehousing and military affairs.
An aspect of the present disclosure relates to a driving aid for identifying and positioning objects around a vehicle during autonomous driving. For example, a camera, a LiDAR device, and a radar device may be mounted on a roof of an autonomous car. The camera, the LiDAR device and the radar device may obtain a camera image, a LiDAR point cloud image, and a radar image around the car, respectively. The LiDAR point cloud image may include a plurality of points. A control unit may cluster the plurality of points into multiple clusters, wherein each cluster may correspond to an object. The control unit may determine a 3D shape for each object and mark the 3D shape on the LiDAR point cloud image. The control unit may also correlate the LiDAR point cloud image with the camera image to generate and mark a 2D representation of the 3D shape of each object on the camera image. The marked LiDAR point cloud image and camera image make it easier to understand the locations and movements of the objects. The control unit may further generate a video of the movement of the objects based on marked camera images. The vehicle or a driver therein may adjust the speed and movement direction of the vehicle based on the generated video or images to avoid colliding with the objects.
FIG. 1 is a schematic diagram illustrating an exemplary scenario for an autonomous vehicle according to some embodiments of the present disclosure. As shown in FIG. 1, an autonomous vehicle 130 may travel along a road 121 without human input along a path autonomously determined by the autonomous vehicle 130. The road 121 may be a space prepared for a vehicle to travel along. For example, the road 121 may be a road for vehicles with wheels (e.g., a car, a train, a bicycle, a tricycle, etc. ) or without wheels (e.g., a hovercraft) , an air lane for an airplane or other aircraft, a water lane for a ship or submarine, or an orbit for a satellite. Travel of the autonomous vehicle 130 may not break the traffic rules of the road 121 as regulated by law or regulation. For example, the speed of the autonomous vehicle 130 may not exceed the speed limit of the road 121.
The autonomous vehicle 130 may avoid colliding with an obstacle 110 by travelling along a path 120 determined by the autonomous vehicle 130. The obstacle 110 may be a static obstacle or a dynamic obstacle. The static obstacle may include a building, a tree, a roadblock, or the like, or any combination thereof. The dynamic obstacle may include moving vehicles, pedestrians, and/or animals, or the like, or any combination thereof.
The autonomous vehicle 130 may include conventional structures of a non-autonomous vehicle, such as an engine, four wheels, a steering wheel, etc. The autonomous vehicle 130 may further include a sensing system 140, including a plurality of sensors (e.g., a sensor 142, a sensor 144, a sensor 146) and a control unit 150. The plurality of sensors may be configured to provide information that is used to control the vehicle. In some embodiments, the sensors may sense status of the vehicle. The status of  the vehicle may include dynamic situation of the vehicle, environmental information around the vehicle, or the like, or any combination thereof.
In some embodiments, the plurality of sensors may be configured to sense dynamic situation of the autonomous vehicle 130. The plurality of sensors may include a distance sensor, a velocity sensor, an acceleration sensor, a steering angle sensor, a traction-related sensor, a camera, and/or any sensor.
For example, the distance sensor (e.g., a radar, a LiDAR, an infrared sensor) may determine a distance between a vehicle (e.g., the autonomous vehicle 130) and other objects (e.g., the obstacle 110) . The distance sensor may also determine a distance between a vehicle (e.g., the autonomous vehicle 130) and one or more obstacles (e.g., static obstacles, dynamic obstacles) . The velocity sensor (e.g., a Hall sensor) may determine a velocity (e.g., an instantaneous velocity, an average velocity) of a vehicle (e.g., the autonomous vehicle 130) . The acceleration sensor (e.g., an accelerometer) may determine an acceleration (e.g., an instantaneous acceleration, an average acceleration) of a vehicle (e.g., the autonomous vehicle 130) . The steering angle sensor (e.g., a tilt sensor) may determine a steering angle of a vehicle (e.g., the autonomous vehicle 130) . The traction-related sensor (e.g., a force sensor) may determine a traction of a vehicle (e.g., the autonomous vehicle 130) .
In some embodiments, the plurality of sensors may sense environment around the autonomous vehicle 130. For example, one or more sensors may detect a road geometry and obstacles (e.g., static obstacles, dynamic obstacles) . The road geometry may include a road width, road length, road type (e.g., ring road, straight road, one-way road, two-way road) . The static obstacles may include a building, tree, roadblock, or the like, or any combination thereof. The dynamic obstacles may include moving vehicles, pedestrians, and/or animals, or the like, or any combination thereof. The  plurality of sensors may include one or more video cameras, laser-sensing systems, infrared-sensing systems, acoustic-sensing systems, thermal-sensing systems, or the like, or any combination thereof.
The control unit 150 may be configured to control the autonomous vehicle 130. The control unit 150 may control the autonomous vehicle 130 to drive along a path 120. The control unit 150 may calculate the path 120 based on the status information from the plurality of sensors. In some embodiments, the path 120 may be configured to avoid collisions between the vehicle and one or more obstacles (e.g., the obstacle 110) .
In some embodiments, the path 120 may include one or more path samples. Each of the one or more path samples may include a plurality of path sample features. The plurality of path sample features may include a path velocity, a path acceleration, a path location, or the like, or a combination thereof.
The autonomous vehicle 130 may drive along the path 120 to avoid a collision with an obstacle. In some embodiments, the autonomous vehicle 130 may pass each path location at a corresponding path velocity and a corresponding path acceleration.
In some embodiments, the autonomous vehicle 130 may also include a positioning system to obtain and/or determine the position of the autonomous vehicle 130. In some embodiments, the positioning system may also be connected to another party, such as a base station, another vehicle, or another person, to obtain the position of the party. For example, the positioning system may be able to establish a communication with a positioning system of another vehicle, and may receive the position of the other vehicle and determine the relative positions between the two vehicles.
FIG. 2 is a block diagram of an exemplary vehicle with an autonomous driving capability according to some embodiments of the present disclosure. For example, the vehicle with an autonomous driving capability may include a control system, including but not limited to a control unit 150, a plurality of sensors 142, 144, 146, a storage 220, a network 230, a gateway module 240, a Controller Area Network (CAN) 250, an Engine Management System (EMS) 260, an Electric Stability Control (ESC) 270, an Electric Power System (EPS) 280, a Steering Column Module (SCM) 290, a throttling system 265, a braking system 275 and a steering system 295.
The control unit 150 may process information and/or data relating to vehicle driving (e.g., autonomous driving) to perform one or more functions described in the present disclosure. In some embodiments, the control unit 150 may be configured to drive a vehicle autonomously. For example, the control unit 150 may output a plurality of control signals. The plurality of control signal may be configured to be received by a plurality of electronic control units (ECUs) to control the drive of a vehicle. In some embodiments, the control unit 150 may determine a reference path and one or more candidate paths based on environment information of the vehicle. In some embodiments, the control unit 150 may include one or more processing engines (e.g., single-core processing engine (s) or multi-core processor (s) ) . Merely by way of example, the control unit 150 may include a central processing unit (CPU) , an application-specific integrated circuit (ASIC) , an application-specific instruction-set processor (ASIP) , a graphics processing unit (GPU) , a physics processing unit (PPU) , a digital signal processor (DSP) , a field programmable gate array (FPGA) , a programmable logic device (PLD) , a controller, a microcontroller unit, a reduced instruction-set computer (RISC) , a microprocessor, or the like, or any combination thereof.
The storage 220 may store data and/or instructions. In some embodiments, the storage 220 may store data obtained from the autonomous vehicle 130. In some embodiments, the storage 220 may store data and/or instructions that the control unit 150 may execute or use to perform exemplary methods described in the present disclosure. In some embodiments, the storage 220 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM) , or the like, or any combination thereof. Exemplary mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc. Exemplary removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc. Exemplary volatile read-and-write memory may include a random access memory (RAM) . Exemplary RAM may include a dynamic RAM (DRAM) , a double data rate synchronous dynamic RAM (DDR SDRAM) , a static RAM (SRAM) , a thyristor RAM (T-RAM) , and a zero-capacitor RAM (Z-RAM) , etc. Exemplary ROM may include a mask ROM (MROM) , a programmable ROM (PROM) , an erasable programmable ROM (EPROM) , an electrically-erasable programmable ROM (EEPROM) , a compact disk ROM (CD-ROM) , and a digital versatile disk ROM, etc. In some embodiments, the storage may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
In some embodiments, the storage 220 may be connected to the network 230 to communicate with one or more components of the autonomous vehicle 130 (e.g., the control unit 150, the sensor 142) . One or more components in the autonomous vehicle 130 may access the data or instructions stored in the storage 220 via the network 230. In some embodiments, the storage 220 may be directly connected to or communicate with one or more components in the autonomous vehicle 130 (e.g., the control unit 150, the sensor 142) . In some embodiments, the storage 220 may be part of the autonomous vehicle 130.
The network 230 may facilitate exchange of information and/or data. In some embodiments, one or more components in the autonomous vehicle 130 (e.g., the control unit 150, the sensor 142) may send information and/or data to other component (s) in the autonomous vehicle 130 via the network 230. For example, the control unit 150 may obtain/acquire the dynamic situation of the vehicle and/or environment information around the vehicle via the network 230. In some embodiments, the network 230 may be any type of wired or wireless network, or combination thereof. Merely by way of example, the network 230 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, an Internet, a local area network (LAN) , a wide area network (WAN) , a wireless local area network (WLAN) , a metropolitan area network (MAN) , a public telephone switched network (PSTN) , a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof. In some embodiments, the network 230 may include one or more network access points. For example, the network 230 may include wired or wireless network access points such as base stations and/or internet exchange points 230-1, ..., through which one or more components of the autonomous vehicle 130 may be connected to the network 230 to exchange data and/or information.
The gateway module 240 may determine a command source for the plurality of ECUs (e.g., the EMS 260, the EPS 280, the ESC 270, the SCM 290) based on a current driving status of the vehicle. The command source may be from a human driver, from the control unit 150, or the like, or any combination thereof.
The gateway module 240 may determine the current driving status of the vehicle. The driving status of the vehicle may include a manual driving status, a semi-autonomous driving status, an autonomous driving status, an error status, or the like, or any combination thereof. For example, the gateway module 240 may determine the current driving status of the vehicle to be a manual driving status based on an input from a human driver. For another example, the gateway module 240 may determine the current driving  status of the vehicle to be a semi-autonomous driving status when the current road condition is complex. As still another example, the gateway module 240 may determine the current driving status of the vehicle to be an error status when abnormalities (e.g., a signal interruption, a processor crash) happen.
In some embodiments, the gateway module 240 may transmit operations of the human driver to the plurality of ECUs in response to a determination that the current driving status of the vehicle is a manual driving status. For example, the gateway module 240 may transmit a press on the accelerator done by the human driver to the EMS 260 in response to a determination that the current driving status of the vehicle is a manual driving status. The gateway module 240 may transmit the control signals of the control unit 150 to the plurality of ECUs in response to a determination that the current driving status of the vehicle is an autonomous driving status. For example, the gateway module 240 may transmit a control signal associated with steering to the SCM 290 in response to a determination that the current driving status of the vehicle is an autonomous driving status. The gateway module 240 may transmit the operations of the human driver and the control signals of the control unit 150 to the plurality of ECUs in response to a determination that the current driving status of the vehicle is a semi-autonomous driving status. The gateway module 240 may transmit an error signal to the plurality of ECUs in response to a determination that the current driving status of the vehicle is an error status.
A Controller Area Network (CAN bus) is a robust vehicle bus standard (e.g., a message-based protocol) allowing microcontrollers (e.g., the control unit 150) and devices (e.g., the EMS 260, the EPS 280, the ESC 270, and/or the SCM 290, etc. ) to communicate with each other in applications without a host computer. The CAN 250 may be configured to connect the control unit 150 with the plurality of ECUs (e.g., the EMS 260, the EPS 280, the ESC 270,  the SCM 290) .
The EMS 260 may be configured to determine an engine performance of the autonomous vehicle 130. In some embodiments, the EMS 260 may determine the engine performance of the autonomous vehicle 130 based on the control signals from the control unit 150. For example, the EMS 260 may determine the engine performance of the autonomous vehicle 130 based on a control signal associated with an acceleration from the control unit 150 when the current driving status is an autonomous driving status. In some embodiments, the EMS 260 may determine the engine performance of the autonomous vehicle 130 based on operations of a human driver. For example, the EMS 260 may determine the engine performance of the autonomous vehicle 130 based on a press on the accelerator done by the human driver when the current driving status is a manual driving status.
The EMS 260 may include a plurality of sensors and a micro-processor. The plurality of sensors may be configured to detect one or more physical signals and convert the one or more physical signals to electrical signals for processing. In some embodiments, the plurality of sensors may include a variety of temperature sensors, an air flow sensor, a throttle position sensor, a pump pressure sensor, a speed sensor, an oxygen sensor, a load sensor, a knock sensor, or the like, or any combination thereof. The one or more physical signals may include an engine temperature, an engine intake air volume, a cooling water temperature, an engine speed, or the like, or any combination thereof. The micro-processor may determine the engine performance based on a plurality of engine control parameters. The micro-processor may determine the plurality of engine control parameters based on the plurality of electrical signals. The plurality of engine control parameters may be determined to optimize the engine performance. The plurality of engine control parameters may include an ignition timing, a fuel delivery, an idle air flow, or the like, or any combination thereof.
The throttling system 265 may be configured to change motions of the autonomous vehicle 130. For example, the throttling system 265 may determine a velocity of the autonomous vehicle 130 based on an engine output. For another example, the throttling system 265 may cause an acceleration of the autonomous vehicle 130 based on the engine output. The throttling system 265 may include fuel injectors, a fuel pressure regulator, an auxiliary air valve, a temperature switch, a throttle, an idling speed motor, a fault indicator, ignition coils, relays, or the like, or any combination thereof.
In some embodiments, the throttling system 265 may be an external executor of the EMS 260. The throttling system 265 may be configured to control the engine output based on the plurality of engine control parameters determined by the EMS 260.
The ESC 270 may be configured to improve the stability of the vehicle. The ESC 270 may improve the stability of the vehicle by detecting and reducing loss of traction. In some embodiments, the ESC 270 may control operations of the braking system 275 to help steer the vehicle in response to a determination that a loss of steering control is detected by the ESC 270. For example, the ESC 270 may improve the stability of the vehicle when the vehicle starts on an uphill slope by braking. In some embodiments, the ESC 270 may further control the engine performance to improve the stability of the vehicle. For example, the ESC 270 may reduce an engine power when a probable loss of steering control happens. The loss of steering control may happen when the vehicle skids during emergency evasive swerves, when the vehicle understeers or oversteers during poorly judged turns on slippery roads, etc.
The braking system 275 may be configured to control a motion state of the autonomous vehicle 130. For example, the braking system 275 may decelerate the autonomous vehicle 130. For another example, the braking system 275 may stop the autonomous vehicle 130 in one or more road  conditions (e.g., a downhill slope) . As still another example, the braking system 275 may keep the autonomous vehicle 130 at a constant velocity when driving on a downhill slope.
The braking system 275 may include a mechanical control component, a hydraulic unit, a power unit (e.g., a vacuum pump) , an executing unit, or the like, or any combination thereof. The mechanical control component may include a pedal, a handbrake, etc. The hydraulic unit may include a hydraulic oil, a hydraulic hose, a brake pump, etc. The executing unit may include a brake caliper, a brake pad, a brake disc, etc.
The EPS 280 may be configured to control electric power supply of the autonomous vehicle 130. The EPS 280 may supply, transfer, and/or store electric power for the autonomous vehicle 130. In some embodiments, the EPS 280 may control power supply to the steering system 295. For example, the EPS 280 may supply a large electric power to the steering system 295 to create a large steering torque for the autonomous vehicle 130, in response to a determination that a steering wheel is turned to a limit (e.g., a left turn limit, a right turn limit) .
The SCM 290 may be configured to control the steering wheel of the vehicle. The SCM 290 may lock/unlock the steering wheel of the vehicle. The SCM 290 may lock/unlock the steering wheel of the vehicle based on the current driving status of the vehicle. For example, the SCM 290 may lock the steering wheel of the vehicle in response to a determination that the current driving status is an autonomous driving status. The SCM 290 may further retract a steering column shaft in response to a determination that the current driving status is an autonomous driving status. For another example, the SCM 290 may unlock the steering wheel of the vehicle in response to a determination that the current driving status is a semi-autonomous driving status, a manual driving status, and/or an error status.
The SCM 290 may control the steering of the autonomous vehicle 130  based on the control signals of the control unit 150. The control signals may include information related to a turning direction, a turning location, a turning angle, or the like, or any combination thereof.
The steering system 295 may be configured to steer the autonomous vehicle 130. In some embodiments, the steering system 295 may steer the autonomous vehicle 130 based on signals transmitted from the SCM 290. For example, the steering system 295 may steer the autonomous vehicle 130 based on the control signals of the control unit 150 transmitted from the SCM 290 in response to a determination that the current driving status is an autonomous driving status. In some embodiments, the steering system 295 may steer the autonomous vehicle 130 based on operations of a human driver. For example, the steering system 295 may turn the autonomous vehicle 130 to a left direction when the human driver turns the steering wheel to a left direction in response to a determination that the current driving status is a manual driving status.
FIG. 3 is a schematic diagram illustrating exemplary hardware components of a computing device 300.
The computing device 300 may be a special purpose computing device for autonomous driving, such as a single-board computing device including one or more microchips. Further, the control unit 150 may include one or more of the computing device 300. The computing device 300 may be used to implement the method and/or system described in the present disclosure via its hardware, software program, firmware, or a combination thereof.
The computing device 300, for example, may include COM ports 350 connected to and from a network connected thereto to facilitate data communications. The computing device 300 may also include a processor 320, in the form of one or more processors, for executing computer instructions. The computer instructions may include, for example, routines,  programs, objects, components, data structures, procedures, modules, and functions, which perform particular functions described herein. For example, during operation, the processor 320 may access instructions for operating the autonomous vehicle 130 and execute the instructions to determine a driving path for the autonomous vehicle.
In some embodiments, the processor 320 may include one or more hardware processors built in one or more microchips, such as a microcontroller, a microprocessor, a reduced instruction set computer (RISC) , an application specific integrated circuits (ASICs) , an application-specific instruction-set processor (ASIP) , a central processing unit (CPU) , a graphics processing unit (GPU) , a physics processing unit (PPU) , a microcontroller unit, a digital signal processor (DSP) , a field programmable gate array (FPGA) , an advanced RISC machine (ARM) , a programmable logic device (PLD) , any circuit or processor capable of executing one or more functions, or the like, or any combinations thereof.
The exemplary computing device 300 may include an internal communication bus 310, and program storage and data storage of different forms, for example, a disk 370, a read only memory (ROM) 330, or a random access memory (RAM) 340, for various data files to be processed and/or transmitted by the computing device. The exemplary computing device 300 may also include program instructions stored in the ROM 330, the RAM 340, and/or another type of non-transitory storage medium to be executed by the processor 320. The methods and/or processes of the present disclosure may be implemented as the program instructions. The computing device 300 may also include an I/O component 360, supporting input/output between the computing device and other components (e.g., user interface elements) . The computing device 300 may also receive programming and data via network communications.
Merely for illustration, only one processor is described in the computing device 300. However, it should be noted that the computing  device 300 in the present disclosure may also include multiple processors, thus operations and/or method steps that are performed by one processor as described in the present disclosure may also be jointly or separately performed by the multiple processors. For example, if in the present disclosure the processor 320 of the computing device 300 executes both step A and step B, it should be understood that step A and step B may also be performed by two different processors jointly or separately in the computing device 300 (e.g., the first processor executes step A and the second processor executes step B, or the first and second processors jointly execute steps A and B) .
Also, one of ordinary skill in the art would understand that when an element in the control system in FIG. 2 performs an operation, the element may perform the operation through electrical signals and/or electromagnetic signals. For example, when a sensor 142, 144, or 146 sends out detected information, such as a digital photo or a LiDAR point cloud image, the information may be transmitted to a receiver in a form of electronic signals. The control unit 150 may receive the electronic signals of the detected information and may operate logic circuits in its processor to process such information. When the control unit 150 sends out a command to the CAN 250 and/or the gateway module 240 to control the EMS 260, ESC 270, EPS 280 etc., a processor of the control unit 150 may generate electrical signals encoding the command and then send the electrical signals to an output port. Further, when the processor retrieves data from a storage medium, it may send out electrical signals to a read device of the storage medium, which may read structured data in the storage medium. The structured data may be transmitted to the processor in the form of electrical signals via a bus of the control unit 150. Here, an electrical signal may refer to one electrical signal, a series of electrical signals, and/or a plurality of discrete electrical signals.
FIG. 4 is a block diagram illustrating an exemplary sensing system according to some embodiments of the present disclosure. The sensing system 140 may be in communication with a control unit 150 to send raw sensing data (e.g., images) or preprocessed sensing data to the control unit 150. In some embodiments, the sensing system 140 may include at least one camera 410, at least one LiDAR device 420, at least one radar device 430, and a processing unit 440. In some embodiments, the camera 410, the LiDAR device 420, and the radar device 430 may correspond to the sensors 142, 144, and 146, respectively.
The camera 410 may be configured to capture camera image (s) of environmental data around a vehicle. The camera 410 may include an unchangeable lens camera, a compact camera, a 3D camera, a panoramic camera, an audio camera, an infrared camera, a digital camera, or the like, or any combination thereof. In some embodiments, multiple cameras of the same or different types may be mounted on a vehicle. For example, an infrared camera may be mounted on a back hood of the vehicle to capture infrared images of objects behind the vehicle, especially, when the vehicle is backing up at night. As another example, an audio camera may be mounted on a reflector of the vehicle to capture images of objects at a side of the vehicle. The audio camera may mark a sound level of different sections or objects on the images obtained. In some embodiments, the images captured by the multiple cameras 410 mounted on the vehicle may collectively cover a whole region around the vehicle.
Merely by way of example, the multiple cameras 410 may be mounted on different parts of the vehicle, including but not limited to a window, a car body, a rear-view mirror, a handle, a light, a sunroof and a license plate. The window may include a front window, a back window, a side window, etc. The car body may include a front hood, a back hood, a roof, a chassis, a side, etc. In some embodiments, the multiple cameras 410 may be attached to or mounted on accessories in the compartment of the vehicle (e.g., a steering  wheel, a cowl, a reflector) . The method of mounting may include adhesive bonding, bolt and nut connection, bayonet fitting, vacuum fixation, or the like, or any combination thereof.
The LiDAR device 420 may be configured to obtain high resolution images within a certain range from the vehicle. For example, the LiDAR device 420 may be configured to detect objects within 35 meters of the vehicle.
The LiDAR device 420 may be configured to generate LiDAR point cloud images of the surrounding environment of the vehicle on which the LiDAR device 420 is mounted. The LiDAR device 420 may include a laser generator and a sensor. The laser beam may include an ultraviolet light, a visible light, a near infrared light, etc. The laser generator may illuminate the objects with a pulsed laser beam at a fixed predetermined frequency or predetermined varying frequencies. The laser beam may reflect back after contacting the surface of the objects and the sensor may receive the reflected laser beam. Through the reflected laser beam, the LiDAR device 420 may measure the distance between the surface of the objects and the LiDAR device 420. During operation, the LiDAR device 420 may rotate and use the laser beam to scan the surrounding environment of the vehicle, thereby generating a LiDAR point cloud image according to the reflected laser beam. Since the LiDAR device 420 rotates and scans along limited heights of the vehicle’s surrounding environment, the LiDAR point cloud image measures the 360° environment surrounding the vehicle between the predetermined heights of the vehicle. The LiDAR point cloud image may be a static or dynamic image. Further, since each point in the LiDAR point cloud image measures the distance between the LiDAR device and a surface of an object from which the laser beam is reflected, the LiDAR point cloud image is a 3D image. In some embodiments, the LiDAR point cloud image may be a real-time image illustrating a real-time propagation of the laser beam.
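The following is a minimal sketch, in Python, of how a single time-of-flight return and its beam angles could be converted into a 3D point relative to the LiDAR device, consistent with the ranging principle described above. The function name and the angle parameters (azimuth, elevation) are illustrative assumptions rather than elements of the disclosure.

```python
# A minimal sketch of time-of-flight ranging for one LiDAR return.
import math

SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def lidar_return_to_point(time_of_flight_s, azimuth_rad, elevation_rad):
    """Convert one reflected pulse into (x, y, z) relative to the LiDAR device."""
    # The pulse travels to the surface and back, so the one-way range is half.
    r = SPEED_OF_LIGHT * time_of_flight_s / 2.0
    x = r * math.cos(elevation_rad) * math.cos(azimuth_rad)
    y = r * math.cos(elevation_rad) * math.sin(azimuth_rad)
    z = r * math.sin(elevation_rad)
    return x, y, z

# A pulse returning after roughly 233 ns corresponds to a surface about 35 m away.
print(lidar_return_to_point(233e-9, math.radians(30), math.radians(2)))
```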
Merely by way of example, the LiDAR device 420 may be mounted on  the roof or front window of the vehicle, however, it should be noted that the LiDAR device 420 may also be installed on other parts of the vehicle, including but not limited to a window, a car body, a rear-view mirror, a handle, a light, a sunroof and a license plate.
The radar device 430 may be configured to generate a radar image by measuring distances to objects around a vehicle via radio waves. Compared with the LiDAR device 420, the radar device 430 may be less precise (with lower resolution) but may have a wider detection range. Accordingly, the radar device 430 may be used to measure objects farther than the detection range of the LiDAR device 420. For example, the radar device 430 may be configured to measure objects between 35 meters and 100 meters from the vehicle.
The radar device 430 may include a transmitter for producing electromagnetic waves in the radio or microwaves domain, a transmitting antenna for transmitting or broadcasting the radio waves, a receiving antenna for receiving the radio waves and a processor for generating a radar image. Merely by way of example, the radar device 430 may be mounted on the roof or front window of the vehicle, however, it should be noted that the radar device 430 may also be installed on other parts of the vehicle, including but not limited to a window, a car body, a rear-view mirror, a handle, a light, a sunroof and a license plate.
In some embodiments, the LiDAR image and the radar image may be fused to generate a compensated image. Detailed methods regarding the fusion of the LiDAR image and the radar image may be found elsewhere in present disclosure (See, e.g., FIG. 15 and the descriptions thereof) . In some embodiments, the camera 410, the LiDAR device 420 and the radar device 430 may work concurrently or individually. In a case that they are working individually at different time frame rates, a synchronization method may be employed. Detailed method regarding the synchronization of the frames of  the camera 410, the LiDAR device 420 and/or the radar device 430 may be found elsewhere in the present disclosure (See e.g., FIG. 16 and the descriptions thereof) .
The sensing system 140 may further include a processing unit 440 configured to pre-process the generated images (e.g., camera image, LiDAR image, and radar image) . In some embodiments, the pre-processing of the images may include smoothing, filtering, denoising, reconstructing, or the like, or any combination thereof.
FIG. 5 is a flowchart illustrating an exemplary process for generating a LiDAR point cloud image on which 3D shapes of objects are marked according to some embodiments of the present disclosure. In some embodiments, the process 500 may be implemented in the autonomous vehicle as illustrated in FIG. 1. For example, the process 500 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) . The present disclosure takes the control unit 150 as an example to execute the instructions.
In 510, the control unit 150 may obtain a LiDAR point cloud image (also referred to as a first LiDAR point cloud image) around a base station.
The base station may be any device that the LiDAR device, the radar device, and the camera are mounted on. For example, the base station may be a movable platform, such as a vehicle (e.g., a car, an aircraft, a ship, etc. ) . The base station may also be a stationary platform, such as a detection station or an airport control tower. Merely for illustration purposes, the present disclosure takes a vehicle or a device (e.g., a rack) mounted on the vehicle as an example of the base station.
The first LiDAR point cloud image may be generated by the LiDAR device 420. The first LiDAR point cloud image may be a 3D point cloud  image including voxels corresponding to one or more objects around the base station. In some embodiments, the first LiDAR point cloud image may correspond to a first time frame (also referred to as a first time point) .
In 520, the control unit 150 may identify one or more objects in the first LiDAR point cloud image.
The one or more objects may include pedestrians, vehicles, obstacles, buildings, signs, traffic lights, animals, or the like, or any combination thereof. In some embodiments, the control unit 150 may identify the regions and types of the one or more objects in 520. In some embodiments, the control unit 150 may only identify the regions. For example, the control unit 150 may identify a first region of the LiDAR point cloud image as a first object, a second region of the LiDAR point cloud image as a second object and remaining regions as ground (or air) . As another example, the control unit 150 may identify the first region as a pedestrian and the second region as a vehicle.
In some embodiments, if the current method is employed by a vehicle-mounted device as a driving aid, the control unit 150 may first determine the heights of the points (or voxels) around the vehicle-mounted base station (e.g., the height of the vehicle on which the vehicle-mounted device is mounted plus the height of the vehicle-mounted device) . The points that are too low (e.g., the ground) or too high (e.g., at a height that is unlikely to be an object to avoid or to consider during driving) may be removed by the control unit 150 before identifying the one or more objects. The remaining points may be clustered into a plurality of clusters. In some embodiments, the remaining points may be clustered based on their 3D coordinates (e.g., cartesian coordinates) in the 3D point cloud image (e.g., points whose distances from each other are less than a threshold are clustered into a same cluster) . In some embodiments, the remaining points may be swing scanned before being clustered into the plurality of clusters. The swing scanning may include converting the remaining points in the 3D point cloud image from a 3D cartesian coordinate system to a polar coordinate system. The polar coordinate system may include an origin or a reference point. The polar coordinate of each of the remaining points may be expressed as a straight-line distance from the origin and an angle from the origin to the point. A graph may be generated based on the polar coordinates of the remaining points (e.g., the angle from the origin as the x-axis or horizontal axis and the distance from the origin as the y-axis or vertical axis) . The points in the graph may be connected to generate a curve that includes sections with large curvatures and sections with small curvatures. Points on a section with a small curvature are likely points on a same object and may be clustered into a same cluster. Points on a section with a large curvature are likely points on different objects and may be clustered into different clusters. Each cluster may correspond to an object. A detailed method of identifying the one or more objects may be found in FIG. 11. In some embodiments, the control unit 150 may obtain a camera image that is taken at a same (or substantially the same or similar) time and angle as the first LiDAR point cloud image. The control unit 150 may identify the one or more objects in the camera image and directly treat them as the one or more objects in the LiDAR point cloud image.
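Below is a minimal sketch of the swing-scanning idea under a simplifying assumption: points are sorted by their angle around the origin, and a new cluster is started wherever the distance from the origin jumps sharply between adjacent angles. This stands in for the curvature criterion described above; the jump threshold and the function name are hypothetical.

```python
# A simplified angle/distance ("swing scan") grouping of 2D projected points.
import numpy as np

def swing_scan_cluster(points_xy, range_jump=1.0):
    """points_xy: (N, 2) array of x, y coordinates relative to the base station."""
    angles = np.arctan2(points_xy[:, 1], points_xy[:, 0])
    ranges = np.linalg.norm(points_xy, axis=1)
    order = np.argsort(angles)                 # sweep the points by angle
    labels = np.empty(len(points_xy), dtype=int)
    current = 0
    labels[order[0]] = current
    for prev, curr in zip(order[:-1], order[1:]):
        # A large change in range between adjacent angles suggests a new object.
        if abs(ranges[curr] - ranges[prev]) > range_jump:
            current += 1
        labels[curr] = current
    return labels

pts = np.array([[5.0, 0.0], [5.0, 0.2], [5.0, 0.4],   # first, nearer object
                [12.0, 6.0], [12.0, 6.4]])            # second, farther object
print(swing_scan_cluster(pts))   # prints [0 0 0 1 1]
```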
In 530, the control unit 150 may determine one or more locations of the one or more objects in the first LiDAR point cloud image. The control unit 150 may consider each identified object separately and perform operation 530 for each of the one or more objects individually. In some embodiments, the locations of the one or more objects may be a geometric center or center of gravity of the clustered region of the one or more objects. In some embodiments, the locations of the one or more objects may be preliminary locations that are adjusted or re-determined after the 3D shapes of the one or more objects are generated in 540. It should be noted that the operations 520 and 530 may be performed in any order, or combined as one operation. For example, the control unit 150 may determine locations of points corresponding to one or more unknown objects, cluster the points into a plurality of clusters and then identify the clusters as objects.
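A minimal sketch of taking the geometric center of a cluster as the preliminary object location described above, assuming each clustered object is given as an array of 3D points. Names are illustrative only.

```python
# Preliminary object location as the geometric center (mean) of a cluster.
import numpy as np

def preliminary_location(cluster_points):
    """Geometric center of one clustered object, given as an (N, 3) array."""
    return cluster_points.mean(axis=0)

cluster = np.array([[10.0, 2.0, 0.5],
                    [10.4, 2.2, 0.6],
                    [10.2, 1.8, 1.1]])
print(preliminary_location(cluster))   # approximately [10.2, 2.0, 0.73]
```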
In some embodiments, the control unit 150 may obtain a camera image. The camera image may be taken by the camera at the same (or substantially the same, or similar) time and angle as the LiDAR point cloud image. The control unit 150 may determine locations of the objects in the camera image based on a neural network (e.g., a tiny yolo network as described in FIG. 10) . The control unit 150 may determine the locations of the one or more objects in the LiDAR point cloud image by mapping locations in the camera image to the LiDAR point cloud image. The mapping of locations from a 2D camera image to a 3D LiDAR point cloud image may include a conic projection, etc.
In some embodiments, the  operations  520 and 530 for identifying the objects and determining the locations of the objects may be referred to as a coarse detection.
In 540, the control unit 150 may generate a 3D shape (e.g., a 3D box) for each of the one or more objects. Detailed methods regarding the generation of the 3D shape for each of the one or more objects may be found elsewhere in the present disclosure (See, e.g., FIG. 13 and the descriptions thereof) . In some embodiments, the operation 540 for generating a 3D shape for the objects may be referred to as a fine detection.
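As a simplified stand-in for the 3D shape generation referred to above (the disclosure's own fitting procedure, detailed with reference to FIG. 13, may differ), the sketch below fits an axis-aligned 3D box to a clustered object. The function name and the sample cluster are hypothetical.

```python
# Fit an axis-aligned 3D box (a simple 3D shape) around one clustered object.
import numpy as np

def axis_aligned_3d_box(cluster_points):
    """Return the 8 corners of the axis-aligned box enclosing an (N, 3) cluster."""
    mn = cluster_points.min(axis=0)
    mx = cluster_points.max(axis=0)
    corners = np.array([[x, y, z]
                        for x in (mn[0], mx[0])
                        for y in (mn[1], mx[1])
                        for z in (mn[2], mx[2])])
    return corners

cluster = np.array([[10.0, 2.0, 0.0], [11.5, 2.8, 0.0], [10.7, 2.3, 1.6]])
print(axis_aligned_3d_box(cluster))  # 8 corners of a 1.5 m x 0.8 m x 1.6 m box
```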
In 550, the control unit 150 may generate a second LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects. For example, the control unit 150 may mark the first LiDAR point cloud image with the 3D shapes of the one or more objects at their corresponding locations to generate the second LiDAR point cloud image.
FIGs. 6A-6C are a series of schematic diagrams of generating and marking a 3D shape of an object in a LiDAR point cloud image according to some embodiments of the present disclosure. As shown in FIG. 6A, a base station (e.g., a rack of the LiDAR device or a vehicle itself) may be mounted on a vehicle 610 to receive a LiDAR point cloud image around the vehicle 610. It can be seen that the laser is blocked at an object 620. The control unit 150 may identify and position the object 620 by a method disclosed in process 500. For example, the control unit 150 may mark the object 620 after identifying and positioning it as shown in FIG. 6B. The control unit 150 may further determine a 3D shape of the object 620 and mark the object 620 with the 3D shape as shown in FIG. 6C.
FIG. 7 is a flowchart illustrating an exemplary process for generating a marked camera image according to some embodiments of the present disclosure. In some embodiments, the process 700 may be implemented in the autonomous vehicle as illustrated in FIG. 1. For example, the process 700 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) . The present disclosure takes the control unit 150 as an example to execute the instruction.
In 710, the control unit 150 may obtain a first camera image. The camera image may be obtained by the camera 410. Merely by way of example, the camera image may be a 2D image, including one or more objects around a vehicle.
In 720, the control unit 150 may identify the one or more objects and the locations of the one or more objects. The identification may be performed based on a neural network. The neural network may include an artificial neural network, a convolutional neural network, a you only look once network, a tiny yolo network, or the like, or any combination thereof. The neural network may be trained by a plurality of camera image samples in which the objects are identified manually or artificially. In some  embodiments, the control unit 150 may input the first camera image into the trained neural network and the trained neural network may output the identifications and locations of the one or more objects.
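A minimal sketch of turning a trained detector's raw outputs into object labels and pixel locations. The assumed output format (per detection: a label, a confidence, and a box normalised to [0, 1] as centre/width/height) is an illustrative convention, not a format specified by the disclosure or by any particular network.

```python
# Convert assumed normalised detector outputs into pixel-space locations.
def detections_to_pixels(raw_detections, img_w, img_h, conf_thresh=0.5):
    results = []
    for label, conf, (cx, cy, w, h) in raw_detections:
        if conf < conf_thresh:
            continue                      # discard low-confidence detections
        results.append({
            "label": label,
            "confidence": conf,
            # convert normalised centre/size into pixel corner coordinates
            "box": (int((cx - w / 2) * img_w), int((cy - h / 2) * img_h),
                    int((cx + w / 2) * img_w), int((cy + h / 2) * img_h)),
        })
    return results

raw = [("car", 0.92, (0.50, 0.60, 0.20, 0.25)),
       ("pedestrian", 0.30, (0.10, 0.40, 0.05, 0.15))]
print(detections_to_pixels(raw, img_w=1280, img_h=720))
```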
In 730, the control unit 150 may generate and mark 2D representations of 3D shapes of the one or more objects in the camera image. In some embodiments, the 2D representations of 3D shapes of the one or more objects may be generated by mapping 3D shapes of the one or more objects in LiDAR point cloud image to the camera image at the corresponding locations of the one or more objects. Detailed methods regarding the generation of the 2D representations of 3D shapes of the one or more objects in the camera image may be found in FIG. 8.
FIG. 8 is a flowchart illustrating an exemplary process for generating 2D representations of 3D shapes of the one or more objects in the camera image according to some embodiments of the present disclosure. In some embodiments, the process 800 may be implemented in the autonomous vehicle as illustrated in FIG. 1. For example, the process 800 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) . The present disclosure takes the control unit 150 as an example to execute the instruction.
In step 810, the control unit 150 may obtain a 2D shape of the one or more target objects in the first camera image.
It should be noted that because the camera only captures objects in a limited view whereas the LiDAR scans 360° around the base station, the first camera image may only include part of all the objects in the first LiDAR point cloud image. For brevity, objects that occur in both the first camera image and the first LiDAR point cloud image may be referred to as target objects in the present application. It should also be noted that a 2D shape described in the present disclosure may include, but is not limited to, a triangle, a rectangle (also referred to as a 2D box) , a square, a circle, an oval, and a polygon. Similarly, a 3D shape described in the present disclosure may include, but is not limited to, a cuboid (also referred to as a 3D box) , a cube, a sphere, a polyhedron, and a cone. A 2D representation of a 3D shape may be a 2D shape that conveys the appearance of the 3D shape.
The 2D shape of the one or more target objects may be generated by executing a neural network. The neural network may include an artificial neural network, a convolutional neural network, a you only look once (yolo) network, a tiny yolo network, or the like, or any combination thereof. The neural network may be trained by a plurality of camera image samples in which 2D shapes, locations, and types of the objects are identified manually or artificially. In some embodiments, the control unit 150 may input the first camera image into the trained neural network and the trained neural network may output the types, locations and 2D shapes of the one or more target objects. In some embodiments, the neural network may generate a camera image in which the one or more objects are marked with 2D shapes (e.g., 2D boxes) based on the first camera image.
In step 820, the control unit 150 may correlate the first camera image with the first LiDAR point cloud image.
For example, a distance between each of the one or more target objects and the base station (e.g., the vehicle or the rack of the LiDAR device and camera on the vehicle) in the first camera image and the first LiDAR point cloud image may be measured and correlated. For example, the control unit 150 may correlate the distance between a target object and the base station in the first camera image with that in the first LiDAR point cloud image. Accordingly, the size of the 2D or 3D shape of the target object in the first camera image may be correlated with that in the first LiDAR point cloud image by the control unit 150. For example, the size of the target object and the distance between the target object and the base station in the first camera image may be proportional to those in the first LiDAR point cloud image. The correlation between the first camera image and the first LiDAR point cloud image may include a mapping relationship or a conversion of coordinates between them. For example, the correlation may include a conversion from a 3D cartesian coordinate system to a 2D plane of a 3D spherical coordinate system centered at the base station.
In step 830, the control unit 150 may generate 2D representations of 3D shapes of the target objects based on the 2D shapes of the target objects and the correlation between the LiDAR point cloud image and the first camera image.
For example, the control unit 150 may first perform a registration between the 2D shapes of the target objects in the camera image and the 3D shapes of the target objects in the LiDAR point cloud image. The control unit 150 may then generate the 2D representations of 3D shapes of the target objects based on the 3D shapes of the target objects in the LiDAR point cloud image and the correlation. For example, the control unit 150 may perform a simulated conic projection from a center at the base station, and generate 2D representations of 3D shapes of the target objects at the plane of the 2D camera image based on the correlation between the LiDAR point cloud image and the first camera image.
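A minimal sketch of the projection step described above, assuming a pinhole camera model with an intrinsic matrix K and 3D box corners already expressed in the camera frame (i.e., the registration between the LiDAR point cloud image and the camera image has been performed). K, the image centre, and the corner values are illustrative assumptions; the disclosure describes the projection more generally as a simulated conic projection centered at the base station.

```python
# Project the corners of a 3D box onto the image plane (pinhole model).
import numpy as np

K = np.array([[800.0,   0.0, 640.0],   # fx,  0, cx
              [  0.0, 800.0, 360.0],   #  0, fy, cy
              [  0.0,   0.0,   1.0]])

def project_corners(corners_3d):
    """Project (N, 3) camera-frame points onto the image plane in pixels."""
    uvw = (K @ corners_3d.T).T          # homogeneous image coordinates
    return uvw[:, :2] / uvw[:, 2:3]     # divide by depth

# Eight corners of a 3D box roughly 10 m in front of the camera
box = np.array([[x, y, 10.0 + z]
                for x in (-1.0, 1.0) for y in (-0.5, 0.5) for z in (0.0, 2.0)])
print(project_corners(box).round(1))   # eight 2D points forming the box outline
```

Connecting the eight projected points with line segments yields the 2D representation of the 3D shape that can be marked on the camera image.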
In step 840, the control unit 150 may generate a second camera image by marking the one or more target objects in the first camera image based on their 2D representations of 3D shapes and the identified location in the first camera image.
FIGs. 9A and 9B are schematic diagrams of the same 2D camera image of a car according to some embodiments of the present disclosure. As shown in FIG. 9A, a vehicle 910 is identified and positioned, and a 2D box is marked on it. In some embodiments, the control unit 150 may perform a method disclosed in the present application (e.g., process 800) to generate a 2D representation of a 3D box of the car. The 2D representation of the 3D box of the car is marked on the car as shown in FIG. 9B. Compared with FIG. 9A, FIG. 9B indicates not only the size of the car but also the depth of the car along an axis perpendicular to the plane of the camera image, and thus makes the location of the car easier to understand.
FIG. 10 is a schematic diagram of a you only look once (yolo) network according to some embodiments of the present disclosure. A yolo network may be a neural network that divides a camera image into regions and predicts bounding boxes and probabilities for each region. The yolo network may be a multilayer neural network (e.g., including multiple layers) . The multiple layers may include at least one convolutional layer (CONV) , at least one pooling layer (POOL) , and at least one fully connected layer (FC) . The multiple layers of the yolo network may correspond to neurons arranged in multiple dimensions, including but not limited to width, height, center coordinate, confidence, and classification.
The CONV layer may connect neurons to local regions in the input and compute the outputs of those neurons, each neuron computing a dot product between its weights and the local region it is connected to. The POOL layer may perform a down-sampling operation along the spatial dimensions (width, height), resulting in a reduced volume. The function of the POOL layer may include progressively reducing the spatial size of the representation to reduce the number of parameters and the amount of computation in the network, and hence also to control overfitting. The POOL layer operates independently on every depth slice of the input and resizes it spatially, e.g., using the MAX operation. In some embodiments, each neuron in the FC layer may be connected to all the values in the previous volume, and the FC layer may compute the classification scores.
As shown in FIG. 10, 1010 may be an initial image in a volume of, e.g., [448*448*3], wherein “448” relates to the resolution (or number of pixels) and “3” relates to the channels (the three RGB channels). Images 1020-1070 may be intermediate images generated by multiple CONV layers and POOL layers. It may be noticed that the spatial size of the image decreases and the depth (number of channels) increases from image 1010 to image 1070. The volume of image 1070 may be [7*7*1024], and the spatial size of image 1070 may not be reduced any further by extra CONV layers. Two fully connected layers may be arranged after 1070 to generate images 1080 and 1090. Image 1090 may divide the original image into 49 regions, each region containing 30 dimensions and being responsible for predicting a bounding box. In some embodiments, the 30 dimensions may include x, y, width, and height of the bounding box’s rectangle, a confidence score, and a probability distribution over 20 classes. If a region is responsible for predicting more than one bounding box, the dimension may be multiplied by the corresponding number. For example, if a region is responsible for predicting 5 bounding boxes, the dimension of 1090 may be 150.
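Merely by way of illustration, the [7*7*30] output described above may be decoded into bounding boxes as sketched below. The per-cell layout (x, y, width, height, confidence, followed by 20 class probabilities), the interpretation of x and y as offsets within a cell, and the confidence threshold are assumptions of the sketch rather than details given in the present disclosure.

```python
import numpy as np

def decode_yolo_output(grid, conf_thresh=0.2):
    """Decode a [7, 7, 30] YOLO-style output grid into detections."""
    S = grid.shape[0]
    detections = []
    for row in range(S):
        for col in range(S):
            cell = grid[row, col]
            x, y, w, h, conf = cell[:5]           # box parameters and confidence
            class_probs = cell[5:25]              # probability distribution over 20 classes
            cls = int(np.argmax(class_probs))
            score = conf * class_probs[cls]
            if score < conf_thresh:
                continue
            # Convert the cell-relative center to image-relative coordinates.
            cx, cy = (col + x) / S, (row + y) / S
            detections.append((cx, cy, w, h, float(score), cls))
    return detections

boxes = decode_yolo_output(np.random.rand(7, 7, 30))
```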
A Tiny-YOLO network may be a network with a structure similar to that of a YOLO network but with fewer layers, e.g., fewer convolutional layers and fewer pooling layers. The Tiny-YOLO network may be based on the Darknet reference network and may be much faster, but less accurate, than a full YOLO network.
FIG. 11 is a flowchart illustrating an exemplary process for identifying the objects in a LiDAR point cloud image according to some embodiments of the present disclosure. In some embodiments, the process 1100 may be implemented in the autonomous vehicle as illustrated in FIG. 1. For example, the process 1100 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) . The  present disclosure takes the control unit 150 as an example to execute the instruction.
In 1110, the control unit 150 may obtain coordinates of a plurality of points (or voxels) in the LiDAR point cloud image (e.g., the first LiDAR point cloud image). The coordinate of each of the plurality of points may be a relative coordinate with respect to an origin (e.g., the base station or the source of the laser beam).
In 1120, the control unit 150 may remove uninterested points from the plurality of points according to their coordinates. In a scenario of using the present application as a driving aid, the uninterested points may be points at positions that are too low (e.g., the ground) or too high (e.g., at a height that cannot correspond to an object to avoid or to consider when driving) in the LiDAR point cloud image.
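Merely by way of illustration, the removal of uninterested points may be sketched as a simple height filter; the z-axis convention and the thresholds below are illustrative assumptions, not values given in the present disclosure.

```python
import numpy as np

def remove_uninterested_points(points, z_min=-1.5, z_max=2.5):
    """Drop points whose height is too low (e.g., ground returns) or too
    high to matter for driving. `points` is an Nx3 array of (x, y, z)
    coordinates relative to the base station; the thresholds are
    illustrative values only."""
    z = points[:, 2]
    keep = (z > z_min) & (z < z_max)
    return points[keep]
```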
In 1130, the control unit 150 may cluster the remaining points in the plurality of points in the LiDAR point cloud image into one or more clusters based on a point cloud clustering algorithm. In some embodiments, a spatial distance (or a Euclidean distance) between any two of the remaining points in a 3D Cartesian coordinate system may be measured and compared with a threshold. If the spatial distance between two points is less than or equal to the threshold, the two points are considered to be from the same object and are clustered into the same cluster. The threshold may vary dynamically based on the distances between the remaining points. In some embodiments, the remaining points may be swing scanned before being clustered into the one or more clusters. The swing scanning may include converting the remaining points in the 3D point cloud image from a 3D Cartesian coordinate system to a polar coordinate system. The polar coordinate system may include an origin or a reference point. The polar coordinate of each of the remaining points may be expressed as a straight-line distance from the origin and an angle with respect to the origin. A graph may be generated based on the polar coordinates of the remaining points (e.g., with the angle from the origin as the x-axis or horizontal axis and the distance from the origin as the y-axis or vertical axis). The points in the graph may be connected to generate a curve that includes sections with large curvatures and sections with small curvatures. Points on a section of the curve with a small curvature are likely points on the same object and may be clustered into the same cluster. Points on a section of the curve with a large curvature are likely points on different objects and may be clustered into different clusters. As another example, the point cloud clustering algorithm may include employing a pre-trained clustering model. The clustering model may include a plurality of classifiers with pre-trained parameters. The clustering model may be further updated when clustering the remaining points.
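Merely by way of illustration, the distance-threshold clustering described above may be sketched as a region-growing Euclidean clustering. The fixed threshold, the minimum cluster size, and the use of a k-d tree are assumptions of the sketch; the disclosure allows the threshold to vary dynamically.

```python
import numpy as np
from scipy.spatial import cKDTree

def euclidean_cluster(points, threshold=0.8, min_points=5):
    """Group the remaining Nx3 points into clusters: two points whose
    spatial distance is below `threshold` (an illustrative value) end up
    in the same cluster via region growing."""
    tree = cKDTree(points)
    labels = np.full(len(points), -1, dtype=int)
    current = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        queue = [seed]
        labels[seed] = current
        while queue:
            idx = queue.pop()
            for nb in tree.query_ball_point(points[idx], threshold):
                if labels[nb] == -1:
                    labels[nb] = current
                    queue.append(nb)
        current += 1
    clusters = [points[labels == k] for k in range(current)]
    return [c for c in clusters if len(c) >= min_points]

clusters = euclidean_cluster(np.random.rand(500, 3) * 20.0)
```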
In 1140, the control unit 150 may select at least one of the one or more clusters as a target cluster. For example, some of the one or more clusters may not be at the size of any meaningful object, such as the size of a leaf, a plastic bag, or a water bottle, and may be removed. In some embodiments, only a cluster that satisfies a predetermined size of the objects may be selected as a target cluster.
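Merely by way of illustration, the selection of target clusters may be sketched as a check of each cluster's spatial extent against an assumed size range; the bounds below are illustrative and are not specified in the present disclosure.

```python
import numpy as np

def select_target_clusters(clusters, min_dims=(0.3, 0.3, 0.5), max_dims=(15.0, 4.0, 4.5)):
    """Keep only clusters whose axis-aligned extent falls within an assumed
    size range of meaningful objects; tiny clusters (a leaf, a plastic bag)
    and implausibly large ones are discarded."""
    targets = []
    for cluster in clusters:
        extent = cluster.max(axis=0) - cluster.min(axis=0)  # (dx, dy, dz) of the cluster
        if np.all(extent >= min_dims) and np.all(extent <= max_dims):
            targets.append(cluster)
    return targets
```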
FIGs. 12A-12E are a series of schematic diagrams of identifying an object in a LiDAR point cloud image according to some embodiments of the present disclosure. FIG. 12A is a schematic LiDAR point cloud image around a vehicle 1210. The control unit 150 may obtain the coordinates of the points in FIG. 12A and may remove points that are too low or too high to generate FIG. 12B. Then the control unit 150 may swing scan the points in FIG. 12B and measure a distance and angle of each of the points in FIG. 12B from a reference point or origin as shown in FIG. 12C. The control unit 150 may further cluster the points into one or more clusters based on the distances and angles as shown in FIG. 12D. The control unit 150 may extract each of the one or more clusters individually as shown in FIG. 12E and generate a 3D shape of the object in the extracted cluster. Detailed methods regarding the generation of the 3D shape of the object in the extracted cluster may be found elsewhere in the present disclosure (see, e.g., FIG. 13 and the descriptions thereof).
FIG. 13 is a flowchart illustrating an exemplary process for generating a 3D shape of an object in a LiDAR point cloud image according to some embodiments of the present disclosure. In some embodiments, the process 1300 may be implemented in the autonomous vehicle as illustrated in FIG. 1. For example, the process 1300 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) . The present disclosure takes the control unit 150 as an example to execute the instruction.
In 1310, the control unit 150 may determine a preliminary 3D shape of the object.
The preliminary 3D shape may be a voxel, a cuboid (also referred to as a 3D box), a cube, etc. In some embodiments, the control unit 150 may determine a center point of the object. The center point of the object may be determined based on the coordinates of the points in the object. For example, the control unit 150 may determine the center point as the average value of the coordinates of the points in the object. Then the control unit 150 may place the preliminary 3D shape at the center point of the object (e.g., the clustered and extracted LiDAR point cloud image of the object). For example, a cuboid of a preset size may be placed at the center point of the object by the control unit 150.
Because the LiDAR point cloud image only includes points on the surfaces of objects that reflect a laser beam, the points only reflect the surface shape of the objects. In an ideal situation, without considering errors and variations of the points, the distribution of the points of an object would lie tightly along the contour of the shape of the object: no points would be inside the contour and no points would be outside the contour. In reality, however, because of measurement errors, the points are scattered around the contour. Therefore, a shape proposal may be needed to identify a rough shape of the object for the purpose of autonomous driving. To this end, the control unit 150 may tune the 3D shape to obtain an ideal size, shape, orientation, and position, and use the 3D shape as the shape proposal.
In 1320, the control unit 150 may adjust at least one of parameters including a height, a width, a length, a yaw, or an orientation of the preliminary 3D shape to generate a 3D shape proposal. In some embodiments, operation 1320 (and operations 1330 and 1340) may be performed iteratively. In each iteration, one or more of the parameters may be adjusted. For example, the height of the 3D shape is adjusted in the first iteration, and the length of the 3D shape is adjusted in the second iteration. As another example, both the height and length of the 3D shape are adjusted in the first iteration, and the height and width of the 3D shape are adjusted in the second iteration. The adjustment of a parameter may be an increment or a decrement. Also, the adjustment of a parameter in each iteration may be the same or different. In some embodiments, the adjustment of the height, width, length, and yaw may be performed based on a grid search method.
An ideal shape proposal should serve as a reliable reference shape for the autonomous vehicle to plan its driving path. For example, when the autonomous vehicle determines to pass the object using the shape proposal as the description of the object, the planned driving path should guarantee that the vehicle safely drives around the object while turning to the left or right as little as possible, so that the driving is as smooth as possible. As a result, the shape proposal is not required to describe the shape of the object precisely, but it must be big enough to cover the object so that the autonomous vehicle may reliably rely on the shape proposal to determine a driving path without colliding with the object. However, the shape proposal should not be unnecessarily big either, which would reduce the efficiency of the driving path in passing around the object.
Accordingly, the control unit 150 may evaluate a loss function, which serves as a measure of how well the shape proposal describes the object for the purpose of autonomous driving path planning. The smaller the score or value of the loss function, the better the shape proposal describes the object.
In 1330, the control unit 150 may calculate a score (or a value) of the loss function of the 3D shape proposal. Merely by way of example, the loss function may include three parts: Linbox, Lsuf and Lother. For example, the loss function of the 3D shape proposal may be expressed as follows:
L= (Linbox+Lsuf) /N+Lother   (1)
Linbox=∑P_all dis   (2)
Lsuf (car) =∑P_out m*dis+∑P_in n*dis  (3)
Lsuf (ped) =∑P_out a*dis+∑P_in b*dis+∑P_behind c*dis   (4)
Lother=f (N) +Lmin (V)    (5)
Here L may denote an overall score of the 3D shape proposal, Linbox may denote a score of the 3D shape proposal relating to the number of points of the object inside the 3D shape proposal. Lsuf may denote a score describing how close the 3D shape proposal is to the true shape of the object, measured by distances of the points to the surface of the shape proposal. Thus a smaller score of Lsuf means the 3D shape proposal is closer to the surface shape or contour of the object. Further, Lsuf (car) may denote a score of the 3D shape proposal relating to distances between points of a car and the surface of the 3D shape proposal, Lsuf (ped) may denote a score of the 3D shape proposal relating to distances between points of a pedestrian and the surface of the 3D shape proposal and Lother may denote a score of the 3D  shape proposal due to other bonuses or penalties.
Further, N may denote the number of points, P_all may denote all the points of the object, P_out may denote points outside the 3D shape proposal, P_in may denote points inside the 3D shape proposal, P_behind may denote points behind the 3D shape proposal (e.g., points on the back side of the 3D shape proposal), and dis may denote the distance from a point of the object to the surface of the 3D shape proposal. In some embodiments, m, n, a, b, and c are constants. For example, m may be 2.0, n may be 1.5, a may be 2.0, b may be 0.6, and c may be 1.2.
Linbox may be configured to minimize the number of points inside the 3D shape proposal. Therefore, the fewer the points inside, the smaller the score of Linbox. Lsuf may be configured to encourage a shape and orientation of the 3D shape proposal such that as many points as possible are close to the surface of the 3D shape proposal. Accordingly, the smaller the accumulated distances of the points to the surface of the 3D shape proposal, the smaller the score of Lsuf. Lother is configured to encourage a dense cluster of points, i.e., a larger number of points in the cluster and a smaller volume of the 3D shape proposal. Accordingly, f (N) is defined as a function of the total number of points in the 3D shape proposal, i.e., the more points in the 3D shape proposal, the smaller the score of f (N) ; and Lmin (V) is defined as a restraint on the volume of the 3D shape proposal, which tries to minimize the volume of the 3D shape proposal, i.e., the smaller the volume of the 3D shape proposal, the smaller the score of Lmin (V) .
Accordingly, the loss function L in equation (1) incorporates balanced consideration of different factors that encourage the 3D shape proposal to be close to the contour of the object without being unnecessarily big.
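Merely by way of illustration, equations (1) - (3) and (5) may be evaluated for a cuboid proposal as sketched below. The distance-to-surface computation, the reading of Linbox as the accumulated distance of the points lying inside the proposal, and the placeholder forms of f (N) and Lmin (V) are assumptions of the sketch; the disclosure does not specify f (N) or Lmin (V) .

```python
import numpy as np

def surface_distances(points, center, dims, yaw):
    """Per-point distance to the surface of a cuboid proposal, plus a mask
    of which points fall inside. dims = (length, width, height)."""
    c, s = np.cos(-yaw), np.sin(-yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    local = (points - np.asarray(center)) @ rot.T    # into the box frame
    half = np.asarray(dims) / 2.0
    excess = np.abs(local) - half
    outside = np.any(excess > 0.0, axis=1)
    d_out = np.linalg.norm(np.maximum(excess, 0.0), axis=1)
    d_in = np.min(half - np.abs(local), axis=1)      # inside: gap to the nearest face
    dis = np.where(outside, d_out, d_in)
    return dis, ~outside

def proposal_loss(points, center, dims, yaw, m=2.0, n=1.5, beta=0.05):
    """Illustrative evaluation of equations (1)-(3) and (5) for a car-like
    object. f(N) and Lmin(V) are not specified in the disclosure, so simple
    decreasing/increasing placeholders are used here."""
    dis, inside = surface_distances(points, center, dims, yaw)
    N = len(points)
    l_inbox = dis[inside].sum()                              # one reading of Eq. (2)
    l_suf = m * dis[~inside].sum() + n * dis[inside].sum()   # Eq. (3)
    volume = float(np.prod(dims))
    l_other = 1.0 / (1.0 + inside.sum()) + beta * volume     # placeholder Eq. (5)
    return (l_inbox + l_suf) / N + l_other                   # Eq. (1)
```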
In 1340, the control unit 150 may determine whether the score of the 3D shape proposal satisfies a preset condition. The preset condition may include that the score is less than or equal to a threshold, that the score does not change over a number of iterations, that a certain number of iterations has been performed, etc. In response to the determination that the score of the 3D shape proposal does not satisfy the preset condition, the process 1300 may proceed back to 1320; otherwise, the process 1300 may proceed to 1360.
In 1320, the control unit 150 may further adjust the 3D shape proposal. In some embodiments, the parameters that are adjusted in subsequent iterations may be different from those in the current iteration. For example, the control unit 150 may perform a first set of adjustments on the height of the 3D shape proposal in the first five iterations. After finding that the score of the 3D shape proposal cannot be reduced below the threshold by adjusting only the height, the control unit 150 may perform a second set of adjustments on the width, the length, and the yaw of the 3D shape proposal in the next 10 iterations. If the score of the 3D shape proposal is still higher than the threshold after the second set of adjustments, the control unit 150 may perform a third set of adjustments on the orientation (e.g., the location or center point) of the 3D shape proposal. It should be noted that the adjustments of the parameters may be performed in any order, and the number and type of parameters in each adjustment may be the same or different.
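Merely by way of illustration, the iterative grid search of operations 1320-1340 may be sketched as follows, reusing the proposal_loss sketch given above. The step sizes, the threshold, and the maximum number of iterations are illustrative; the sketch perturbs the dimensions and yaw, and the center or orientation could be perturbed in the same way in a later set of adjustments.

```python
from itertools import product

def refine_proposal(points, center, dims, yaw, threshold=1.0, max_iter=20):
    """Grid-search refinement of the shape proposal: at each iteration,
    perturb the length/width/height and yaw, keep the best-scoring
    candidate, and stop once the score satisfies the preset condition."""
    steps = (-0.2, 0.0, 0.2)
    best = (proposal_loss(points, center, dims, yaw), center, dims, yaw)
    for _ in range(max_iter):
        score, center, dims, yaw = best
        if score <= threshold:
            break                                    # preset condition satisfied
        for dl, dw, dh, dyaw in product(steps, steps, steps, (-0.1, 0.0, 0.1)):
            cand_dims = (dims[0] + dl, dims[1] + dw, dims[2] + dh)
            if min(cand_dims) <= 0.0:
                continue                             # skip degenerate boxes
            cand = (proposal_loss(points, center, cand_dims, yaw + dyaw),
                    center, cand_dims, yaw + dyaw)
            if cand[0] < best[0]:
                best = cand
    return best
```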
In 1360, the control unit 150 may determine the 3D shape proposal as the 3D shape of the object (or nominal 3D shape of the object) .
FIGs. 14A-14D are a series of schematic diagrams of generating a 3D shape of an object in a LiDAR point cloud image according to some embodiments of the present disclosure. FIG. 14A is a clustered and extracted LiDAR point cloud image of an object. The control unit 150 may generate a preliminary 3D shape and may adjust a height, a width, a length, and a yaw of the preliminary 3D shape to generate a 3D shape proposal as shown in FIG. 14B. After the adjustment of the height, width, length, and yaw, the control unit 150 may further adjust the orientation of the 3D shape proposal as shown in FIG. 14C. Finally, a 3D shape proposal that satisfies a preset condition as described in the process 1300 may be determined as the 3D shape of the object and may be marked on the object as shown in FIG. 14D.
FIG. 15 is a flow chart illustrating an exemplary process for generating a compensated image according to some embodiments of the present disclosure. In some embodiments, the process 1500 may be implemented in the autonomous vehicle as illustrated in FIG. 1. For example, the process 1500 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) . The present disclosure takes the control unit 150 as an example to execute the instruction.
In 1510, the control unit 150 may obtain a first radar image around a base station. The first radar image may be generated by the radar device 430. Compared with the LiDAR device 420, the radar device 430 may be less precise (with lower resolution) but may have a wider detection range. For example, the LiDAR device 420 may only receive a reflected laser beam of reasonable quality from an object within 35 meters, whereas the radar device 430 may receive reflected radio waves from an object hundreds of meters away.
In 1520, the control unit 150 may identify the one or more objects in the first radar image. The method of identifying the one or more objects in the first radar image may be similar to that of the first LiDAR point cloud image, and is not repeated herein.
In 1530, the control unit 150 may determine one or more locations of the one or more objects in the first radar image. The method of determining the one or more locations of the one or more objects in the first radar image may be similar to that in the first LiDAR point cloud image, and is not repeated  herein.
In 1540, the control unit 150 may generate a 3D shape for each of the one or more objects in the first radar image. In some embodiments, the method of generating the 3D shape for each of the one or more objects in the first radar image may be similar to that in the first LiDAR point cloud image. In some other embodiments, the control unit 150 may obtain the dimensions and center point of a front surface of each of the one or more objects. The 3D shape of an object may then be generated simply by extending the front surface in the direction of the body of the object.
In 1550, the control unit 150 may mark the one or more objects in the first Radar image based on the locations and the 3D shapes of the one or more objects in the first Radar image to generate a second Radar image.
In 1560, the control unit 150 may fuse the second Radar image and the second LiDAR point cloud image to generate a compensated image. In some embodiments, the LiDAR point cloud image may have higher resolution and reliability near the base station than the radar image, and the radar image may have higher resolution and reliability far from the base station than the LiDAR point cloud image. For example, the control unit 150 may divide the second radar image and the second LiDAR point cloud image into three sections: 0 to 30 meters, 30 to 50 meters, and more than 50 meters from the base station. The second radar image and the second LiDAR point cloud image may be fused in a manner in which only the LiDAR point cloud image is retained from 0 to 30 meters and only the radar image is retained beyond 50 meters. In some embodiments, the greyscale values of voxels from 30 to 50 meters in the second radar image and the second LiDAR point cloud image may be averaged.
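Merely by way of illustration, the distance-banded fusion may be sketched on two bird's-eye-view greyscale grids as follows. The grid representation, the cell size, and the assumption that both grids share the same shape and are centered on the base station are assumptions of the sketch rather than details of the present disclosure.

```python
import numpy as np

def fuse_lidar_radar(lidar_img, radar_img, cell_size=0.5, near=30.0, far=50.0):
    """Fuse two greyscale grids of the same shape, assumed to be centered
    on the base station with `cell_size` meters per cell: keep LiDAR within
    `near` meters, radar beyond `far` meters, and the average in between."""
    h, w = lidar_img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.hypot(ys - h / 2.0, xs - w / 2.0) * cell_size   # meters from the base station
    fused = np.where(dist <= near, lidar_img,
                     np.where(dist > far, radar_img,
                              (lidar_img + radar_img) / 2.0))
    return fused

fused = fuse_lidar_radar(np.random.rand(256, 256), np.random.rand(256, 256))
```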
FIG. 16 is a schematic diagram of a synchronization between a camera, a LiDAR device, and/or a radar device according to some embodiments of the present disclosure. As shown in FIG. 16, the frame rates of a camera (e.g., the camera 410), a LiDAR device (e.g., the LiDAR device 420), and a radar device (e.g., the radar device 430) are different. Assuming that the camera, the LiDAR device, and the radar device start to work simultaneously at a first time frame T1, a camera image, a LiDAR point cloud image, and a radar image may be generated roughly at the same time (e.g., synchronized). However, the subsequent images are not synchronized due to the different frame rates. In some embodiments, the device with the slowest frame rate among the camera, the LiDAR device, and the radar device may be determined (in the example of FIG. 16, it is the camera). The control unit 150 may record each of the time frames of the camera images that the camera captured and may search for LiDAR images and radar images that are close in time to each of the time frames of the camera images. For each of the time frames of the camera images, a corresponding LiDAR image and a corresponding radar image may be obtained. For example, if a camera image 1610 is obtained at T2, the control unit 150 may search for a LiDAR image and a radar image that are closest to T2 (e.g., the LiDAR image 1620 and the radar image 1630). The camera image and the corresponding LiDAR image and radar image are extracted as a set. The three images in a set are assumed to be obtained at the same time and to be synchronized.
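Merely by way of illustration, the nearest-timestamp matching may be sketched as follows; the frame rates in the usage example are illustrative and are not taken from the present disclosure.

```python
import numpy as np

def synchronize(camera_ts, lidar_ts, radar_ts):
    """For each camera time frame (the slowest device in FIG. 16), pick the
    LiDAR and radar frames whose timestamps are closest, and return the
    index triples treated as one synchronized set."""
    lidar_ts = np.asarray(lidar_ts)
    radar_ts = np.asarray(radar_ts)
    sets = []
    for i, t in enumerate(camera_ts):
        j = int(np.argmin(np.abs(lidar_ts - t)))   # closest LiDAR frame
        k = int(np.argmin(np.abs(radar_ts - t)))   # closest radar frame
        sets.append((i, j, k))
    return sets

# e.g., a 10 Hz camera, a 20 Hz LiDAR device, and a 25 Hz radar device starting at T1 = 0.0
sets = synchronize(np.arange(0, 1, 0.10), np.arange(0, 1, 0.05), np.arange(0, 1, 0.04))
```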
FIG. 17 is a flow chart illustrating an exemplary process for generating a LiDAR point cloud image or a video based on existing LiDAR point cloud images according to some embodiments of the present disclosure. In some embodiments, the process 1700 may be implemented in the autonomous vehicle as illustrated in FIG. 1. For example, the process 1700 may be stored in the storage 220 and/or other storage (e.g., the ROM 330, the RAM 340) as a form of instructions, and invoked and/or executed by a processing unit (e.g., the processor 320, the control unit 150, one or more microchips of the control unit 150) . The present disclosure takes the control unit 150 as an example to execute the instruction.
In 1710, the control unit 150 may obtain two first LiDAR point cloud images around a base station at two different time frames. The two first LiDAR point cloud images may be taken successively by the same LiDAR device at the two different time frames.
In 1720, the control unit 150 may generate two second LiDAR point cloud images based on the two first LiDAR point cloud images. The method of generating the two second LiDAR point cloud images from the two first LiDAR point cloud images may be found in process 500.
In 1730, the control unit 150 may generate a third LiDAR point cloud image at a third time frame based on the two second LiDAR point cloud images by an interpolation method.
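Merely by way of illustration, one possible interpolation is to linearly interpolate the marked 3D boxes of the objects between the two second LiDAR point cloud images. The box parameterization and the assumption that the objects have already been associated one-to-one across the two frames are assumptions of the sketch.

```python
import numpy as np

def interpolate_boxes(boxes_t1, boxes_t2, t1, t2, t3):
    """Linearly interpolate the marked 3D boxes between two second LiDAR
    point cloud images to a third time frame t3 (t1 <= t3 <= t2). Each box
    is (cx, cy, cz, length, width, height, yaw)."""
    alpha = (t3 - t1) / (t2 - t1)
    b1 = np.asarray(boxes_t1, dtype=float)
    b2 = np.asarray(boxes_t2, dtype=float)
    return (1.0 - alpha) * b1 + alpha * b2

# One car tracked across two frames, interpolated halfway between them.
mid = interpolate_boxes([[12.0, 1.0, 0.8, 4.5, 1.8, 1.5, 0.20]],
                        [[13.0, 1.1, 0.8, 4.5, 1.8, 1.5, 0.25]],
                        t1=0.0, t2=0.1, t3=0.05)
```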
FIG. 18 is a schematic diagram of validating and interpolating frames of images according to some embodiments of the present disclosure. As shown in FIG. 18, the radar images, the camera images, and the LiDAR images are synchronized (e.g., by a method disclosed in FIG. 16). Additional camera images are generated between existing camera images by an interpolation method. The control unit 150 may generate a video based on the camera images. In some embodiments, the control unit 150 may validate and modify each frame of the camera images, LiDAR images, and/or radar images based on historical information. The historical information may include the same or different types of images in the preceding frame or previous frames. For example, if a car is not properly identified and positioned in a particular frame of a camera image, but all of the previous five frames correctly identified and positioned the car, the control unit 150 may modify the camera image at the incorrect frame based on the camera images at the previous frames and the LiDAR images and/or radar images at the incorrect frame and the previous frames.
Having thus described the basic concepts, it may be rather apparent to those skilled in the art after reading this detailed disclosure that the foregoing detailed disclosure is intended to be presented by way of example only and is not limiting. Various alterations, improvements, and modifications may occur to and are intended for those skilled in the art, though not expressly stated herein. These alterations, improvements, and modifications are intended to be suggested by this disclosure, and are within the spirit and scope of the exemplary embodiments of this disclosure.
Moreover, certain terminology has been used to describe embodiments of the present disclosure. For example, the terms “one embodiment, ” “an embodiment, ” and/or “some embodiments” mean that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Therefore, it is emphasized and should be appreciated that two or more references to “an embodiment” or “one embodiment” or “an alternative embodiment” in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the present disclosure.
Further, it will be appreciated by one skilled in the art that aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in an implementation combining software and hardware that may all generally be referred to herein as a “unit, ” “module, ” or “system. ” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
A non-transitory computer readable signal medium may include a  propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB. NET, Python or the like, conventional procedural programming languages, such as the "C" programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN) , or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a Software as a Service (SaaS) .
Furthermore, the recited order of processing elements or sequences,  or the use of numbers, letters, or other designations therefore, is not intended to limit the claimed processes and methods to any order except as may be specified in the claims. Although the above disclosure discusses through various examples what is currently considered to be a variety of useful embodiments of the disclosure, it is to be understood that such detail is solely for that purpose, and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover modifications and equivalent arrangements that are within the spirit and scope of the disclosed embodiments. For example, although the implementation of various components described above may be embodied in a hardware device, it may also be implemented as a software only solution, e.g., an installation on an existing server or mobile device.
Similarly, it should be appreciated that in the foregoing description of embodiments of the present disclosure, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure aiding in the understanding of one or more of the various inventive embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, inventive embodiments lie in less than all features of a single foregoing disclosed embodiment.
In some embodiments, the numbers expressing quantities, properties, and so forth, used to describe and claim certain embodiments of the application are to be understood as being modified in some instances by the term “about, ” “approximate, ” or “substantially. ” For example, “about, ” “approximate, ” or “substantially” may indicate ±20%variation of the value it describes, unless otherwise stated. Accordingly, in some embodiments, the numerical parameters set forth in the written description and attached claims are approximations that may vary depending upon the desired properties  sought to be obtained by a particular embodiment. In some embodiments, the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the application are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable.
Each of the patents, patent applications, publications of patent applications, and other material, such as articles, books, specifications, publications, documents, things, and/or the like, referenced herein is hereby incorporated herein by this reference in its entirety for all purposes, excepting any prosecution file history associated with same, any of same that is inconsistent with or in conflict with the present document, or any of same that may have a limiting effect as to the broadest scope of the claims now or later associated with the present document. By way of example, should there be any inconsistency or conflict between the description, definition, and/or the use of a term associated with any of the incorporated material and that associated with the present document, the description, definition, and/or the use of the term in the present document shall prevail.
In closing, it is to be understood that the embodiments of the application disclosed herein are illustrative of the principles of the embodiments of the application. Other modifications that may be employed may be within the scope of the application. Thus, by way of example, but not of limitation, alternative configurations of the embodiments of the application may be utilized in accordance with the teachings herein. Accordingly, embodiments of the present application are not limited to that precisely as shown and described.

Claims (23)

  1. A system for driving aid, comprising a control unit including:
    one or more storage media including a set of instructions for identifying and positioning one or more objects around a vehicle; and
    one or more microchips electronically connected to the one or more storage media, wherein during operation of the system, the one or more microchips execute the set of instructions to:
    obtain a first Light Detection and Ranging (LiDAR) point cloud image around a detection base station;
    identify one or more objects in the first LiDAR point cloud image;
    determine one or more locations of the one or more objects in the first LiDAR point cloud image;
    generate a 3D shape for each of the one or more objects; and
    generate a second LiDAR point cloud image by marking the one or more objects in the first LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects.
  2. The system of claim 1, further comprising:
    at least one LiDAR device in communication with the control unit to send the first LiDAR point cloud image to the control unit;
    at least one camera in communication with the control unit to send a camera image to the control unit; and
    at least one radar device in communication with the control unit to send a radar image to the control unit.
  3. The system of claim 1, wherein the base station is a vehicle; and the system further comprising:
    at least one LiDAR device mounted on a steering wheel, a cowl or  reflector of the vehicle, wherein the mounting of the at least one LiDAR device includes at least one of an adhesive bonding, a bolt and nut connection, a bayonet fitting, or a vacuum fixation.
  4. The system of claim 1, wherein the one or more microchips further:
    obtain a first camera image including at least one of the one or more objects;
    identify at least one target object of the one or more objects in the first camera image and at least one target location of the at least one target object in the first camera image; and
    generate a second camera image by marking the at least one target object in the first camera image based on the at least one target location in the first camera image and the 3D shape of the at least one target object in the second LiDAR point cloud image.
  5. The system of claim 4, wherein in marking the at least one target object in the first camera image, the one or more microchips further:
    obtain a 2D shape of the at least one target object in the first camera image;
    correlate the second LiDAR point cloud image with the first camera image;
    generate a 3D shape of the at least one target object in the first camera image based on the 2D shape of the at least one target object and the correlation between the second LiDAR point cloud image and the first camera image;
    generate a second camera image by marking the at least one target object in the first camera image based on the identified location in the first camera image and the 3D shape of the at least one target object in the first camera image.
  6. The system of claim 4, wherein to identify the at least one target object in the first camera image and the location of the at least one target object in the first camera image, the one or more microchips operate a you only look once (YOLO) network or a Tiny-YOLO network to identify the at least one target object in the first camera image and the location of the at least one target object in the first camera image.
  7. The system of claim 1, wherein to identify the one or more objects in the first LiDAR point cloud image, the one or more microchips further:
    obtain coordinates of a plurality of points in the first LiDAR point cloud image, wherein the plurality of points includes uninterested points and remaining points;
    remove the uninterested points from the plurality of points according to the coordinates;
    cluster the remaining points into one or more clusters based on a point cloud clustering algorithm; and
    select at least one of the one or more clusters as a target cluster, each of the target cluster corresponding to an object.
  8. The system of claim 1, wherein to generate a 3D shape for each of the one or more objects, the one or more microchips further:
    determine a preliminary 3D shape of the object;
    adjust at least one of a height, a width, a length, a yaw, or an orientation of the preliminary 3D shape to generate a 3D shape proposal;
    calculate a score of the 3D shape proposal;
    determine whether the score of the 3D shape proposal satisfies a preset condition;
    in response to the determination that the score of the 3D shape proposal does not satisfy a preset condition, further adjust the 3D shape proposal; and
    in response to the determination that the score of the 3D shape proposal or further adjusted 3D shape proposal satisfies the preset condition, determine the 3D shape proposal or further adjusted 3D shape proposal as the 3D shape of the object.
  9. The system of claim 8, wherein the score of the 3D shape proposal is calculated based on at least one of a number of points of the first LiDAR point cloud image inside the 3D shape proposal, a number of points of the first LiDAR point cloud image outside the 3D shape proposal, or distances between points and the 3D shape.
  10. The system of claim 1, wherein the one or more microchips further:
    obtain a first radio detection and ranging (Radar) image around the detection base station;
    identify the one or more objects in the first Radar image;
    determine one or more locations of the one or more objects in the first Radar image;
    generate a 3D shape for each of the one or more objects in the first Radar image;
    generate a second Radar image by marking the one or more objects in the first Radar image based on the locations and the 3D shapes of the one or more objects in the first Radar image; and
    fuse the second Radar image and the second LiDAR point cloud image to generate a compensated image.
  11. The system of claim 1, wherein the one or more microchips further:
    obtain two first LiDAR point cloud images around the base station at two different time frames;
    generate two second LiDAR point cloud images at the two different time  frames based on the two first LiDAR point cloud images; and
    generate a third LiDAR point cloud image at a third time frame based on the two second LiDAR point cloud images by an interpolation method.
  12. The system of claim 1, wherein the one or more microchips further:
    obtain a plurality of first LiDAR point cloud images around the base station at a plurality of different time frames;
    generate a plurality of second LiDAR point cloud images at the plurality of different time frames based on the plurality of first LiDAR point cloud images; and
    generate a video based on the plurality of second LiDAR point cloud images.
  13. A method implemented on a computing device having one or more storage media storing instructions for identifying and positioning one or more objects around a vehicle, and one or more microchips electronically connected to the one or more storage media, the method comprising:
    obtaining a first light detection and ranging (LiDAR) point cloud image around a detection base station;
    identifying one or more objects in the first LiDAR point cloud image;
    determining one or more locations of the one or more objects in the first LiDAR point cloud image;
    generating a 3D shape for each of the one or more objects; and
    generating a second LiDAR point cloud image by marking the one or more objects in the first LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects.
  14. The method of claim 13, further comprising:
    obtaining a first camera image including at least one of the one or more objects;
    identifying at least one target object of the one or more objects in the first camera image and at least one target location of the at least one target object in the first camera image; and
    generating a second camera image by marking the at least one target object in the first camera image based on the at least one target location in the first camera image and the 3D shape of the at least one target object in the second LiDAR point cloud image.
  15. The method of claim 14, wherein the marking the at least one target object in the first camera image further includes:
    obtaining a 2D shape of the at least one target object in the first camera image;
    correlating the second LiDAR point cloud image with the first camera image;
    generating a 3D shape of the at least one target object in the first camera image based on the 2D shape of the at least one target object and the correlation between the second LiDAR point cloud image and the first camera image;
    generating a second camera image by marking the at least one target object in the first camera image based on the identified location in the first camera image and the 3D shape of the at least one target object in the first camera image.
  16. The method of claim 14, wherein the identifying the at least one target object in the first camera image and the location of the at least one target object in the first camera image further includes:
    operating a you only look once (YOLO) network or a Tiny-YOLO network to identify the at least one target object in the first camera image and the location of the at least one target object in the first camera image.
  17. The method of claim 13, wherein the identifying the one or more objects in the first LiDAR point cloud image further includes:
    obtaining coordinates of a plurality of points in the first LiDAR point cloud image, wherein the plurality of points includes uninterested points and remaining points;
    removing the uninterested points from the plurality of points according to the coordinates;
    clustering the remaining points into one or more clusters based on a point cloud clustering algorithm; and
    selecting at least one of the one or more clusters as a target cluster, each of the target cluster corresponding to an object.
  18. The method of claim 13, wherein the generating a 3D shape for each of the one or more objects further includes:
    determining a preliminary 3D shape of the object;
    adjusting at least one of a height, a width, a length, a yaw, or an orientation of the preliminary 3D shape to generate a 3D shape proposal;
    calculating a score of the 3D shape proposal;
    determining whether the score of the 3D shape proposal satisfies a preset condition;
    in response to the determination that the score of the 3D shape proposal does not satisfy a preset condition, further adjusting the 3D shape proposal; and
    in response to the determination that the score of the 3D shape proposal or further adjusted 3D shape proposal satisfies the preset condition, determining the 3D shape proposal or further adjusted 3D shape proposal as the 3D shape of the object.
  19. The method of claim 18, wherein the score of the 3D shape proposal is calculated based on at least one of a number of points of the first LiDAR point cloud image inside the 3D shape proposal, a number of points of the first LiDAR point cloud image outside the 3D shape proposal, or distances between points and the 3D shape.
  20. The method of claim 13, further comprising:
    obtaining a first radio detection and ranging (Radar) image around the detection base station;
    identifying the one or more objects in the first Radar image;
    determining one or more locations of the one or more objects in the first Radar image;
    generating a 3D shape for each of the one or more objects in the first Radar image;
    generating a second Radar image by marking the one or more objects in the first Radar image based on the locations and the 3D shapes of the one or more objects in the first Radar image; and
    fusing the second Radar image and the second LiDAR point cloud image to generate a compensated image.
  21. The method of claim 13, further comprising:
    obtaining two first LiDAR point cloud images around the base station at two different time frames;
    generating two second LiDAR point cloud images at the two different time frames based on the two first LiDAR point cloud images; and
    generating a third LiDAR point cloud image at a third time frame based on the two second LiDAR point cloud images by an interpolation method.
  22. The method of claim 13, further comprising:
    obtaining a plurality of first LiDAR point cloud images around the base station at a plurality of different time frames;
    generating a plurality of second LiDAR point cloud images at the plurality of different time frames based on the plurality of first LiDAR point cloud images; and
    generating a video based on the plurality of second LiDAR point cloud images.
  23. A non-transitory computer readable medium, comprising at least one set of instructions for identifying and positioning one or more objects around a vehicle, wherein when executed by microchips of an electronic terminal, the at least one set of instructions directs the microchips to perform acts of:
    obtaining a first light detection and ranging (LiDAR) point cloud image around a detection base station;
    identifying one or more objects in the first LiDAR point cloud image;
    determining one or more locations of the one or more objects in the first LiDAR point cloud image;
    generating a 3D shape for each of the one or more objects; and
    generating a second LiDAR point cloud image by marking the one or more objects in the first LiDAR point cloud image based on the locations and the 3D shapes of the one or more objects.
CN114494248B (en) * 2022-04-01 2022-08-05 之江实验室 Three-dimensional target detection system and method based on point cloud and images under different visual angles
WO2024025850A1 (en) * 2022-07-26 2024-02-01 Becton, Dickinson And Company System and method for vascular access management
CN115035195B (en) * 2022-08-12 2022-12-09 歌尔股份有限公司 Point cloud coordinate extraction method, device, equipment and storage medium
CN116385431B (en) * 2023-05-29 2023-08-11 中科航迈数控软件(深圳)有限公司 Fault detection method for numerical control machine tool equipment based on combination of infrared thermal imaging and point cloud
CN116913033B (en) * 2023-05-29 2024-04-05 深圳市兴安消防工程有限公司 Fire big data remote detection and early warning system
CN117470249B (en) * 2023-12-27 2024-04-02 湖南睿图智能科技有限公司 Ship anti-collision method and system based on laser point cloud and video image fusion perception

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8996228B1 (en) * 2012-09-05 2015-03-31 Google Inc. Construction zone object detection using light detection and ranging
US9221461B2 (en) * 2012-09-05 2015-12-29 Google Inc. Construction zone detection using a plurality of information sources
JP6682833B2 (en) * 2015-12-04 2020-04-15 トヨタ自動車株式会社 Database construction system for machine learning of object recognition algorithm
US9760806B1 (en) * 2016-05-11 2017-09-12 TCL Research America Inc. Method and system for vision-centric deep-learning-based road situation analysis
CN106371105A (en) * 2016-08-16 2017-02-01 长春理工大学 Vehicle targets recognizing method, apparatus and vehicle using single-line laser radar
US10328934B2 (en) * 2017-03-20 2019-06-25 GM Global Technology Operations LLC Temporal data associations for operating autonomous vehicles

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060222207A1 (en) * 2003-02-13 2006-10-05 Iee International Electronics & Engineering S.A. Device for a motor vehicle used for the three-dimensional detection of a scene inside or outside said motor vehicle
CN102538802A (en) * 2010-12-30 2012-07-04 上海博泰悦臻电子设备制造有限公司 Three-dimensional navigation display method and relevant device thereof
CN103890606A (en) * 2011-10-20 2014-06-25 罗伯特·博世有限公司 Methods and systems for creating maps with radar-optical imaging fusion
CN103578133A (en) * 2012-08-03 2014-02-12 浙江大华技术股份有限公司 Method and device for reconstructing two-dimensional image information in three-dimensional mode
US20170220887A1 (en) * 2016-01-29 2017-08-03 Pointivo, Inc. Systems and methods for extracting information about objects from scene information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3523753A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2023090610A (en) * 2021-12-17 2023-06-29 南京郵電大学 5g indoor smart positioning method fusing triple visual matching and multi-base-station regression
JP7479715B2 (en) 2021-12-17 2024-05-09 南京郵電大学 5G indoor smart positioning method combining triple visual matching and multi-base station regression

Also Published As

Publication number Publication date
CA3028659C (en) 2021-10-12
US20190180467A1 (en) 2019-06-13
AU2017421870A1 (en) 2019-06-27
TW201937399A (en) 2019-09-16
JP2020507137A (en) 2020-03-05
CN110168559A (en) 2019-08-23
CA3028659A1 (en) 2019-06-11
EP3523753A4 (en) 2019-10-23
EP3523753A1 (en) 2019-08-14

Similar Documents

Publication Title
CA3028659C (en) Systems and methods for identifying and positioning objects around a vehicle
US10627521B2 (en) Controlling vehicle sensors based on dynamic objects
US11255958B2 (en) Recognizing radar reflections using velocity information
WO2021041510A1 (en) Estimating in-plane velocity from a radar return of a stationary roadside object
US20240134054A1 (en) Point cloud segmentation using a coherent lidar for autonomous vehicle applications
US11965956B2 (en) Recognizing radar reflections using position information
WO2020176483A1 (en) Recognizing radar reflections using velocity and position information
AU2017421870B2 (en) Systems and methods for identifying and positioning objects around a vehicle
US20240151855A1 (en) Lidar-based object tracking
US20230131721A1 (en) Radar and doppler analysis and concealed object detection
US20230142674A1 (en) Radar data analysis and concealed object detection

Legal Events

Date Code Title Description
ENP Entry into the national phase
Ref document number: 2018569058; Country of ref document: JP; Kind code of ref document: A
ENP Entry into the national phase
Ref document number: 2017916456; Country of ref document: EP; Effective date: 20181227
ENP Entry into the national phase
Ref document number: 2017421870; Country of ref document: AU; Date of ref document: 20171211; Kind code of ref document: A
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 17916456; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE