WO2023249802A1 - Multi-mode visual geometry localization - Google Patents

Multi-mode visual geometry localization

Info

Publication number
WO2023249802A1
Authority
WO
WIPO (PCT)
Prior art keywords
geometry
pose
detections
localization
visual
Prior art date
Application number
PCT/US2023/024065
Other languages
English (en)
Inventor
Kuan-Wei Liu
Mianwei Zhou
Yibo Chen
Original Assignee
Plusai, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/844,650 external-priority patent/US11562501B1/en
Priority claimed from US17/844,655 external-priority patent/US11624616B1/en
Application filed by Plusai, Inc.
Publication of WO2023249802A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30244 Camera pose
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle
    • G06T2207/30256 Lane; Road marking

Definitions

  • the present technology relates to autonomous systems.
  • the present technology relates to visual geometry localization for autonomous systems of vehicles.
  • the determination of pose is fundamental for autonomous systems of vehicles, such as trucks. Accurate determinations of pose for an autonomously driving truck are vital to, for example, path planning and safe navigation. Localization involves, for example, matching objects in an environment in which a truck is driving with features from high definition (HD) maps so that the truck can determine its precise pose in real time.
  • Various embodiments of the present technology can include systems, methods, and non-transitory computer readable media configured to perform operations comprising determining visual geometry detections (e.g., lane line detections) associated with geometry corresponding with a map; aligning the visual geometry detections with the geometry based on transformations associated with selected degrees of freedom; and determining a pose of a vehicle based on alignment of the visual geometry detections with the geometry.
  • the operations further comprise generating a grid map based on the geometry, wherein the grid map includes a grid of cells and the cells are associated with values based on presence or absence of a boundary line in the geometry.
  • the operations further comprise generating a score for the visual geometry detections based on the visual geometry detections overlaid on the grid map.
  • the operations further comprise loading a high definition (HD) map based on a GPS position.
  • the transformations associated with selected degrees of freedom are transformations with respect to pitch, yaw, x-axis, and y-axis, and a transformation with respect to yaw is determined based on the transformation with respect to pitch.
  • the transformation with respect to yaw is based on a median angle difference determined based on alignment of the visual geometry detections with the geometry with respect to pitch.
  • the aligning the visual geometry detections comprises aligning the visual geometry detections based on transformations with respect to pitch and yaw; and subsequent to the aligning the visual geometry detections based on transformations with respect to pitch and yaw, aligning the visual geometry detections based on transformations with respect to the x-axis and y-axis.
  • the aligning the visual geometry detections does not perform transformations with respect to roll and z-axis.
  • the visual geometry detections include detected lane lines and the geometry includes lane boundary lines.
  • Various embodiments of the present technology can include systems, methods, and non-transitory computer readable media configured to perform operations comprising generating a first type of pose and a second type of pose based on visual geometry localization; determining a mode for planning a planning path for a vehicle based on at least one of the first type of pose and the second type of pose; and generating the planning path for the vehicle based on the mode.
  • the first type of pose is a local result pose and the second type of pose is a global result pose.
  • the operations further comprise generating, in a normal mode, a fusion result pose based on the local result pose, wherein the planning path is generated based on the fusion result pose; and determining a difference between the local result pose and the global result pose.
  • the operations further comprise operating in the normal mode based on the difference between the local result pose and the global result pose being within a threshold distance and a variance of the difference being within a threshold variance; or operating in a re-localization mode based on the difference between the local result pose and the global result pose being at least the threshold distance for a threshold period of time.
  • the global result pose is generated based on a global search of a high definition (HD) map; the local result pose is generated based on a local search of a portion of the HD map; the global search restricts the global result pose to a global range associated with a road; and the local search restricts the local result pose to a local range associated with a lane in the road.
  • when the mode is a normal mode, the planning path is generated based on a driving path in an HD map; when the mode is a re-localization mode, the planning path is generated based on lane tracking.
  • FIGURE 1 illustrates an example system, according to some embodiments of the present technology.
  • FIGURES 3A-3B illustrate examples associated with visual geometry localization, according to some embodiments of the present technology.
  • FIGURE 4 illustrates an example method, according to some embodiments of the present technology.
  • FIGURE 6 illustrates an example block diagram associated with multi-mode visual geometry localization, according to some embodiments of the present technology.
  • FIGURES 7A-7B illustrate examples associated with multi-mode visual geometry localization, according to some embodiments of the present technology.
  • FIGURE 8 illustrates an example method, according to some embodiments of the present technology.
  • Autonomous systems of vehicles rely on localization for various functions, such as path planning and safe navigation.
  • an autonomous system can have an accurate determination of where a vehicle is located. Based on the accurate determination of where the vehicle is located, the autonomous system can, for example, plan a path that safely navigates the vehicle through an environment.
  • localization is critical to autonomous system functions that rely on an accurate determination of location.
  • various conventional localization techniques, such as those based on GPS or LiDAR, can fail to provide accurate determinations of location.
  • GPS localization can suffer from effects of noise and drift.
  • the effects of noise and drift can introduce a bias in GPS localization that makes determinations of location based on GPS localization imprecise.
  • GPS localization may not be sufficiently precise for exacting requirements of autonomous driving.
  • when bias arises due to noise and drift, GPS localization can fail to provide a precise determination of which lane of a road a vehicle is located in.
  • LiDAR localization can rely on data-heavy point clouds, which can be difficult to scale up, to determine locations for vehicles. As a result, LiDAR localization can be inefficient while navigating an environment. Further, LiDAR performance can suffer due to environmental factors, such as inclement weather.
  • a precise location of a vehicle can be determined based on visual geometry localization.
  • the visual geometry localization can involve receiving captured image data.
  • Visual geometry detections such as detected lane lines (or lane line detections), can be detected in the captured image data.
  • the detected lane lines can indicate where lane boundaries are in the captured image data.
  • the lane lines detected in the captured image data can be aligned with lane boundary line geometry in a high definition (HD) map. With the detected lane lines in the captured image data aligned with the lane boundary line geometry in the HD map, a precise location of the vehicle can be determined with respect to the associated lane lines.
  • aligning lane lines detected in captured image data with lane boundary line geometry in an HD map can involve a search for rotational (or angle) transformations (e.g., pitch, yaw, roll) and a search for translational transformations (e.g., x-axis, y-axis, z-axis).
  • the search for rotational transformations can be based on certain rotational degrees of freedom (e.g., pitch) instead of all three rotational degrees of freedom (e.g., pitch, yaw, roll).
  • based on transformations associated with certain rotational degrees of freedom (e.g., pitch), other rotational transformations based on other rotational degrees of freedom (e.g., yaw) can be determined.
  • for certain rotational degrees of freedom (e.g., roll), transformations associated with those rotational degrees of freedom are not performed.
  • the search for translational transformations can be based on certain translational degrees of freedom (e.g., x-axis, y-axis) instead of all three translational degrees of freedom (e.g., x-axis, y-axis, z-axis).
  • for certain translational degrees of freedom (e.g., z-axis), transformations associated with those translational degrees of freedom are not performed.
  • lane lines detected in captured image data can be aligned with lane boundary line geometry in an HD map more efficiently than by searching for a transformation involving all six degrees of freedom.
  • an alignment of lane lines with lane boundary line geometry based on decoupled searches for rotational transformations with two degrees of freedom and translational transformations with two degrees of freedom would have a time complexity of O(n^2), whereas an alignment based on a search of six degrees of freedom would have a time complexity of O(n^6).
  • an alignment in accordance with the present technology offers advantages relating to reduced need for computing resources and faster calculations relating to vehicle localization.
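  • As an illustration only (not part of the disclosure), the following sketch counts candidate evaluations for the decoupled search versus a joint six-degree-of-freedom search, assuming a hypothetical resolution of n candidate values per searched degree of freedom:

```python
# Illustrative comparison of search sizes, assuming n candidate values per degree of freedom.
# The decoupled search runs a linear-time rotational stage (pitch, with yaw derived from it)
# followed by a quadratic translational stage (x, y); the joint search covers all six DOF at once.

def decoupled_candidates(n: int) -> int:
    rotational = n          # pitch candidates; yaw is derived, roll is skipped
    translational = n * n   # x and y candidates; z is skipped
    return rotational + translational

def joint_candidates(n: int) -> int:
    return n ** 6           # pitch, yaw, roll, x, y, z searched jointly

if __name__ == "__main__":
    n = 50  # hypothetical resolution per degree of freedom
    print(decoupled_candidates(n))  # 2,550 evaluations
    print(joint_candidates(n))      # 15,625,000,000 evaluations
```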
  • Searches for transformations to align lane lines detected from captured image data with geometry of lane boundary lines in an HD map can be evaluated by determining aligning scores associated with the searches.
  • the lane boundary line geometry of the HD map can be converted to a grid map of values corresponding with the presence of a lane boundary.
  • the detected lane lines transformed in accordance with a search can be overlaid on the grid map to determine an aligning score for the search.
  • Searches for transformations to align detected lane lines with lane boundary line geometry can be compared based on their aligning scores. Based on the comparison, transformations can be applied to the detected lane lines.
  • Visual geometry localization can be performed based on the aligned detected lane lines and the HD map to determine a precise location of a vehicle.
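  • As a minimal sketch (not from the patent text) of how such an aligning score could be computed, assuming the grid map is a 2D integer array in a known frame and the detections are 2D points already transformed by a candidate alignment; the names score_alignment, origin_xy, and cell_size are hypothetical:

```python
import numpy as np

def score_alignment(grid_map: np.ndarray,
                    detections_xy: np.ndarray,
                    origin_xy: np.ndarray,
                    cell_size: float) -> float:
    """Sum grid-map values under transformed lane-line detection points.

    grid_map: 2D array of cell values (e.g., 2 on a boundary line, 1 adjacent, 0 elsewhere).
    detections_xy: (N, 2) array of detection points already rotated/translated into the map frame.
    origin_xy: (2,) world coordinates of grid cell (0, 0).
    cell_size: edge length of a cell in meters.
    """
    cols = np.floor((detections_xy[:, 0] - origin_xy[0]) / cell_size).astype(int)
    rows = np.floor((detections_xy[:, 1] - origin_xy[1]) / cell_size).astype(int)
    inside = (rows >= 0) & (rows < grid_map.shape[0]) & (cols >= 0) & (cols < grid_map.shape[1])
    return float(grid_map[rows[inside], cols[inside]].sum())
```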
  • a vehicle can navigate an environment.
  • An autonomous system on the vehicle can determine where the vehicle is in the environment based on visual geometry localization.
  • the autonomous system can capture image data through various sensors (e.g., cameras) on the vehicle.
  • Lane lines can be detected in the captured image data.
  • the detected lane lines can be aligned to lane boundary line geometry in an HD map of the environment.
  • the detected lane lines can be aligned by performing searches for pitch transformations to apply to the detected lane lines. Based on the pitch transformations, yaw transformations can be determined for the detected lane lines.
  • the pitch transformations and the yaw transformations can be applied to the detected lane lines to rotationally align the detected lane lines with the lane boundary line geometry.
  • the rotationally aligned lane lines can be translationally aligned with the lane boundary line geometry by performing searches for x-axis translations and y-axis translations to apply to the rotationally aligned lane lines.
  • aligning scores can be determined for the searches for transformations to align the detected lane lines with the lane boundary line geometry. Based on the aligning scores, the searches can be compared to determine which transformations to apply to the detected lane lines. For example, the transformations of the search associated with the highest aligning score can be applied to the detected lane lines to align the detected lane lines with the lane boundary line geometry.
  • FIGURE 1 illustrates an example system 100 including a local geometry loader 102, a preloader 104, a visual geometry localization module 108, and a visual geometry localizer 110, according to some embodiments of the present technology.
  • some or all of the functionality performed by the example system 100 may be performed by one or more computing systems implemented in any type of vehicle, such as an autonomous vehicle.
  • some or all of the functionality performed by the example system 100 may be performed by one or more backend computing systems.
  • some or all of the functionality performed by the example system 100 may be performed by one or more computing systems associated with (e.g., carried by) one or more users riding in a vehicle.
  • some or all data processed and/or stored by the example system 100 can be stored in a data store (e.g., local to the example system 100) or other storage system (e.g., cloud storage remote from the example system 100).
  • the components (e.g., modules, elements, etc.) shown in this figure and all figures herein, as well as their described functionality, are exemplary only. Other implementations of the present technology may include additional, fewer, integrated, or different components and related functionality. Some components and related functionality may not be shown or described so as not to obscure relevant details. In various embodiments, one or more of the functionalities described in connection with the example system 100 can be implemented in any suitable combinations.
  • autonomous vehicles can include, for example, a fully autonomous vehicle, a partially autonomous vehicle, a vehicle with driver assistance, or an autonomous capable vehicle.
  • the capabilities of autonomous vehicles can be associated with a classification system or taxonomy having tiered levels of autonomy.
  • a classification system can be specified by, for example, industry standards or governmental guidelines.
  • the levels of autonomy can be considered using a taxonomy such as level 0 (momentary driver assistance), level 1 (driver assistance), level 2 (additional assistance), level 3 (conditional assistance), level 4 (high automation), and level 5 (full automation without any driver intervention).
  • an autonomous vehicle can be capable of operating, in some instances, in at least one of levels 0 through 5.
  • an autonomous capable vehicle may refer to a vehicle that can be operated by a driver manually (that is, without the autonomous capability activated) while being capable of operating in at least one of levels 0 through 5 upon activation of an autonomous mode.
  • the term “driver” may refer to a local operator (e.g., an operator in the vehicle) or a remote operator (e.g., an operator physically remote from and not in the vehicle).
  • the autonomous vehicle may operate solely at a given level (e.g., level 2 additional assistance or level 5 full automation) for at least a period of time or during the entire operating time of the autonomous vehicle.
  • Other classification systems can provide other levels of autonomy characterized by different vehicle capabilities.
  • information associated with an environment can be based on sensor data.
  • the sensor data may be collected by, for example, sensors mounted to a vehicle and/or sensors on computing devices associated with users riding in the vehicle.
  • the sensor data may include data captured by one or more sensors including, for example, optical cameras, LiDAR, radar, infrared cameras, and ultrasound equipment.
  • the sensor data can be obtained from a data store or from sensors associated with a vehicle in real-time (or near real-time).
  • information related to sensor data can be obtained, such as a calendar date, day of week, and time of day during which the sensor data was captured.
  • Such related information may be obtained from an internal clock of a sensor or a computing device, one or more external computing systems (e.g., Network Time Protocol (NTP) servers), or GPS data, to name some examples.
  • the preloader 104 can provide a high definition (HD) map.
  • HD maps are detailed, accurate maps that can be used by autonomous systems to navigate an environment.
  • the HD maps can include details captured by various types of sensors.
  • the HD maps can include map elements such as road shape, road markings, traffic signs, and boundaries, such as lane boundary line geometry.
  • the preloader 104 can provide an HD map, or a portion of the HD map, that corresponds with an environment in which a vehicle is navigating.
  • the preloader 104 can determine which HD map to provide based on a location of the vehicle, a trajectory of the vehicle, or a planned path of the vehicle. For example, a vehicle can be navigating an environment.
  • the preloader 104 can determine or predict, based on a current trajectory and a planned path of the vehicle, a location where the vehicle is likely to be.
  • the preloader 104 can provide an HD map that corresponds with the location. Thus, when the vehicle arrives at the location, the HD map that corresponds with the location is readily available.
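  • A rough sketch of this preloading behavior (hypothetical interfaces, not the patent's implementation): a position predicted from the current pose and trajectory selects which HD map tile to fetch ahead of time.

```python
import numpy as np

def predict_position(position_xy, velocity_xy, horizon_s: float):
    """Constant-velocity prediction of where the vehicle will be after horizon_s seconds."""
    return np.asarray(position_xy, dtype=float) + np.asarray(velocity_xy, dtype=float) * horizon_s

def preload_hd_map_tile(position_xy, velocity_xy, tile_size_m: float, fetch_tile):
    """Fetch the HD map tile covering the predicted position.

    fetch_tile is a hypothetical callback, e.g. fetch_tile(tile_ix, tile_iy) -> map tile.
    """
    predicted = predict_position(position_xy, velocity_xy, horizon_s=5.0)
    tile_ix, tile_iy = (predicted // tile_size_m).astype(int)
    return fetch_tile(tile_ix, tile_iy)
```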
  • the local geometry loader 102 can provide local geometry based on an HD map.
  • the local geometry loader 102 can inherit from the preloader 104 or receive an HD map from the preloader 104.
  • the local geometry loader 102 can provide geometry, such as local geometry, based on the HD map, or a portion of the HD map.
  • the local geometry can include information such as geometry of lane boundary lines.
  • the lane boundary line geometry provides the size, shape, and location of lane boundaries in an environment.
  • the lane boundary line geometry can indicate types of lane boundaries (e.g., solid line, dotted line, dashed line).
  • the local geometry can include other information such as road geometry and road markings.
  • the local geometry can be provided to, for example, the visual geometry localization module 108.
  • Visual geometry localization can be performed based on the local geometry.
  • the local geometry loader 102 can generate local geometry based on an HD map of an environment in which a vehicle is navigating.
  • the environment can include a road segment with three lanes.
  • the local geometry for the environment can include lane boundary line geometry describing the size, shape, and location of lane boundaries associated with the three lanes of the road segment.
  • the lane boundary line geometry can also describe the types of lane boundaries associated with the three lanes of the road segment.
  • the lane boundaries marking the edges of the road segment can be solid lines.
  • the lane boundaries separating the three lanes of the road segment can be dashed lines.
  • the visual geometry localization module 108 can provide position information to the local geometry loader 102.
  • the visual geometry localization module 108 can receive lane boundary line geometry from the local geometry loader 102 based on the position information.
  • the visual geometry localization module 108 can receive detected lane lines (or lane line detections) and a pose based on sensor data from a perception system.
  • the pose from the perception system can be a rough (or approximate) pose and provide sufficient precision to determine a road in which a vehicle is located (e.g., road-level precision).
  • the rough pose can be based on GPS localization.
  • the visual geometry localization module 108 can provide the detected lane lines, the lane boundary line geometry, and the rough pose to the visual geometry localizer 110.
  • the visual geometry localization module 108 can receive a pose from the visual geometry localizer 110.
  • the pose from the visual geometry localizer 110 can be a precise pose and provide sufficient precision to determine a lane in which a vehicle is located (e.g., lane-level precision).
  • the position information provided to the local geometry loader 102 can be based on the precise pose or, in some cases, the rough pose. While discussion provided herein may reference detected lane lines (or lane line detections) and lane boundary line geometry as examples, the present technology can apply to other types of visual geometry detections and local geometry. Many variations are possible.
  • the visual geometry localizer 110 can align detected lane lines with lane boundary line geometry.
  • the visual geometry localizer 110 can receive detected lane lines and a pose from the visual geometry localization module 108 or, in some cases, a perception system.
  • the received pose can be a rough pose and provide sufficient precision to determine a road in which a vehicle is located (e.g., road-level precision).
  • the visual geometry localizer 110 can align the detected lane lines with lane boundary line geometry of an environment associated with the received pose.
  • the visual geometry localizer 110 can align the detected lane lines with the lane boundary line geometry based on a search for rotational transformations (e.g., pitch, yaw, roll) and translational transformations (e.g., x-axis, y-axis, z-axis) associated with fewer than six degrees of freedom.
  • the search for rotational transformations can align the detected lane lines with the lane boundary line geometry with respect to rotation (e.g., angle).
  • the search for translational transformations can align the detected lane lines with the lane boundary line geometry with respect to translation (e.g., offset).
  • the search for rotational transformations can disregard (not consider) roll transformations. Roll transformations can be disregarded based on a flat plane assumption or disregarded as acceptable error within a selected level of tolerance. In the search for rotational transformations, a search for a pitch transformation can be performed.
  • the search for a pitch transformation can determine a pitch transformation that aligns the detected lane lines such that the detected lane lines are parallel with each other.
  • the search for a pitch transformation can determine a pitch transformation such that angle differences between detected lane lines and corresponding lane boundary line geometry are constant with low variance (e.g., the angle differences are within a threshold delta).
  • the pitch transformation aligns the detected lane lines with respect to pitch.
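  • One way the pitch search could be sketched, assuming detected lane lines are 2D point sets and a hypothetical project_with_pitch helper reprojects them onto the ground plane for a candidate pitch; selecting the pitch that minimizes the variance of the projected line angles follows the parallel-lines criterion described above, but the code itself is illustrative only:

```python
import numpy as np

def line_angle(points_xy: np.ndarray) -> float:
    """Heading angle of a polyline, approximated from its endpoints."""
    dx = points_xy[-1, 0] - points_xy[0, 0]
    dy = points_xy[-1, 1] - points_xy[0, 1]
    return float(np.arctan2(dy, dx))

def search_pitch(detected_lines, pitch_candidates, project_with_pitch):
    """Pick the pitch candidate that makes the projected detected lane lines most parallel.

    detected_lines: list of (N_i, 2) point arrays, one per detected lane line.
    project_with_pitch: hypothetical helper mapping (line_points, pitch) -> ground-plane points.
    """
    best_pitch, best_variance = None, np.inf
    for pitch in pitch_candidates:
        angles = [line_angle(project_with_pitch(line, pitch)) for line in detected_lines]
        variance = float(np.var(angles))  # low variance => lines are nearly parallel
        if variance < best_variance:
            best_pitch, best_variance = pitch, variance
    return best_pitch
```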
  • a yaw transformation can be determined based on the pitch transformation. Based on the pitch transformation, the detected lane lines can be aligned with respect to pitch and projected on the lane boundary line geometry.
  • An angle difference between each detected lane line as projected and its corresponding lane boundary line in the lane boundary line geometry can be determined.
  • a yaw transformation can be based on a median angle difference (or a mean angle difference) of the angle differences between the detected lane lines and the lane boundary lines.
  • the median angle difference (or the mean angle difference) can be referred to as a yaw error.
  • the yaw transformation can align the detected lane lines with respect to yaw.
  • a yaw transformation can be determined for each pitch transformation in a search for pitch transformations.
  • a yaw transformation can be determined after a pitch transformation has been determined and applied to detected lane lines. In either case, a search for rotational transformations to align detected lane lines with lane boundary line geometry can be performed with linear time complexity.
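  • Continuing the sketch, the yaw error could be estimated as the median of the per-line angle differences between the pitch-aligned detections and their matched lane boundary lines (hypothetical helper names; illustrative only):

```python
import numpy as np

def estimate_yaw_error(aligned_lines, matched_boundary_lines, line_angle) -> float:
    """Median angle difference between pitch-aligned detections and matched map boundary lines."""
    differences = [line_angle(boundary) - line_angle(detection)
                   for detection, boundary in zip(aligned_lines, matched_boundary_lines)]
    return float(np.median(differences))  # the "yaw error" used as the yaw transformation

def apply_yaw(points_xy: np.ndarray, yaw: float) -> np.ndarray:
    """Rotate 2D points by the yaw correction."""
    c, s = np.cos(yaw), np.sin(yaw)
    rotation = np.array([[c, -s], [s, c]])
    return points_xy @ rotation.T
```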
  • a search for translational transformations can be performed to align the detected lane lines and the lane boundary geometry with respect to translation (e.g., offset).
  • the search for translational transformations can disregard (not consider) the z-axis.
  • the z-axis can be disregarded based on a flat plane assumption or disregarded as acceptable error within a selected level of tolerance.
  • in the search for translational transformations, a search for an x-axis transformation (e.g., horizontal translation or offset) and a search for a y-axis transformation (e.g., vertical translation or offset) can be performed.
  • the search for an x-axis transformation aligns the detected lane lines with the lane boundary geometry with respect to the x-axis.
  • the search for a y-axis transformation aligns the detected lane lines with the lane boundary geometry with respect to the y-axis. Because a search for translational transformations involves searches in two axes (instead of three axes), the search for translational transformations can be performed with quadratic time complexity.
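  • A sketch of the two-axis translational search (quadratic in the number of candidate offsets), reusing the hypothetical score_alignment helper from the grid-map scoring sketch above; the offset range and names are illustrative only:

```python
import numpy as np

def search_xy_offset(rotated_detections_xy, grid_map, origin_xy, cell_size,
                     offsets_m, score_alignment):
    """Grid search over x and y offsets; returns the offset pair with the best aligning score."""
    best_score, best_offset = -np.inf, (0.0, 0.0)
    for dx in offsets_m:                       # e.g., np.arange(-2.0, 2.0, 0.1)
        for dy in offsets_m:
            shifted = rotated_detections_xy + np.array([dx, dy])
            score = score_alignment(grid_map, shifted, origin_xy, cell_size)
            if score > best_score:
                best_score, best_offset = score, (dx, dy)
    return best_offset, best_score
```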
  • FIGURE 2 illustrates an example block diagram 200 associated with aligning detected lane lines (or lane line detections) with lane boundary line geometry, according to some embodiments of the present technology.
  • the aligning of the detected lane lines with the lane boundary line geometry can be performed by, for example, the visual geometry localizer 110 of FIGURE 1. It should be understood that there can be additional, fewer, or alternative steps performed in similar or alternative orders, or in parallel, based on the various features and embodiments discussed herein unless otherwise stated.
  • detected lane lines 202 can be provided for alignment based on certain rotational transformations and translational transformations.
  • the detected lane lines 202 can undergo rotation alignment 204 based on rotational transformations to align the detected lane lines 202 with lane boundary line geometry with respect to rotation (e.g., angle).
  • the rotation alignment 204 can include pitch alignment 206.
  • the pitch alignment 206 can align the detected lane lines 202 with the lane boundary line geometry with respect to pitch based on a search for pitch transformations.
  • the rotation alignment 204 can include yaw alignment 208.
  • the yaw alignment 208 can align the detected lane lines 202 with the lane boundary line geometry with respect to yaw based on a yaw transformation determined based on a pitch transformation applied in the pitch alignment 206.
  • the detected lane lines 202 can undergo translation alignment 210 based on translational transformations to align the detected lane lines 202 with the lane boundary line geometry with respect to translation (e.g., offset).
  • the translation alignment 210 can include x alignment 212.
  • the x alignment 212 can align the detected lane lines 202 with the lane boundary line geometry with respect to an x-axis based on a search for x-axis transformations.
  • the translation alignment 210 can include y alignment 214.
  • the y alignment 214 can align the detected lane lines 202 with the lane boundary line geometry with respect to a y-axis based on a search for y-axis transformations.
  • Aligned detected lane lines 216 can be produced by applying the rotation alignment 204 and the translation alignment 210 to the detected lane lines 202. Accordingly, alignment of detected lane lines 202 with lane boundary line geometry, and ultimate determinations of vehicle localization, in accordance with the present technology can be achieved without rotational transformations and translational transformations in all six degrees of freedom. In some embodiments, transformations associated with certain degrees of freedom (e.g., roll transformations, z-axis transformations) do not need to be performed.
  • roll transformations and z-axis transformations are not performed.
  • alignment of detected lane lines with lane boundary line geometry can be advantageously performed based on a linear time complexity search and a quadratic time complexity search that are more efficient than a search for transformations involving six degrees of freedom, which undesirably would have O(n^6) time complexity.
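  • As an illustration only, the individual sketches above could be composed into a single alignment routine that mirrors the order of FIGURE 2 (rotation alignment first, then translation alignment); the helpers search_pitch, estimate_yaw_error, apply_yaw, search_xy_offset, and score_alignment are the hypothetical ones defined in the earlier sketches, not functions disclosed by the patent.

```python
import numpy as np

def align_detections(detected_lines, boundary_lines, grid_map, origin_xy, cell_size,
                     pitch_candidates, offsets_m, project_with_pitch, line_angle):
    """Pitch -> yaw -> x/y alignment of detected lane lines; roll and z are not searched."""
    # Rotation alignment: search pitch, then derive yaw from the median angle difference.
    pitch = search_pitch(detected_lines, pitch_candidates, project_with_pitch)
    projected = [project_with_pitch(line, pitch) for line in detected_lines]
    yaw = estimate_yaw_error(projected, boundary_lines, line_angle)
    rotated = [apply_yaw(line, yaw) for line in projected]

    # Translation alignment: quadratic search over x/y offsets scored on the grid map.
    stacked = np.vstack(rotated)
    (dx, dy), score = search_xy_offset(stacked, grid_map, origin_xy, cell_size,
                                       offsets_m, score_alignment)
    return {"pitch": pitch, "yaw": yaw, "dx": dx, "dy": dy, "score": score}
```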
  • FIGURE 3A illustrates an example 300 associated with aligning detected lane lines (or lane line detections) with lane boundary line geometry, according to some embodiments of the present technology.
  • the aligning of the detected lane lines with the lane boundary line geometry can be performed by, for example, the visual geometry localizer 110 of FIGURE 1. It should be understood that there can be additional, fewer, or alternative steps performed in similar or alternative orders, or in parallel, based on the various features and embodiments discussed herein unless otherwise stated.
  • detected lane lines 310a can be projected on lane boundary line geometry 312 without alignment. With no alignment 302, the detected lane lines 310a reflect the shape and orientation of the detected lane lines 310a as they are determined from sensor data of a perception system.
  • detected lane lines 310b aligned with respect to pitch can be projected on the lane boundary line geometry 312.
  • the detected lane lines 310b can be the detected lane lines 310a aligned with respect to pitch.
  • with pitch alignment 304, the detected lane lines 310b have been aligned such that they are parallel with each other.
  • the detected lane lines 310b have a consistent angle difference, with low variance, relative to the lane boundary line geometry 312.
  • the visual geometry localizer 110 can generate a grid map based on lane boundary line geometry.
  • the grid map can be used to determine a degree to which detected lane lines align with the lane boundary line geometry and to generate a score for the alignment accordingly.
  • the grid map can include information associated with location of lane boundaries and types of lane boundaries (e.g., dotted line, dashed line, solid line).
  • the grid map can include a grid with a plurality of cells, with each cell in the grid corresponding with a portion of the lane boundary line geometry. The portion can be any suitable selected value of area (e.g., 10 cm²).
  • each cell that corresponds with a portion of the lane boundary line geometry that includes a lane boundary line can be assigned a value of 2. Each cell that is adjacent to a cell that includes a lane boundary line can be assigned a value of 1. Each cell that neither includes nor is adjacent to a lane boundary line can be assigned a value of 0.
  • the grid 356 includes cells assigned a value of 2 that correspond with where the lane boundary line in the section 354 is located.
  • the grid 356 includes cells assigned a value of 1 that are adjacent to the cells that correspond with where the lane boundary line in the section 354 is located.
  • the other cells in the grid 356, which are assigned a value of 0, do not correspond with where the lane boundary line in the section 354 is located and are not adjacent to the cells that correspond with where the lane boundary line in the section 354 is located.
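  • A minimal sketch of how such a grid map could be rasterized from lane boundary line geometry using the 2/1/0 convention of this example; the polyline sampling, cell size, and function names are assumptions:

```python
import numpy as np
from scipy import ndimage

def build_grid_map(boundary_polylines, origin_xy, cell_size, shape):
    """Rasterize lane boundary polylines into a grid: 2 on a boundary line, 1 adjacent, 0 elsewhere."""
    on_line = np.zeros(shape, dtype=bool)
    for polyline in boundary_polylines:              # each polyline: (N, 2) array of map points
        for p0, p1 in zip(polyline[:-1], polyline[1:]):
            # Sample each segment densely enough to touch every crossed cell.
            n = max(2, int(np.linalg.norm(p1 - p0) / (0.5 * cell_size)) + 1)
            for t in np.linspace(0.0, 1.0, n):
                x, y = p0 + t * (p1 - p0)
                row = int((y - origin_xy[1]) // cell_size)
                col = int((x - origin_xy[0]) // cell_size)
                if 0 <= row < shape[0] and 0 <= col < shape[1]:
                    on_line[row, col] = True

    adjacent = ndimage.binary_dilation(on_line) & ~on_line
    grid = np.zeros(shape, dtype=np.int8)
    grid[adjacent] = 1
    grid[on_line] = 2
    return grid
```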
  • Many variations are possible.
  • the optimal pose can be a pose that best aligns with the trajectory, is closest to the trajectory, or is within a threshold proximity of the trajectory.
  • the optimal pose can be determined based on which pose from the set of poses more closely corresponds with an estimated pose determined from the trajectory and the prior pose.
  • a vehicle can be navigating in an environment with a trajectory associated with straight forward longitudinal travel.
  • a pose can be determined for the vehicle while the vehicle is navigating the environment.
  • an estimated pose can be determined based on the pose and the trajectory.
  • a set of poses can be determined based on a set of detected lane lines that most closely match lane boundary line geometry associated with the environment.
  • An optimal pose can be determined from the set of poses based on which pose of the set of poses is closest to the estimated pose. Many variations are possible.
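  • An illustrative (non-authoritative) sketch of selecting an optimal pose from a set of candidate poses by comparing against a pose estimated from the prior pose and trajectory; the dataclass and the constant-velocity propagation are assumptions:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Pose:
    x: float
    y: float
    yaw: float

def estimate_pose(prior: Pose, speed_mps: float, dt_s: float) -> Pose:
    """Propagate the prior pose along its heading, assuming straight forward travel."""
    return Pose(prior.x + speed_mps * dt_s * np.cos(prior.yaw),
                prior.y + speed_mps * dt_s * np.sin(prior.yaw),
                prior.yaw)

def select_optimal_pose(candidates, prior: Pose, speed_mps: float, dt_s: float) -> Pose:
    """Pick the candidate pose closest to the pose estimated from the trajectory."""
    estimated = estimate_pose(prior, speed_mps, dt_s)
    return min(candidates,
               key=lambda p: np.hypot(p.x - estimated.x, p.y - estimated.y))
```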
  • autonomous systems of vehicles rely on localization for various functions, such as path planning and safe navigation.
  • with localization, an autonomous system can accurately determine where a vehicle is located. Based on accurate determination of vehicle location, the autonomous system can, for example, plan a path that safely navigates the vehicle through an environment.
  • visual geometry localization can provide precise localization, allowing the autonomous system to have an accurate determination of where the vehicle is located and safely navigate the vehicle.
  • technological challenges in visual geometry localization can arise in situations where discrepancies exist between a high definition (HD) map of an environment and the environment itself.
  • visual geometry localization may generate erroneous results. For example, a vehicle can navigate a road that was recently repainted to have four lanes instead of three lanes. An HD map of the road may not reflect the recent repainting and indicate that the road has three lanes.
  • the vehicle can capture image data of the road. The captured image data can indicate that the road has four lanes. Accordingly, the captured image data, which indicates the road has four lanes, cannot correctly align with the HD map, which indicates the road has three lanes.
  • Visual geometry localization based on the captured image data and the HD map in this example can generate an incorrect pose for the vehicle because the captured image data and the HD map do not correctly align.
  • visual geometry localization for autonomous systems can be associated with technological challenges in various circumstances.
  • in re-localization mode, there is a consistent difference between a local result pose and a global result pose, and the global result pose is trusted for localization.
  • when the pose of the vehicle is determined based on a fusion result pose (e.g., determined by a localization fusion module) involving the global result pose, path planning can follow a lane tracking path.
  • a pose determined by the local search can be anywhere in the lane.
  • a mode of operation can be determined based on the global result pose and the local result pose.
  • the autonomous system determines a fusion result pose based on a fusion of the local result pose and results from other localization processes.
  • the autonomous system determines a pose of the vehicle based on the fusion result pose.
  • the autonomous system plans a path for the vehicle based on a driving path in the HD map and the pose of the vehicle.
  • operation in normal mode indicates that the HD map is consistent with the environment and can be relied on for visual geometry localization.
  • the mode of operation can switch to re-localization mode when the global result pose and the local result pose are a threshold distance away from each other for a threshold period of time.
  • the autonomous system determines a fusion result pose based on a fusion of the global result pose and results from other localization processes.
  • the autonomous system determines a pose of the vehicle based on the fusion result pose.
  • the autonomous system plans a path for the vehicle based on a lane tracking path. Operation in re-localization mode indicates that the HD map, or the local result pose, cannot be relied upon for visual geometry localization.
  • the mode of operation can switch to normal mode when the global result pose and the fusion result pose are within a threshold distance from each other for a threshold period of time.
  • FIGURE 5 illustrates an example system 500 including a visual geometry localizer 502, a localization fusion 504, and a planner 506, according to some embodiments of the present technology.
  • some or all of the functionality performed by the example system 500 may be performed by one or more computing systems implemented in any type of vehicle, such as an autonomous vehicle as further discussed herein.
  • some or all of the functionality performed by the example system 500 may be performed by one or more backend computing systems.
  • some or all of the functionality performed by the example system 500 may be performed by one or more computing systems associated with (e.g., carried by) one or more users riding in a vehicle.
  • the localization results can be filtered based on concurrence with other localization results. For example, a localization result that is a threshold difference away from other localization results that are within a threshold distance of each other can be filtered or otherwise disregarded.
  • the weighted and filtered localization results can be combined to determine a fusion result pose.
  • the fusion result pose can represent an aggregated localization result based on the localization results from the different localization processes.
  • a pose can be determined based on the fusion result pose. Many variations are possible.
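  • A simplified sketch of the weighting-and-filtering fusion described here, assuming each localization result is a 2D position with a confidence weight; the outlier rule (discard results farther than a threshold from the weighted consensus) and all names are hypothetical:

```python
import numpy as np

def fuse_localization_results(positions_xy, weights, outlier_threshold_m=1.5):
    """Weighted fusion of localization results after discarding outliers.

    positions_xy: (N, 2) array of per-source position estimates.
    weights: (N,) array of per-source confidence weights.
    """
    positions = np.asarray(positions_xy, dtype=float)
    weights = np.asarray(weights, dtype=float)

    consensus = np.average(positions, axis=0, weights=weights)
    distances = np.linalg.norm(positions - consensus, axis=1)
    keep = distances <= outlier_threshold_m      # filter results that disagree with the rest

    if not np.any(keep):                         # fall back to all results if everything is filtered
        keep = np.ones(len(weights), dtype=bool)
    return np.average(positions[keep], axis=0, weights=weights[keep])
```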
  • a visual geometry localizer of a vehicle can perform a global search of an HD map of an environment including a road based on image data captured at the road. If the road has, for example, four lanes, the global search can search all four lanes for a position and an orientation that would allow a camera to capture the captured image data. The global search can generate a global result pose that indicates the vehicle is in, for example, the leftmost lane of the four lanes.
  • the visual geometry localizer 502 can generate a local result pose by visual geometry localization based on a local search of an HD map.
  • the local search of the HD map can be limited to a portion of the HD map, such as one lane of a road represented in the HD map.
  • the portion of the HD map can be based on a current pose or a prior pose of the vehicle.
  • the local search can involve capturing image data of an environment corresponding with the HD map.
  • the portion of the HD map is searched for a position and orientation that would allow a camera to capture the captured image data.
  • the local result pose can be determined based on the position and the orientation.
  • a visual geometry localizer of a vehicle can perform a local search of an HD map of a road based on image data captured at the road.
  • the road can, for example, have three lanes.
  • the local search can, therefore, search the middle lane for a position and an orientation that would allow a camera to capture the captured image data.
  • the local search can generate a local result pose that indicates where the vehicle is in the middle lane.
  • a mode of operation can be determined based on a global result pose and a local result pose generated by the visual geometry localizer 502.
  • normal mode can be a default mode of operation.
  • the visual geometry localizer 502 provides the local result pose to localization fusion 504.
  • the localization fusion 504 determines a fusion result pose based on the local result pose generated by the visual geometry localizer 502 and localization results from other localization processes. Operation in normal mode indicates that an HD map on which the local result pose and the global result pose are based is reliable.
  • a local result pose and a global result pose can be determined to consistently deviate with a stable bias if the local result pose and the global result pose are at least 1.5 meters apart (or half a lane width) for at least 10 seconds. In other implementations, other threshold distances and other threshold periods of time can be used.
  • the visual geometry localizer 502 provides the global result pose to localization fusion 504.
  • the localization fusion 504 determines a fusion result pose based on the global result pose generated by the visual geometry localizer 502 and localization results from other localization processes.
  • Operation in re-localization mode indicates that the local result pose may be incorrect due to, for example, a lack of updates to an HD map on which the local result pose and the global result pose are based.
  • the global result pose and the fusion result pose are compared to determine whether the global result pose and the fusion result pose are converging (e.g., are within a threshold distance of each other for a threshold period of time). If the global result pose and the fusion result pose are converging, then the mode of operation switches to normal mode. If the global result pose and the fusion result pose are not converging (e.g., are a threshold distance away from each other), then the mode of operation remains in re-localization mode. Many variations are possible.
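  • A compact sketch of this mode-switching logic, using the example thresholds of 1.5 meters and 10 seconds; the class structure and update signature are assumptions rather than the patent's implementation:

```python
class LocalizationModeMonitor:
    """Switch between normal and re-localization modes based on pose differences over time."""

    def __init__(self, threshold_m: float = 1.5, threshold_s: float = 10.0):
        self.mode = "normal"
        self.threshold_m = threshold_m
        self.threshold_s = threshold_s
        self._deviation_start = None   # time when local/global deviation was first observed
        self._agreement_start = None   # time when global/fusion agreement was first observed

    def update(self, t_s: float, local_global_diff_m: float, global_fusion_diff_m: float) -> str:
        if self.mode == "normal":
            # A consistent bias between local and global result poses triggers re-localization.
            if local_global_diff_m >= self.threshold_m:
                if self._deviation_start is None:
                    self._deviation_start = t_s
                if t_s - self._deviation_start >= self.threshold_s:
                    self.mode, self._agreement_start = "re-localization", None
            else:
                self._deviation_start = None
        else:
            # Convergence of the global result pose and fusion result pose restores normal mode.
            if global_fusion_diff_m <= self.threshold_m:
                if self._agreement_start is None:
                    self._agreement_start = t_s
                if t_s - self._agreement_start >= self.threshold_s:
                    self.mode, self._deviation_start = "normal", None
            else:
                self._agreement_start = None
        return self.mode
```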
  • the planner 506 can generate a path for a vehicle to follow.
  • the planner 506 can generate a path based on a mode of operation (e.g., normal mode, re-localization mode).
  • the planner 506 can generate a path based on a driving path in an HD map of an environment.
  • the driving path in the HD map can be a stored path that navigates the environment in a preplanned manner.
  • the driving path can, for example, cross lanes in anticipation of an upcoming turn.
  • Operation in normal mode can be based on a determination that information in the HD map is reliable (e.g., the environment has not changed since the HD map was generated) and, accordingly, the driving path in the HD map is safe to follow.
  • the planner 506 can generate the path to follow the driving path while accounting for real-time conditions (e.g., weather, road hazards) and road objects (e.g., other vehicles).
  • a driving path in an HD map can include a preplanned route from a start location to a destination location that abides by various rules and regulations associated with driving a truck (e.g., a heavy truck with a trailer).
  • the driving path can route through certain lanes in accordance with the various rules and regulations.
  • a planner operating in normal mode can generate a path for a truck that follows the driving path. In re-localization mode, the planner 506 can generate a path based on lane tracking.
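  • As a sketch of how a planner might select its reference path from the mode (hypothetical names; the real planner also accounts for real-time conditions and road objects as described above):

```python
def plan_path(mode: str, hd_map_driving_path, tracked_lane_centerline):
    """Choose the reference path for planning based on the localization mode.

    hd_map_driving_path: preplanned path from the HD map (trusted in normal mode).
    tracked_lane_centerline: centerline from perception-based lane tracking
                             (used in re-localization mode, when the HD map may be stale).
    """
    if mode == "normal":
        return hd_map_driving_path
    return tracked_lane_centerline
```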
  • multi-mode visual geometry localization 602 includes two modes of operation: normal mode 604 and re-localization mode 606.
  • visual geometry localizer 608 can provide a local result pose to localization fusion 610.
  • the localization fusion 610 can generate a fusion result pose based on the local result pose and localization results from other localization processes.
  • the visual geometry localizer 608 can provide a local result pose and a global result pose to a difference calculator 612a.
  • the difference calculator 612a can determine a difference or deviation between the local result pose and the global result pose.
  • a determination can be made as to whether the difference between the local result pose and the global result pose is consistently biased 614. If the local result pose and the global result pose are not consistently biased, then operation can remain in normal mode 604. If the local result pose and the global result pose are consistently biased, then operation changes to re-localization mode 606.
  • the multi-mode visual geometry localization changes to re-localization mode at point 760.
  • the fusion result poses 752 are generated based on the global result poses 754 and localization results from other localization processes. From point 760 to point 762, the differences between the global result poses 754 and the fusion result poses 752 gradually reduce. At point 762, the differences between the global result poses 754 and the fusion result poses 752 are within a threshold distance (e.g., within 1.5 m). Assuming the differences between the global result poses 754 and the fusion result poses 752 remain within the threshold distance for a threshold period of time (e.g., 10 s), the multi-mode visual geometry localization can change to normal mode. Many variations are possible.
  • FIGURE 9 illustrates an example of a computer system 900 that may be used to implement one or more of the embodiments of the present technology.
  • the computer system 900 includes sets of instructions 924 for causing the computer system 900 to perform the processes and features discussed herein.
  • the computer system 900 may be connected (e.g., networked) to other machines and/or computer systems. In a networked deployment, the computer system 900 may operate in the capacity of a server or a client machine in a client-server network environment, or as a peer machine in a peer-to-peer (or distributed) network environment.
  • the computer system 900 includes a processor 902 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 904, and a nonvolatile memory 906 (e.g., volatile RAM and non-volatile RAM, respectively), which communicate with each other via a bus 908.
  • the computer system 900 can be a desktop computer, a laptop computer, personal digital assistant (PDA), or mobile phone, for example.
  • while the machine-readable medium 922 is shown in an exemplary embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions.
  • the term “machine-readable medium” shall also be taken to include any medium that is capable of storing, encoding, or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present technology.
  • machine-readable media include, but are not limited to, recordable type media such as volatile and non-volatile memory devices; solid state memories; floppy and other removable disks; hard disk drives; magnetic media; optical disks (e.g., Compact Disk Read-Only Memory (CD-ROMs), Digital Versatile Disks (DVDs)); other similar non-transitory (or transitory), tangible (or non-tangible) storage media; or any type of medium suitable for storing, encoding, or carrying a series of instructions for execution by the computer system 900 to perform any one or more of the processes and features described herein.
  • the executable routines and data may be stored in various places, including, for example, ROM, volatile RAM, non-volatile memory, and/or cache memory. Portions of these routines and/or data may be stored in any one of these storage devices. Further, the routines and data can be obtained from centralized servers or peer-to-peer networks. Different portions of the routines and data can be obtained from different centralized servers and/or peer-to-peer networks at different times and in different communication sessions, or in a same communication session. The routines and data can be obtained in entirety prior to the execution of the applications. Alternatively, portions of the routines and data can be obtained dynamically, just in time, when needed for execution.
  • it is not required that the routines and data be on a machine-readable medium in entirety at a particular instance of time.
  • the embodiments described herein can be implemented using special purpose circuitry, with or without software instructions, such as an Application-Specific Integrated Circuit (ASIC) or a Field-Programmable Gate Array (FPGA).
  • Embodiments can be implemented using hardwired circuitry without software instructions, or in combination with software instructions.
  • the techniques are limited neither to any specific combination of hardware circuitry and software, nor to any particular source for the instructions executed by the data processing system.
  • references in this specification to “one embodiment,” “an embodiment,” “other embodiments,” “another embodiment,” “in various embodiments,” “in an example,” “in one implementation,” or the like means that a particular feature, design, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the technology.
  • the appearances of, for example, the phrases “according to an embodiment,” “in one embodiment,” “in an embodiment,” “in various embodiments,” or “in another embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
  • each of the various elements of the invention and claims may also be achieved in a variety of manners.
  • This technology should be understood to encompass each such variation, be it a variation of an embodiment of any apparatus (or system) embodiment, a method or process embodiment, a computer readable medium embodiment, or even merely a variation of any element of these.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Systems, methods, and non-transitory computer readable media can perform operations comprising: determining visual geometry detections (e.g., lane line detections) associated with geometry corresponding with a map; aligning the visual geometry detections with the geometry based on transformations associated with selected degrees of freedom; and determining a pose of a vehicle based on alignment of the visual geometry detections with the geometry.
PCT/US2023/024065 2022-06-20 2023-05-31 Multi-mode visual geometry localization WO2023249802A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US17/844,650 2022-06-20
US17/844,650 US11562501B1 (en) 2022-06-20 2022-06-20 Multi-mode visual geometry localization
US17/844,655 2022-06-20
US17/844,655 US11624616B1 (en) 2022-06-20 2022-06-20 Multi-mode visual geometry localization

Publications (1)

Publication Number Publication Date
WO2023249802A1 (fr)

Family

ID=89380422

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/024065 WO2023249802A1 (fr) 2022-06-20 2023-05-31 Multi-mode visual geometry localization

Country Status (1)

Country Link
WO (1) WO2023249802A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180188027A1 (en) * 2016-12-30 2018-07-05 DeepMap Inc. Visual odometry and pairwise alignment for high definition map creation
US20210407186A1 (en) * 2020-06-30 2021-12-30 Lyft, Inc. Multi-Collect Fusion


Similar Documents

Publication Publication Date Title
US11320836B2 (en) Algorithm and infrastructure for robust and efficient vehicle localization
US10436595B2 (en) Method and system for updating localization maps of autonomous driving vehicles
US10380890B2 (en) Autonomous vehicle localization based on walsh kernel projection technique
Xia et al. Integrated inertial-LiDAR-based map matching localization for varying environments
  • JP6653381B2 (ja) Map updating method and system based on control feedback of an autonomous vehicle
  • KR101581286B1 (ko) Path following system and method for autonomous driving of an unmanned vehicle
US11562501B1 (en) Multi-mode visual geometry localization
US10429849B2 (en) Non-linear reference line optimization method using piecewise quintic polynomial spiral paths for operating autonomous driving vehicles
  • CN108391429B (zh) Method and system for autonomous vehicle speed following
US20210365038A1 (en) Local sensing based autonomous navigation, and associated systems and methods
  • JP2019505423A (ja) Steering control method and system for an autonomous vehicle using a proportional, integral, and derivative (PID) controller
US11579622B2 (en) Systems and methods for utilizing images to determine the position and orientation of a vehicle
Lee et al. LiDAR odometry survey: recent advancements and remaining challenges
  • JP2020083306A (ja) Vehicle control system based on a predetermined calibration table for operating an autonomous driving vehicle
US20240282124A1 (en) Vehicle localization based on lane templates
US11624616B1 (en) Multi-mode visual geometry localization
  • WO2024155602A1 (fr) Systems and methods for navigating a vehicle by creating a dynamic map based on lane segmentation
  • WO2018180247A1 (fr) Output device, control method, program, and storage medium
  • WO2023249802A1 (fr) Multi-mode visual geometry localization
  • CN115542899A (zh) Method and apparatus for vehicle path tracking, vehicle, electronic device, and medium
  • JP6717132B2 (ja) Vehicle travel control method and vehicle travel control device
US11630199B2 (en) Sensor information fusion device and method thereof
Zhang et al. Multi-Sensor Fusion Localization
  • Lian et al. Traversable map construction and robust localization for unstructured road environments
  • CN117333535A (zh) Landmark-assisted localization method, self-moving device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23827685

Country of ref document: EP

Kind code of ref document: A1