WO2022087014A1 - Systems and methods for producing occupancy maps for robotic devices - Google Patents

Systems and methods for producing occupancy maps for robotic devices

Info

Publication number
WO2022087014A1
Authority
WO
WIPO (PCT)
Prior art keywords
robot
voxels
pixels
voxel
sensor
Application number
PCT/US2021/055677
Other languages
French (fr)
Inventor
Jayram MOORKANIKARA-NAGESWARAN
Siddharth ATRE
Original Assignee
Brain Corporation
Application filed by Brain Corporation
Publication of WO2022087014A1

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01S: RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00: Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88: Lidar systems specially adapted for specific applications
    • G01S17/89: Lidar systems specially adapted for specific applications for mapping or imaging
    • G01S17/93: Lidar systems specially adapted for specific applications for anti-collision purposes
    • G01S17/931: Lidar systems specially adapted for specific applications for anti-collision purposes of land vehicles

Definitions

  • the present application relates generally to robotics, and more specifically to systems and methods for producing occupancy maps for robotic devices.
  • the present disclosure provides, inter alia, systems and methods for producing occupancy maps for robotic devices.
  • the present disclosure is directed at a practical application of data collection and filtering to produce computer-readable maps for navigating robots at an improved rate to enable the robots to more readily perceive and adapt their motions to their environments.
  • robot may generally refer to an autonomous vehicle or object that travels a route, executes a task, or otherwise moves automatically upon executing or processing computer-readable instructions.
  • a method comprises a controller of a robot receiving a point cloud, the point cloud produced based on measurements from at least one sensor of the robot; discretizing the point cloud into a plurality of voxels, each voxel comprising a count; associating points of the point cloud which fall within a same voxel with a single point within the voxel; and ray marching from an origin of the sensor to the single point within each voxel to detect empty space surrounding the robot.
  • the method further comprises the controller producing a computer-readable map of an environment of the robot, the computer-readable map comprising a plurality of pixels, each pixel being identified as occupied or empty space based on the counts of the voxels above each of the plurality of pixels.
  • the method further comprises the controller setting the count of an occupied pixel to zero when its neighboring pixels comprise a cumulative total count less than a threshold value.
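  • As a non-limiting illustration of the projection and neighbor-count filtering summarized above, the following sketch collapses a 3D grid of per-voxel counts into a 2D occupancy map and clears occupied pixels whose neighbors carry too little cumulative count; the use of Python with NumPy, the array layout, and the neighbor threshold value are assumptions for illustration only.

```python
import numpy as np

def project_to_occupancy(voxel_counts: np.ndarray, neighbor_threshold: int = 2) -> np.ndarray:
    """Collapse a 3D grid of per-voxel counts (indexed x, y, z) into a 2D occupancy map.

    A pixel is marked occupied if any voxel in the column above it holds a nonzero
    count; occupied pixels whose neighboring pixels hold a cumulative count below
    neighbor_threshold are then cleared as isolated (likely noisy) detections.
    """
    column_counts = voxel_counts.sum(axis=2)          # total count above each (x, y) pixel
    occupancy = (column_counts > 0).astype(np.int32)  # 1 = occupied, 0 = empty space

    filtered = occupancy.copy()
    rows, cols = occupancy.shape
    for r in range(rows):
        for c in range(cols):
            if occupancy[r, c] == 0:
                continue
            window = column_counts[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
            neighbor_total = window.sum() - column_counts[r, c]  # exclude the pixel itself
            if neighbor_total < neighbor_threshold:
                filtered[r, c] = 0  # spatially isolated occupied pixel, treated as noise
    return filtered
```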
  • points of the same voxel are determined based on adjacency within an image plane of at least one sensor.
  • the magnitude of the distance measurements is less than a distance threshold.
  • At least one sensor includes one of a scanning planar LiDAR, two-dimensional LiDAR, or depth camera.
  • FIG. 1A is a functional block diagram of a robot in accordance with some embodiments of this disclosure.
  • FIG. 1B is a functional block diagram of a controller or processor in accordance with some embodiments of this disclosure.
  • FIG. 2A(i-ii) illustrates a time of flight or LiDAR sensor used to produce a point cloud, according to an exemplary embodiment.
  • FIG. 2B illustrates methods for localizing an origin of a sensor within an environment of a robot, according to an exemplary embodiment.
  • FIG. 3 illustrates an image plane of a sensor configured to produce a point cloud in accordance with some embodiments of the present disclosure.
  • FIG. 4 illustrates a ray marching procedure used to detect empty space surrounding a robot, according to an exemplary embodiment.
  • FIG. 5A-B illustrates an adjacency filtering procedure used to accelerate the ray marching procedure, according to an exemplary embodiment.
  • FIG. 5C is a process flow diagram illustrating a method for performing the adjacency filtering procedure, according to an exemplary embodiment.
  • FIG. 6A-B illustrates a robot utilizing adjacency filtering to accelerate the ray marching procedure, according to an exemplary embodiment.
  • FIG. 6C illustrates a sensor collecting a point cloud representing an object, according to an exemplary embodiment.
  • FIG. 6D illustrates the point cloud collected in FIG. 6C being utilized for the ray marching procedure subsequent to performance of the adjacency filtering procedure, according to an exemplary embodiment.
  • FIG. 7A illustrates a projection of a 3D point cloud onto a 2D plane to produce an occupancy map, according to an exemplary embodiment.
  • FIG. 7B illustrates a projection of a 3D volume of labeled voxels onto a 2D plane to produce an occupancy map, according to an exemplary embodiment.
  • FIG. 8A-B illustrates the removal of a spatially isolated pixel or voxel of an occupancy map, according to an exemplary embodiment.
  • FIG. 9 is a process flow diagram illustrating a method for producing an occupancy map from a point cloud, according to an exemplary embodiment.
  • FIGS. 10A-B illustrate the ray marching procedure for two points and a marking of a count for voxels in discrete 3D space, according to an exemplary embodiment.
  • FIG. 11 illustrates a single instruction multiple data processor to illustrate how adjacency filtering procedure is configured to be executed in parallel using common hardware elements of a robot, according to an exemplary embodiment.
  • robots may utilize computer-readable maps to perceive and map their surrounding environments.
  • the computer-readable maps may additionally be utilized to determine routes or movements of the robots such that the robots avoid objects within their environments. Rapid production of these maps using incoming data from sensors of the robots may be critical for the robots to operate efficiently and avoid collisions.
  • Light detection and ranging (“LiDAR”) sensors, for example, may be utilized to produce point clouds representing the surrounding environment.
  • the point clouds are typically of high resolution (e.g., floating point accuracy), wherein use of point clouds to navigate an environment may be computationally taxing and/or slow.
  • point clouds do not denote any shapes of objects as points of the point clouds do not comprise any dimensionality or volume. Accordingly, use of an occupancy map may enable robots to more readily perceive their surrounding environments and plan their motions accordingly.
  • Occupancy maps may include denotations for occupied regions (i.e., objects) and free space regions, wherein robots may plan their motions within the free space regions to avoid collisions.
  • Ray marching, or ray tracing may enable robots to detect free space surrounding them as described below, however ray marching may be computationally taxing, slow, and include a plurality of redundant calculations. The redundant calculations are increased as (i) point cloud density increases (i.e., based on sensor resolution), and/or (ii) as objects are localized closer to the robot. Accordingly, there is a need in the art for systems and methods for accelerating the ray marching procedure to enable robots to produce occupancy maps using incoming point cloud data at faster speeds.
  • a robot may include mechanical and/or virtual entities configured to carry out a complex series of tasks or actions autonomously.
  • robots may be machines that are guided and/or instructed by computer programs and/or electronic circuitry.
  • robots may include electro-mechanical components that are configured for navigation, where the robot may move from one location to another.
  • Such robots may include autonomous and/or semi-autonomous cars, floor cleaners, rovers, drones, planes, boats, carts, trams, wheelchairs, industrial equipment, stocking machines, mobile platforms, personal transportation devices (e.g., hover boards, scooters, self-balancing vehicles such as manufactured by Segway, etc.), trailer movers, vehicles, and the like.
  • Robots may also include any autonomous and/or semi-autonomous machine for transporting items, people, animals, cargo, freight, objects, luggage, and/or anything desirable from one location to another.
  • network interfaces may include any signal, data, or software interface with a component, network, or process including, without limitation, those of the FireWire (e.g., FW400, FW800, FWS800T, FWS1600, FWS3200, etc.), universal serial bus (“USB”) (e.g., USB 1.X, USB 2.0, USB 3.0, USB Type-C, etc.), Ethernet (e.g., 10/100, 10/100/1000 (Gigabit Ethernet), 10-Gig-E, etc.), multimedia over coax alliance technology (“MoCA”), Coaxsys (e.g., TVNET™), radio frequency tuner (e.g., in-band or OOB, cable modem, etc.), Wi-Fi (802.11), WiMAX (e.g., WiMAX (802.16)), PAN (e.g., PAN/802.15), cellular (e.g., 3G, LTE/LTE-A/TD-LTE/TD
  • processor, microprocessor, and/or digital processor may include any type of digital processing device such as, without limitation, digital signal processors (“DSPs”), reduced instruction set computers (“RISC”), complex instruction set computers (“CISC”) processors, microprocessors, gate arrays (e.g., field programmable gate arrays (“FPGAs”)), programmable logic device (“PLDs”), reconfigurable computer fabrics (“RCFs”), array processors, secure microprocessors, and application-specific integrated circuits (“ASICs”).
  • computer program and/or software may include any sequence of human- or machine-cognizable steps that perform a function.
  • Such computer program and/or software may be rendered in any programming language or environment including, for example, C/C++, C#, Fortran, COBOL, MATLAB™, PASCAL, GO, RUST, SCALA, Python, assembly language, markup languages (e.g., HTML, SGML, XML, VoXML), and the like, as well as object-oriented environments such as the Common Object Request Broker Architecture (“CORBA”), JAVA™ (including J2ME, Java Beans, etc.), Binary Runtime Environment (e.g., “BREW”), and the like.
  • connection, link, and/or wireless link may include a causal link between any two or more entities (whether physical or logical/virtual), which enables information exchange between the entities.
  • computer and/or computing device may include, but are not limited to, personal computers (“PCs”) and minicomputers, whether desktop, laptop, or otherwise, mainframe computers, workstations, servers, personal digital assistants (“PDAs”), handheld computers, embedded computers, programmable logic devices, personal communicators, tablet computers, mobile devices, portable navigation aids, J2ME equipped devices, cellular telephones, smart phones, personal integrated communication or entertainment devices, and/or any other device capable of executing a set of instructions and processing an incoming data signal.
  • the systems and methods of this disclosure at least: (i) accelerate depth and environmental perception of robotic devices that utilize sensors configured to produce point clouds; (ii) enable robots to accelerate the production of occupancy maps when objects are close to the robots, which may be when robots need up-to-date map information the most; (iii) enhance accuracy of occupancy maps of robots by reducing noise; and (iv) take advantage of common hardware elements of robots to accelerate their production of occupancy maps.
  • Other advantages are readily discernible by one having ordinary skill in the art given the contents of the present disclosure.
  • FIG. 1A is a functional block diagram of a robot 102 in accordance with some principles of this disclosure.
  • robot 102 may include controller 118, memory 120, user interface unit 112, sensor units 114, navigation units 106, actuator unit 108, and communications unit 116, as well as other components and subcomponents (e.g., some of which may not be illustrated).
  • Although a specific embodiment is illustrated in FIG. 1A, it is appreciated that the architecture may be varied in certain embodiments as would be readily apparent to one of ordinary skill given the contents of the present disclosure.
  • robot 102 may be representative at least in part of any robot described in this disclosure.
  • Controller 118 may control the various operations performed by robot 102. Controller 118 may include and/or comprise one or more processors (e.g., microprocessors) and other peripherals.
  • Peripherals may include hardware accelerators configured to perform a specific function using hardware elements such as, without limitation, encryption/decryption hardware, algebraic processors (e.g., tensor processing units, quadratic problem solvers, multipliers, etc.), data compressors, encoders, arithmetic logic units (“ALU”), and the like.
  • Such digital processors may be contained on a single unitary integrated circuit die, or distributed across multiple components.
  • Controller 118 may be operatively and/or communicatively coupled to memory 120.
  • Memory 120 may include any type of integrated circuit or other storage device configured to store digital data including, without limitation, read-only memory (“ROM”), random access memory (“RAM”), non-volatile random access memory (“NVRAM”), programmable read-only memory (“PROM”), electrically erasable programmable read-only memory (“EEPROM”), dynamic random-access memory (“DRAM”), Mobile DRAM, synchronous DRAM (“SDRAM”), double data rate SDRAM (“DDR/2 SDRAM”), extended data output (“EDO”) RAM, fast page mode RAM (“FPM”), reduced latency DRAM (“RLDRAM”), static RAM (“SRAM”), flash memory (e.g., NAND/NOR), memristor memory, pseudostatic RAM (“PSRAM”), etc.
  • Memory 120 may provide instructions and data to controller 118.
  • memory 120 may be a non-transitory, computer-readable storage apparatus and/or medium having a plurality of instructions stored thereon, the instructions being executable by a processing apparatus (e.g., controller 118) to operate robot 102.
  • the instructions may be configured to, when executed by the processing apparatus, cause the processing apparatus to perform the various methods, features, and/or functionality described in this disclosure.
  • controller 118 may perform logical and/or arithmetic operations based on program instructions stored within memory 120.
  • the instructions and/or data of memory 120 may be stored in a combination of hardware, some located locally within robot 102, and some located remote from robot 102 (e.g., in a cloud, server, network, etc.).
  • a processor may be internal to or on board robot 102 and/or may be external to robot 102 and be communicatively coupled to controller 118 of robot 102 utilizing communication units 116 wherein the external processor may receive data from robot 102, process the data, and transmit computer-readable instructions back to controller 118.
  • the processor may be on a remote server (not shown).
  • memory 120 may store a library of sensor data.
  • the sensor data may be associated at least in part with objects and/or people.
  • this library may include sensor data related to objects and/or people in different conditions, such as sensor data related to objects and/or people with different compositions (e.g., materials, reflective properties, molecular makeup, etc.), different lighting conditions, angles, sizes, distances, clarity (e.g., blurred, obstructed/occluded, partially off frame, etc.), colors, surroundings, and/or other conditions.
  • the sensor data in the library may be taken by a sensor (e.g., a sensor of sensor units 114 or any other sensor) and/or generated automatically, such as with a computer program that is configured to generate/simulate (e.g., in a virtual world) library sensor data (e.g., which may generate/simulate these library data entirely digitally and/or beginning from actual sensor data) from different lighting conditions, angles, sizes, distances, clarity (e.g., blurred, obstructed/occluded, partially off frame, etc.), colors, surroundings, and/or other conditions.
  • the number of images in the library may depend at least in part on one or more of the amount of available data, the variability of the surrounding environment in which robot 102 operates, the complexity of objects and/or people, the variability in appearance of objects, physical properties of robots, the characteristics of the sensors, and/or the amount of available storage space (e.g., in the library, memory 120, and/or local or remote storage).
  • the library may be stored on a network (e.g., cloud, server, distributed network, etc.) and/or may not be stored completely within memory 120.
  • various robots may be networked so that data captured by individual robots are collectively shared with other robots. In such a fashion, these robots may be configured to learn and/or share sensor data in order to facilitate the ability to readily detect and/or identify errors and/or assist events.
  • operative units 104 may be coupled to controller 118, or any other controller, to perform the various operations described in this disclosure. One, more, or none of the modules in operative units 104 may be included in some embodiments. Throughout this disclosure, reference may be made to various controllers and/or processors.
  • a single controller may serve as the various controllers and/or processors described. In other embodiments different controllers and/or processors may be used, such as controllers and/or processors used particularly for one or more operative units 104. Controller 118 may send and/or receive signals, such as power signals, status signals, data signals, electrical signals, and/or any other desirable signals, including discrete and analog signals to operative units 104.
  • Controller 118 may coordinate and/or manage operative units 104, and/or set timings (e.g., synchronously or asynchronously), turn off/on control power budgets, receive/send network instructions and/or updates, update firmware, send interrogatory signals, receive and/or send statuses, and/or perform any operations for running features of robot 102.
  • operative units 104 may include various units that perform functions for robot 102.
  • operative units 104 include at least navigation units 106, actuator units 108, user interface units 112, sensor units 114, and communication units 116.
  • Operative units 104 may also comprise other units such as specifically configured task units (not shown) that provide the various functionality of robot 102.
  • operative units 104 may be instantiated in software, hardware, or both software and hardware.
  • units of operative units 104 may comprise computer-implemented instructions executed by a controller.
  • units of operative units 104 may comprise hardcoded logic (e.g., ASICs).
  • units of operative units 104 may comprise both computer-implemented instructions executed by a controller and hardcoded logic. Where operative units 104 are implemented in part in software, operative units 104 may include units/modules of code configured to provide one or more functionalities.
  • navigation units 106 may include systems and methods that may computationally construct and update a map of an environment, localize robot 102 (e.g., find its position) in a map, and navigate robot 102 to/from destinations.
  • the mapping may be performed by imposing data obtained in part by sensor units 114 into a computer-readable map representative at least in part of the environment.
  • a map of an environment may be uploaded to robot 102 through user interface units 112, uploaded wirelessly or through wired connection, or taught to robot 102 by a user.
  • navigation units 106 may include components and/or software configured to provide directional instructions for robot 102 to navigate. Navigation units 106 may process maps, routes, and localization information generated by mapping and localization units, data from sensor units 114, and/or other operative units 104.
  • actuator units 108 may include actuators such as electric motors, gas motors, driven magnet systems, solenoid/ratchet systems, piezoelectric systems (e.g., inchworm motors), magnetostrictive elements, gesticulation, and/or any way of driving an actuator known in the art.
  • actuators may actuate the wheels for robot 102 to navigate a route; navigate around obstacles; or repose cameras and sensors.
  • actuator unit 108 may include systems that allow movement of robot 102, such as motorized propulsion.
  • motorized propulsion may move robot 102 in a forward or backward direction, and/or be used at least in part in turning robot 102 (e.g., left, right, and/or any other direction).
  • actuator unit 108 may control if robot 102 is moving or is stopped and/or allow robot 102 to navigate from one location to another location.
  • Actuator unit 108 may also include any system used for actuating, in some cases actuating task units to perform tasks.
  • actuator unit 108 may include driven magnet systems, motors/engines (e.g., electric motors, combustion engines, steam engines, and/or any type of motor/engine known in the art), solenoid/ratchet system, piezoelectric system (e.g., an inchworm motor), magnetostrictive elements, gesticulation, and/or any actuator known in the art.
  • sensor units 114 may comprise systems and/or methods that may detect characteristics within and/or around robot 102.
  • Sensor units 114 may comprise a plurality and/or a combination of sensors.
  • Sensor units 114 may include sensors that are internal to or on board robot 102 or external, and/or have components that are partially internal and/or partially external.
  • sensor units 114 may include one or more exteroceptive sensors, such as sonars, light detection and ranging (“LiDAR”) sensors, radars, lasers, cameras (including video cameras (e.g., red-green-blue (“RGB”) cameras, infrared cameras, three-dimensional (“3D”) cameras, thermal cameras, etc.)), time of flight (“ToF”) cameras, structured light cameras, antennas, motion detectors, microphones, and/or any other sensor known in the art.
  • sensor units 114 may collect raw measurements (e.g., currents, voltages, resistances, gate logic, etc.) and/or transformed measurements (e.g., distances, angles, detected points in obstacles, etc.).
  • measurements may be aggregated and/or summarized.
  • Sensor units 114 may generate data based at least in part on distance or height measurements. Such data may be stored in data structures, such as matrices, arrays, queues, lists, stacks, bags, etc.
  • sensor units 114 may include sensors that may measure internal characteristics of robot 102.
  • sensor units 114 may measure temperature, power levels, statuses, and/or any characteristic of robot 102.
  • sensor units 114 may be configured to determine the odometry of robot 102.
  • sensor units 114 may include proprioceptive sensors, which may comprise sensors such as accelerometers, inertial measurement units (“IMU”), odometers, gyroscopes, speedometers, cameras (e.g. using visual odometry), clock/timer, and the like. Odometry may facilitate autonomous navigation and/or autonomous actions of robot 102.
  • This odometry may include robot 102’s position (e.g., where position may include robot’s location, displacement and/or orientation, and may sometimes be interchangeable with the term pose as used herein) relative to the initial location.
  • Such data may be stored in data structures, such as matrices, arrays, queues, lists, stacks, bags, etc.
  • the data structure of the sensor data may be called an image.
  • sensor units 114 may be in part external to robot 102 and coupled to communications units 116.
  • a security camera within an environment of a robot 102 may provide a controller 118 of the robot 102 with a video feed via wired or wireless communication channel(s).
  • sensor units 114 may include sensors configured to detect the presence of an object at a location such as, for example without limitation, a pressure or motion sensor may be disposed at a shopping cart storage location of a grocery store, wherein the controller 118 of the robot 102 may utilize data from the pressure or motion sensor to determine if the robot 102 should retrieve more shopping carts for customers.
  • user interface units 112 may be configured to enable a user to interact with robot 102.
  • user interface units 112 may include touch panels, buttons, keypads/keyboards, ports (e.g., universal serial bus (“USB”), digital visual interface (“DVI”), Display Port, E-Sata, Firewire, PS/2, Serial, VGA, SCSI, audioport, high-definition multimedia interface (“HDMI”), personal computer memory card international association (“PCMCIA”) ports, memory card ports (e.g., secure digital (“SD”) and miniSD), and/or ports for computer-readable medium), mice, rollerballs, consoles, vibrators, audio transducers, and/or any interface for a user to input and/or receive data and/or commands, whether coupled wirelessly or through wires.
  • User interface units 112 may include a display, such as, without limitation, liquid crystal displays (“LCDs”), light-emitting diode (“LED”) displays, LED LCD displays, in-plane-switching (“IPS”) displays, cathode ray tubes, plasma displays, high definition (“HD”) panels, 4K displays, retina displays, organic LED displays, touchscreens, surfaces, canvases, and/or any displays, televisions, monitors, panels, and/or devices known in the art for visual presentation.
  • user interface units 112 may be positioned on the body of robot 102.
  • user interface units 112 may be positioned away from the body of robot 102 but may be communicatively coupled to robot 102 (e.g., via communication units including transmitters, receivers, and/or transceivers) directly or indirectly (e.g., through a network, server, and/or a cloud).
  • user interface units 112 may include one or more projections of images on a surface (e.g., the floor) proximally located to the robot, e.g., to provide information to the occupant or to people around the robot.
  • the information could be the direction of future movement of the robot, such as an indication of moving forward, left, right, back, at an angle, and/or any other direction. In some cases, such information may utilize arrows, colors, symbols, etc.
  • communications unit 116 may include one or more receivers, transmitters, and/or transceivers. Communications unit 116 may be configured to send/receive a transmission protocol, such as BLUETOOTH®, ZIGBEE®, Wi-Fi, induction wireless data transmission, radio frequencies, radio transmission, radio-frequency identification (“RFID”), near- field communication (“NFC”), infrared, network interfaces, cellular technologies such as 3G (3GPP/3GPP2), high-speed downlink packet access (“HSDPA”), high-speed uplink packet access (“HSUPA”), time division multiple access (“TDMA”), code division multiple access (“CDMA”) (e.g., IS-95A, wideband code division multiple access (“WCDMA”), etc.), frequency hopping spread spectrum (“FHSS”), direct sequence spread spectrum (“DSSS”), global system for mobile communication (“GSM”), Personal Area Network (“PAN”) (e.g., PAN/802.15), worldwide interoperability for microwave access (“WiMA
  • Communications unit 116 may also be configured to send/receive signals utilizing a transmission protocol over wired connections, such as any cable that has a signal line and ground.
  • cables may include Ethernet cables, coaxial cables, Universal Serial Bus (“USB”), FireWire, and/or any connection known in the art.
  • Such protocols may be used by communications unit 116 to communicate to external systems, such as computers, smart phones, tablets, data capture systems, mobile telecommunications networks, clouds, servers, or the like.
  • Communications unit 116 may be configured to send and receive signals comprising numbers, letters, alphanumeric characters, and/or symbols.
  • signals may be encrypted, using algorithms such as 128-bit or 256-bit keys and/or other encryption algorithms complying with standards such as the Advanced Encryption Standard (“AES”), RSA, Data Encryption Standard (“DES”), Triple DES, and the like.
  • Communications unit 116 may be configured to send and receive statuses, commands, and other data/information.
  • communications unit 116 may communicate with a user operator to allow the user to control robot 102.
  • Communications unit 116 may communicate with a server/network (e.g., a network) in order to allow robot 102 to send data, statuses, commands, and other communications to the server.
  • the server may also be communicatively coupled to computer(s) and/or device(s) that may be used to monitor and/or control robot 102 remotely.
  • Communications unit 116 may also receive updates (e.g., firmware or data updates), data, statuses, commands, and other communications from a server for robot 102.
  • operating system 110 may be configured to manage memory 120, controller 118, power supply 122, modules in operative units 104, and/or any software, hardware, and/or features of robot 102.
  • operating system 110 may include device drivers to manage hardware resources for robot 102.
  • power supply 122 may include one or more batteries, including, without limitation, lithium, lithium ion, nickel-cadmium, nickel-metal hydride, nickel-hydrogen, carbon-zinc, silver-oxide, zinc-carbon, zinc-air, mercury oxide, alkaline, or any other type of battery known in the art. Certain batteries may be rechargeable, such as wirelessly (e.g., by resonant circuit and/or a resonant tank circuit) and/or plugging into an external power source. Power supply 122 may also be any supplier of energy, including wall sockets and electronic devices that convert solar, wind, water, nuclear, hydrogen, gasoline, natural gas, fossil fuels, mechanical energy, steam, and/or any power source into electricity.
  • One or more of the units described with respect to FIG. 1A may be integrated onto robot 102, such as in an integrated system.
  • one or more of these units may be part of an attachable module.
  • This module may be attached to an existing apparatus to automate it so that it behaves as a robot.
  • the features described in this disclosure with reference to robot 102 may be instantiated in a module that may be attached to an existing apparatus and/or integrated onto robot 102 in an integrated system.
  • a person having ordinary skill in the art would appreciate from the contents of this disclosure that at least a portion of the features described in this disclosure may also be run remotely, such as in a cloud, network, and/or server.
  • a robot 102, a controller 118, or any other controller, processor, or robot performing a task, operation or transformation illustrated in the figures below comprises a controller executing computer-readable instructions stored on a non-transitory computer-readable storage apparatus, such as memory 120, as would be appreciated by one skilled in the art.
  • the architecture of a processor or processing device 138 is illustrated according to an exemplary embodiment.
  • the processor 138 includes a data bus 128, a receiver 126, a transmitter 134, at least one processor 130, and a memory 132.
  • the receiver 126, the processor 130 and the transmitter 134 all communicate with each other via the data bus 128.
  • the processor 130 is configured to access the memory 132 which stores computer code or computer-readable instructions in order for the processor 130 to execute the specialized algorithms.
  • memory 132 may comprise some, none, different, or all of the features of memory 120 previously illustrated in FIG. 1A.
  • the algorithms executed by the processor 130 are discussed in further detail below.
  • the receiver 126 as shown in FIG. 1B is configured to receive input signals 124.
  • the input signals 124 may comprise signals from a plurality of operative units 104 illustrated in FIG. 1A including, but not limited to, sensor data from sensor units 114, user inputs, motor feedback, external communication signals (e.g., from a remote server), and/or any other signal from an operative unit 104 requiring further processing.
  • the receiver 126 communicates these received signals to the processor 130 via the data bus 128.
  • the data bus 128 is the means of communication between the different components — receiver, processor, and transmitter — in the processor.
  • the processor 130 executes the algorithms, as discussed below, by accessing specialized computer-readable instructions from the memory 132.
  • the memory 132 is a storage medium for storing computer code or instructions.
  • the storage medium may include optical memory (e.g., CD, DVD, HD-DVD, Blu-Ray Disc, etc.), semiconductor memory (e.g., RAM, EPROM, EEPROM, etc.), and/or magnetic memory (e.g., hard-disk drive, floppy-disk drive, tape drive, MRAM, etc.), among others.
  • Storage medium may include volatile, nonvolatile, dynamic, static, read/write, read-only, random-access, sequential-access, location-addressable, file-addressable, and/or content- addressable devices.
  • the processor 130 may communicate output signals to transmitter 134 via data bus 128 as illustrated.
  • the transmitter 134 may be configured to further communicate the output signals to a plurality of operative units 104 illustrated by signal output 136.
  • FIG. 1B may also illustrate an external server architecture configured to effectuate the control of a robotic apparatus from a remote location. That is, the server may also include a data bus, a receiver, a transmitter, a processor, and a memory that stores specialized computer-readable instructions thereon.
  • a controller 118 of a robot 102 may include one or more processors 138 and may further include other peripheral devices used for processing information, such as ASICs, DSPs, proportional-integral-derivative (“PID”) controllers, hardware accelerators (e.g., encryption/decryption hardware), and/or other peripherals (e.g., analog to digital converters) described above in FIG. 1A.
  • peripheral devices are used as a means for intercommunication between the controller 118 and operative units 104 (e.g., digital to analog converters and/or amplifiers for producing actuator signals).
  • the controller 118 executing computer-readable instructions to perform a function may include one or more processors 138 thereof executing computer-readable instructions and, in some instances, the use of any hardware peripherals known within the art.
  • Controller 118 may be illustrative of various processors 138 and peripherals integrated into a single circuit die or distributed to various locations of the robot 102 which receive, process, and output information to/from operative units 104 of the robot 102 to effectuate control of the robot 102 in accordance with instructions stored in a memory 120, 132.
  • controller 118 may include a plurality of processors 138 for performing high-level tasks (e.g., planning a route to avoid obstacles) and processors 138 for performing low-level tasks (e.g., producing actuator signals in accordance with the route).
  • FIG. 2A(i-ii) illustrates a light detection and ranging (“LiDAR”) sensor 202 coupled to a robot 102, which collects distance measurements to an object, such as wall 206, along a measurement plane in accordance with some exemplary embodiments of the present disclosure.
  • LiDAR sensor 202 illustrated in FIG. 2A(i), may be configured to collect distance measurements to the wall 206 by projecting a plurality of beams 208, representing the path traveled by an electromagnetic pulse of energy at discrete angles along a measurement plane, to determine the distance to the wall 206 based on a time of flight (“ToF”) of the beams 208 leaving the LiDAR sensor 202, reflecting off the wall 206, and returning back to the LiDAR sensor 202.
  • the measurement plane of the LiDAR 202 comprises a plane along which the beams 208 are emitted which, for this exemplary embodiment illustrated, is the plane of the page.
  • LiDAR sensor 202 may emit beams 208 across a two-dimensional field of view instead of a one-dimensional planar field of view, wherein the additional dimension may be orthogonal to the plane of the page.
  • Such sensors may be referred to typically as 3-dimensional (scanning) LiDAR or ToF depth cameras.
  • Individual beams 208 of photons may localize a respective point 204 of the wall 206 in a point cloud, the point cloud comprising a plurality of points 204 localized in 2D or 3D space as illustrated in FIG. 2A(ii).
  • the location of the points 204 may be defined about a local origin 210 of the sensor 202 and is based on the ToF of the respective beam 208 and the angle at which the beam 208 was emitted from the sensor 202.
  • Distance 212 to a point 204 may comprise half the time of flight of a photon of a respective beam 208 used to measure the point 204 multiplied by the speed of light, wherein coordinate values (x, y) of each point 204 depend both on distance 212 and an angle at which the respective beam 208 was emitted from the sensor 202.
  • the local origin 210 may comprise a predefined point of the sensor 202 to which all distance measurements are referenced (e.g., location of a detector within the sensor 202, focal point of a lens of sensor 202, etc.). For example, a 5-meter distance measurement to an object corresponds to 5 meters from the local origin 210 to the object.
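  • For illustration only, the time-of-flight relationship described above may be sketched as follows: the one-way distance is half the round-trip time of flight multiplied by the speed of light, and the (x, y) coordinates of a point 204 about the local origin 210 follow from that distance and the emission angle of the beam 208. The function name and beam-angle convention below are assumptions, not part of the disclosure.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def localize_point(time_of_flight_s: float, beam_angle_rad: float) -> tuple:
    """Convert a round-trip time of flight and a beam emission angle into an
    (x, y) point defined about the sensor's local origin (e.g., origin 210)."""
    distance = 0.5 * time_of_flight_s * SPEED_OF_LIGHT  # one-way range, e.g., distance 212
    x = distance * math.cos(beam_angle_rad)
    y = distance * math.sin(beam_angle_rad)
    return (x, y)
```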
  • a laser emitting element of the LiDAR sensor 202 which emits the beams 208 may include a spinning laser, wherein the individual beams 208 illustrated in FIG. 2A(i-ii) may correspond to discrete measurements of the ToF of the laser.
  • sensor 202 may be illustrative of a depth camera or other ToF sensor configured to measure distance, wherein the sensor 202 being a planar LiDAR sensor is not intended to be limiting.
  • Depth cameras may operate similarly to planar LiDAR sensors (i.e., measure distance based on a ToF of beams 208); however, depth cameras may emit beams 208 using a single pulse or flash of electromagnetic energy, rather than sweeping a laser beam across a field of view.
  • Depth cameras may additionally comprise a two-dimensional field of view.
  • sensor 202 may be illustrative of a structured light LiDAR sensor configured to sense distance and shape of an object by projecting a structured pattern onto the object and observing deformations of the pattern.
  • the size of the projected pattern may represent distance to the object and distortions in the pattern may provide information of the shape of the surface of the object.
  • Structured light sensors may emit beams 208 along a plane as illustrated or in a predetermined pattern (e.g., a circle or series of separated parallel lines).
  • FIG. 2B illustrates a robot 102 comprising an origin 216 defined based on a transformation 214 from a world origin 220, according to an exemplary embodiment.
  • World origin 220 may comprise a fixed or stationary point in an environment of the robot 102 which defines a (0,0,0) point within the environment.
  • the transform 214 may represent a matrix of values, which configures a change in coordinates from being centered about the world origin 220 to the origin 216 of the robot 102.
  • the value(s) of transform 214 may be based on a current position of the robot 102 and may change over time as the robot 102 moves, wherein the current position may be determined via navigation units 106 and/or using data from sensor units 114 of the robot 102.
  • the robot 102 may include one or more exteroceptive sensors 202 of sensor units 114, wherein each sensor 202 includes an origin 210.
  • the position of the sensor 202 may be fixed on the robot 102 such that its origin 210 does not move with respect to the robot origin 216 as the robot 102 moves. Measurements from the sensor 202 may include, for example, distance measurements, wherein the distances measured correspond to a distance from the origin 210 of the sensor 202 to one or more objects. Transform 218 may define a coordinate shift from being centered about an origin 210 of the sensor 202 to the origin 216 of the robot 102, or vice versa. Transform 218 may be a fixed value, provided the sensor 202 does not change its position. In some embodiments, sensor 202 may be coupled to one or more actuator units 108 configured to change the position of the sensor 202 on the robot 102 body, wherein the transform 218 may further depend on the current pose of the sensor 202.
  • Controller 118 of the robot 102 may always localize the robot origin 216 with respect to the world origin 220 during navigation, using transform 214 based on the robot 102 motions and position in the environment, and thereby localize sensor origin 210 with respect to the robot origin 216, using a fixed transform 218.
  • the transform 218 is only presumed as a fixed value as it is presumed the sensor 202 will not move on its own, wherein various contemporary calibration methods may be used to update the transform 218 if the sensor 202 changes position. In doing so, the controller 118 may convert locations of points 204 defined with respect to sensor origin 210 to locations defined about either the robot origin 216 or world origin 220.
  • transforms 214, 218 may enable the controller 118 of the robot 102 to translate a 5-m distance measured by the sensor 202 (defined as a 5-m distance between the point 204 and origin 210) into a location of the point 204 with respect to the robot origin 216 (e.g., distance of the point 204 to the robot 102) or world origin 220 (e.g., location of the point 204 in the environment).
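  • A minimal sketch of chaining the two transforms, assuming they are represented as 4x4 homogeneous matrices (a common convention, though the disclosure does not prescribe one) and assuming transform 218 maps sensor-frame coordinates into the robot frame while transform 214 maps robot-frame coordinates into the world frame:

```python
import numpy as np

def sensor_to_robot(point_sensor: np.ndarray, transform_218: np.ndarray) -> np.ndarray:
    """Re-express a point measured about the sensor origin 210 about the robot origin 216."""
    p = np.append(point_sensor, 1.0)      # homogeneous coordinates (x, y, z, 1)
    return (transform_218 @ p)[:3]

def robot_to_world(point_robot: np.ndarray, transform_214: np.ndarray) -> np.ndarray:
    """Re-express a point defined about the robot origin 216 about the world origin 220."""
    p = np.append(point_robot, 1.0)
    return (transform_214 @ p)[:3]

# e.g., point_world = robot_to_world(sensor_to_robot(point_204, T_218), T_214)
```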
  • the position of the sensor 202 on the robot 102 is not intended to be limiting. Rather, sensor 202 may be positioned anywhere on the robot 102 and transform 218 may denote a coordinate transformation from being centered about the robot origin 216 to the sensor origin 210 wherever the sensor origin 210 may be. Further, robot 102 may include two or more sensors 202 in some embodiments, wherein there may be two or more respective transforms 218 which denote the locations of the origins 210 of the two or more sensors 202. Similarly, the relative position of the robot 102 and world origin 220 as illustrated is not intended to be limiting.
  • FIG. 3 illustrates an image plane 302 of a sensor 202 in accordance with some exemplary embodiments of this disclosure.
  • Image plane 302 may comprise a size (i.e., width and height) corresponding to a field of view of a sensor 202.
  • Image plane 302 may comprise a plane upon which a visual scene is projected to produce, for example, images (e.g., RGB images, depth images, etc.).
  • the image plane 302 is analogous to the plane formed by a printed photograph on which a visual scene is depicted.
  • the image plane 302 subtends a solid angle about the origin 210 corresponding to a field of view of the sensor 202, the field of view being illustrated by dashed lines, which denote the edges of the field of view.
  • Image plane 302 may include a plurality of pixels 304. Each pixel 304 may include or be encoded with distance information and, in some instances, color information. If the depth camera 202 is configured to produce colorized depth imagery, each pixel 304 of the plane 302 may include a color value equal to the color of the visual scene as perceived by a point observer at a location of a sensor origin 210 (e.g., using data from color-sensitive sensors such as CCDs and optical filters).
  • the distance information may correspond to a time of flight of a beam 208 emitted from the origin 210, traveling through a pixel 304 (the intersection of the beams 208 with pixels 304 of the image plane 302 being shown with dots in the center of the pixels 304), and reaching an object (not shown), wherein the points 204 may be localized on the surface of the object.
  • the distance measurement may be based on a ToF of a beam 208 emitted at the origin 210, passing through a pixel 304, and reflecting off an object in the visual scene back to the depth camera 202.
  • the distance and color information for each pixel 304 may be stored as a matrix in memory 120 or as an array (e.g., by concatenating rows/columns of distance and color information for each pixel 304) for further processing, as shown in FIG. 3 A-B below.
  • the image plane 302 may instead comprise a one-dimensional (i.e., linear) row of pixels 304.
  • the number of pixels 304 along the row may correspond to the angular resolution of the planar LiDAR sensor.
  • the color value of the pixel 304 may be the color seen by an observer at the origin 210 looking through the “removed/transparent” pixel 304.
  • the depth value may correspond to the distance between the origin 210 to an object as traveled by a beam 208 through the “removed” pixel 304. It is appreciated, following the analogy, that depth cameras 202 may “look” through each pixel contemporaneously by emitting flashes or pulses of beams 208 through each pixel 304.
  • the number of pixels 304 may correspond to the resolution of the depth camera 202.
  • the illustrated resolution is only 8x8 pixels; however, one skilled in the art may appreciate that depth cameras may include higher resolutions such as, for example, 480x480 pixels, 1080x1080 pixels, or larger/smaller resolutions.
  • the resolution of the depth camera 202 is not required to include the same number of pixels along the horizontal (i.e., y) axis as the vertical (i.e., z) axis.
  • Depth imagery may be produced by the sensor emitting a beam 208 through each pixel 304 of the image plane 302 to record a distance measurement associated with each pixel 304, the depth image being represented based on a projection of the visual scene onto the image plane 302 as perceived by an observer at origin 210.
  • Depth imagery may further include color values for each pixel 304 if the sensor 202 is configured to detect color or greyscale representations of color, the color value of a pixel 304 being the color as perceived by a point observer at the origin 210 viewing a visual scene through each pixel 304.
  • the size (in steradians) of the pixels 304 may correspond to a resolution of the resulting depth image and/or sensor.
  • the angular separation θ between two horizontally adjacent beams 208 may be the angular resolution of the depth image, wherein the vertical angular resolution may be of the same or different value.
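  • One illustrative way to recover an (x, y, z) point from a depth-image pixel, assuming each pixel 304 subtends a fixed angular resolution about the origin 210 and the measured depth is the range along the corresponding beam 208 (the spherical model, pixel indexing, and centering below are assumptions of this sketch, not the disclosed method):

```python
import math

def deproject_pixel(u: int, v: int, depth: float,
                    width: int, height: int,
                    h_res_rad: float, v_res_rad: float) -> tuple:
    """Deproject pixel (u, v) of a depth image into an (x, y, z) point about the
    sensor origin, assuming the optical axis passes through the image center and
    each pixel subtends h_res_rad by v_res_rad."""
    azimuth = (u - (width - 1) / 2.0) * h_res_rad       # horizontal beam angle
    elevation = (v - (height - 1) / 2.0) * v_res_rad    # vertical beam angle
    x = depth * math.cos(elevation) * math.cos(azimuth)
    y = depth * math.cos(elevation) * math.sin(azimuth)
    z = depth * math.sin(elevation)
    return (x, y, z)
```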
  • Depth imagery may be utilized to produce a point cloud, or a plurality of localized points in 3-dimensional (“3D”) space, each point comprising no volume and a defined (x, y, z) position.
  • Each point typically comprises non-integer (i.e., non-discrete) values for (x, y, z), such as floating-point values.
  • It may be desirable for a robot 102 to identify objects and localize them accurately within its environment to avoid collisions and/or perform tasks.
  • Robotic devices may utilize one or more computer-readable maps to navigate and perceive their environments, wherein the use of raw point cloud data may be computationally taxing and may be inaccurate because the points do not define volumes or surfaces of objects. Accordingly, point clouds are discretized into voxel space to enable the controller 118 to more readily utilize the point cloud data to perceive its surrounding environment.
  • Voxels may comprise 3D pixels, or non-overlapping rectangular/cubic prisms of space with a defined width, height, and length, wherein the width, height, and length may be of the same or different values.
  • voxels may be other non-overlapping 3- dimensional shapes such as, for example, voxels defined using spherical or cylindrical coordinates.
  • each voxel as shown and described hereinafter may be defined by Cartesian coordinates and comprise cubes which include width, height, and length dimensions of the equivalent values unless specifically stated otherwise.
  • Empty space may include any region or voxel in which no points 204 are localized therein.
  • a method of ray marching, discussed below in FIG. 4, includes identifying a path followed by each beam 208 to a point 204, wherein any voxel that the beam 208 passes through may be presumed to comprise empty space (e.g., as shown by ray 402 in FIG. 4). Accordingly, it is advantageous to identify beams 208 and/or localized points 204 that fall within or occupy a same voxel to reduce the number of ray marching operations performed.
  • the point cloud may be discretized into voxel space by assigning each point 204 to a voxel.
  • Adjacent beams 208 refers to two beams (e.g., 208-1, 208-2) which pass through pixels 304 of the image plane 302 which are directly neighboring each other, or are contiguous, either vertically or horizontally (i.e., share a common edge or side, also referred to as “directly adjacent”). “Diagonally adjacent” pixels or voxels as used herein share a common vertex.
  • Each point 204 may comprise an (x, y, z) location defined using real valued numbers (e.g., floating point values), wherein the point 204 may be associated with a voxel in discretized 3D space.
  • the voxel assigned to each point may follow Equations 1-3 below:
  • parameters (x, y, z) represent real valued (x, y, z) coordinate values of a point 204, which comprise a unit of distance (e.g., centimeters, meters, inches, etc.), wherein the parameters (x, y, z) are defined about the origin 216 of the robot 102, origin 210 of the sensor 202, or origin 220 of the environment of the robot 102, wherein transforms 214, 218 enable the controller 118 to move between respective coordinate systems.
  • Parameters l_x, l_y, and l_z represent the spatial resolution of a voxel along the x, y, and z axes, respectively, and comprise a unit of distance.
  • V_x, V_y, and V_z are integer values corresponding to the x, y, z indices of the voxel assigned to the point 204 and comprise no units.
  • the “floor” function outputs the greatest integer less than or equal to its argument, i.e., it rounds down to an integer. It is appreciated that Equations 1-3 above are purely illustrative of the process of assigning points of a point cloud to discrete voxels in a discretized 3D space, wherein other methods for assigning points 204 to a voxel in discretized 3D space are also considered without limitation.
  • For example, if a voxel space comprises voxels of 1x1x1 cm spatial resolution and a point 204 comprises a location of (3.532, 1.232, 2.960) cm, the point 204 may be assigned to voxel (3, 1, 2). If the resolution were instead 0.5x0.5x0.5 cm, the assigned voxel would be (7, 2, 5), and so forth.
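  • Equations 1-3 are not reproduced in this text; consistent with the parameter definitions and worked example above, the assignment may be sketched as the floor of each coordinate divided by the corresponding spatial resolution (this exact form is an assumption of the sketch):

```python
import math

def assign_voxel(x: float, y: float, z: float,
                 l_x: float, l_y: float, l_z: float) -> tuple:
    """Assign a point 204 at real-valued (x, y, z) to integer voxel indices
    (V_x, V_y, V_z) by flooring each coordinate divided by the voxel resolution."""
    return (math.floor(x / l_x), math.floor(y / l_y), math.floor(z / l_z))

# assign_voxel(3.532, 1.232, 2.960, 1.0, 1.0, 1.0) -> (3, 1, 2)
```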
  • Depth information, and in some instances color information, corresponding to pixels 304 of an image plane 302 may be stored as an array of values in memory 120.
  • the bottom right pixel 304 may comprise pixel (0, 0) and may be the first entry in the array
  • the top left pixel 304 may comprise pixel (M-1, N-1) to be stored as the last entry in the array, M and N representing the length and height, respectively, of the image plane 302 in units of pixels.
  • the array may comprise a two-dimensional array (i.e., a matrix), wherein each row/column of pixels 304 is stored as a row/column entry of the two-dimensional array. Pixels horizontally or vertically adjacent to each other may be stored as adjacent entries in the array.
  • Such an array is illustrated as arrays 500, 512 depicted and discussed further in FIG. 5A-B below.
  • Each point 204 corresponding to a beam 208 which passes through a pixel 304 in the image plane may be localized within a voxel in 3D space as shown by Equations 1-3 above.
  • beams 208 that pass through adjacent pixels may localize points 204 in the same voxel, in adjacent voxels, or in non-adjacent voxels.
  • the array that stores the distance measurements for each pixel 304 of the image plane 302 may further include a voxel number (e.g., voxel #1500) or location of the voxel in which the point 204 falls within (e.g., voxel at (V x , V y , V z ), V x , V y , and V z being integer numbers).
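  • As an illustrative sketch of building such an array (the function names, the dictionary layout, and the back-projection callback are assumptions, not elements of the original disclosure), the depth image may be flattened row by row so that adjacent entries correspond to adjacent pixels 304, each entry storing the depth value and the assigned voxel:

```python
import math

def build_depth_array(depth_image, back_project, resolution_xyz):
    """Flatten a depth image (a list of rows of depth values for pixels 304) into a
    single array whose adjacent entries correspond to horizontally adjacent pixels.
    `back_project(row, col, depth)` is a hypothetical camera model returning the
    (x, y, z) location of the localized point 204 in the chosen coordinate frame."""
    entries = []
    for r, row in enumerate(depth_image):          # rows are concatenated in sequence
        for c, depth in enumerate(row):
            x, y, z = back_project(r, c, depth)
            voxel = tuple(math.floor(v / l) for v, l in zip((x, y, z), resolution_xyz))
            entries.append({"depth": depth, "voxel": voxel})
    return entries
```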
  • FIG. 4 illustrates voxel space in two dimensions, the third dimension omitted for clarity, according to an exemplary embodiment.
  • the voxel space corresponds to a 3D discretization of space, which surrounds a robot 102.
  • the robot 102 may include a sensor 202, comprising a sensor origin 210 located at the illustrated position during acquisition of a scan or distance measurement (e.g., capture of a depth image), wherein the sensor 202 may localize a point 204 within the 3D space based on transforms 214, 218 shown in FIG. 2B.
  • a controller 118 of the robot 102 may localize itself within the environment via transform 214 shown in FIG. 2B and thereby localize the origin 210 of the sensor 202 using a predetermined transform 218. While the sensor origin 210 is at the illustrated position, the sensor 202 may localize a point 204, of many (omitted for clarity), within the illustrated voxel 408 (hatched).
  • the ray marching process includes the controller 118 of the robot 102 calculating a ray 402, which traces a path from the sensor origin 210 to the localized point 204. In doing so, each voxel that the ray 402 passes through may be identified as an “empty space” voxel 406 (shaded). This is analogous to a beam 208 emitted from the sensor 202 not detecting any objects along the path of ray 402, which corresponds to no objects being present along the path. The ray marching may be repeated for each point 204 localized by the sensor 202 for each scan or measurement acquired by the sensor 202.
  • ToF and LiDAR sensors 202 may localize hundreds or thousands of points 204 per scan, wherein performing the ray marching for each point 204 of each scan may utilize a substantial amount of computing resources and time. It is appreciated that a ray 402 extending to another point 204 within the voxel 408 will pass through a substantial majority of the same voxels 406 along its path, thereby redundantly identifying the same voxels 406 as empty space.
  • the ray marching step occupies a substantial portion of runtime when a controller 118 is mapping its environment from 3D point cloud representation to discrete occupancy maps. Further, as the distance between the objects and the sensor decreases, the likelihood that two or more points 204 lie within a same voxel 404 increases, wherein ray marching to two points 204 within a same voxel may result in a plurality of “empty space” voxels 406 being identified as “empty space” multiple times.
  • the systems and methods of the present disclosure are, inter alia, directed towards accelerating the ray marching procedure for producing an occupancy map for robots 102 to enable the robots 102 to map their environments from 3D point clouds to occupancy maps in real time by reducing the number of redundant calculations performed during the ray marching.
  • 3D point clouds are typically discretized to floating point accuracy or other accuracy which may be substantially more than a resolution of a 3D voxel space or 2D pixel space in which robots 102 operate, wherein converting the 3D point clouds to occupancy maps may enable the controller 118 to more readily perceive its environment and plan its trajectory accordingly.
  • FIG. 5A illustrates an array 500 corresponding to an array of depth values of pixels 304 of an image plane of a sensor 202, according to an exemplary embodiment.
  • Depth values from each pixel 304 from a sensor 202 may be represented in an array by concatenating distance measurements (and color value(s), if applicable) of each row of pixels 304 in a series. That is, each entry 502 of the array 500 may represent a point 204 localized by a beam 208 which passes through a pixel 304 of the image plane 302.
  • Entries 502 may include a voxel location (i.e., the (Vx, Vy, Vz) location) within which the point 204 localized by a beam 208 lies, following Equations 1-3 above, and may comprise matrices of values.
  • the entries 502 are denoted Dn, wherein n is an integer number and Dn corresponds to a point 204 measured by a beam 208 adjacent to those of Dn-1 and Dn+1.
  • each of the entries 502 of the array 500 (representative of individual pixels 304 of the image plane 302) may be utilized to filter adjacent duplicates, or two points 204 localized within a same voxel, the two points 204 being localized by beams 208 which pass through adjacent pixels 304 on the image plane 302.
  • an array 504 has been shown comprising a lettered notation for each entry 502 of the array 500, wherein the same letters denote adjacent entries 502 that lie within a same voxel.
  • Array 504 is not a separate array stored in memory 120 and is provided herein for ease of describing the method.
  • distance measurements D1 through D3 may represent three beams 208 which pass through three adjacent pixels 304 of the image plane 302 and lie within a voxel “A” located at (xA, yA, zA).
  • measurements D2 and D x may each localize points 204 within a same voxel, but the two entries are not adjacent/neighboring each other in the arrays 500/504 (i.e., do not correspond to adjacent beams 208 which pass through neighboring pixels 304 of the image plane 302) and are denoted using different letters “A” and “D”.
  • adjacent beams 208 for a given entry Dn are Dn-1 and/or Dn+1, wherein adjacent entries Dn-1 and/or Dn+1 are further grouped together (e.g., as shown by the lettering in array 504) if the corresponding points 204 of Dn-1 and/or Dn+1 lie within the same voxel as the given entry Dn.
  • the controller 118 may parse the array 500 by comparing adjacent entries 502 to each other as shown by operations 506.
  • Operations 506 include the controller 118 comparing a first entry 502 D1 to its adjacent entry 502 D2 and, if the two entries comprise points 204 within a same voxel (i.e., comprise the same letter in array 504), comparing the adjacent entry 502 D2 to its subsequent adjacent entry 502 D3, and so forth.
  • the controller 118 may continue comparing pairs of adjacent entries 502 (i.e., Dn and Dn+1) until the adjacent entries 502 Dn and Dn+1 comprise two points 204 localized within two separate voxels (i.e., comprise a different letter in array 504), as shown by cross 508.
  • Upon the controller 118 comparing a first entry 502 D3 to a second entry 502 D4 adjacent to the first entry 502 and determining the two entries denote points 204 localized within separate voxels, the first entry 502 may be provided to an output array 510 and the second entry 502 may be compared with its adjacent neighboring entry 502. Controller 118 may continue such comparison for each pair of entries 502 of array 500, wherein the final entry (e.g., Dx) does not include a comparison and therefore comprises the final entry in output array 510.
  • entries 502 D3 and D4 of array 500 may localize points 204 within separate voxels, as shown by different lettering in array 504 of A and B. Accordingly, the controller 118 may provide the entry 502 of D3 to the output array 510 and compare D4 with its neighboring entry 502 D5. The prior entries D1 and D2 that lie within the same voxel A may be omitted from the output array 510 such that the output array 510 is of equal or smaller size than the input array 500.
  • the output array 510 is smaller than the input array 500 corresponding to fewer points 204 being utilized by the ray marching process. If distance measurements in entries 502 localize an object substantially close to the sensor 202, the output array 510 may be substantially smaller than the input array 500, since substantially more points 204 may lie within a same voxel. Accordingly, this reduction in the points 204 utilized by the ray marching may enhance a speed at which a robot 102 may update a computer-readable map based on new sensor 202 data in the instance where objects are very close to the robot 102, which may be when the robot 102 must accurately perceive its environment and make decisions quickly to avoid collision.
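  • A minimal sketch of the comparison loop of FIG. 5A, assuming the array-of-entries layout used in the earlier sketch (data values and names are illustrative assumptions):

```python
def adjacency_filter(entries):
    """Operations 506 without partitioning: keep an entry only when the next
    (adjacent) entry localizes its point 204 within a different voxel; the final
    entry has no neighbor to compare against and is always kept."""
    if not entries:
        return []
    output = []
    for current, following in zip(entries, entries[1:]):
        if current["voxel"] != following["voxel"]:
            output.append(current)        # boundary between voxels -> keep
        # same voxel -> adjacent duplicate, omitted from the output
    output.append(entries[-1])            # last entry (e.g., Dx) is always kept
    return output

# Hypothetical voxel assignments mirroring the lettering of array 504 (A, A, A, B, ...):
entries = [{"voxel": v} for v in ["A", "A", "A", "B", "B", "C"]]
print([e["voxel"] for e in adjacency_filter(entries)])  # -> ['A', 'B', 'C']
```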
  • FIG. 5B illustrates an input array 512 comprising a plurality of distance measurements 514, each distance measurement 514 corresponding to a measured distance of a pixel of an image plane 302 of a sensor 202, according to an exemplary embodiment.
  • A LiDAR scan (e.g., 2D or 3D) or a depth image may be represented by a plurality of pixels 304 of an image plane 302, wherein each pixel comprises a depth value.
  • the depth value may localize a point 204 within 3D space.
  • the depth values of the pixels 304 are utilized in Equations 1-3 above to determine a voxel in which the localized point 204 lies.
  • entry 514 of D1 may correspond to a pixel 304 in an upper left corner and entry 514 of D12 may correspond to a pixel 304 in a bottom right corner of a 4x3 pixel depth encoded image (i.e., in an image plane 302 comprising a 4x3 arrangement of pixels 304).
  • the image may be, following the above example, 3x4 pixels, wherein the array 512 may be formed by concatenating columns of pixels 304 of the image plane 302 instead of rows.
  • One skilled in the art may envision a plurality of ways of configuring the array 512 such that adjacent entries 514 correspond to adjacent distance measurements or points 204 measured through vertically or horizontally adjacent pixels 304 of the image plane 302.
  • Entries 514 of the array 512 denoted by Dn represent a depth measurement which localizes a point 204 in 3D space for a beam 208 which passes through an n-th pixel 304 of an image plane 302, the point 204 being within a voxel determined by Equations 1-3 above, wherein parameter n is an integer number.
  • an array 516 which comprises entries 514 which localize adjacent points 204 within a same voxel denoted with a same letter.
  • D1 through D3 may lie within voxel A (e.g., a voxel at location (xA, yA, zA) or an integer voxel number A). That is, array 516 is not illustrative of another array stored in memory 120 separate from array 512 and is intended to illustrate an alternative denotation of adjacent entries 514 in array 512 which lie within a same voxel for ease of description.
  • Controller 118 may partition the input array 512 into at least two partitions P m , each partition P m may comprise a fixed number of array entries 514 (i.e., pixels 304 and their corresponding measurements D n ) wherein parameter m is an integer number.
  • the controller 118 may compare each pair of adjacent entries 514 of the array 512, as shown by operations 506, to determine if the pair localize points 204 within a same voxel.
  • the controller 118 may stop comparisons at the end of each partition such that adjacent entries 514 within two separate partitions P m and P m +i are not compared.
  • the controller 118 may not compare entries 514 of D4 with D5 nor D8 with D9 for the 4-pixel partition size that is illustrated. For each pair of entries 514 Dn and Dn+1, which correspond to two points 204 that lie within the same voxel, the prior of the pair Dn may be removed or ignored and the later Dn+1 may be compared with the next entry 514 Dn+2 of the array 512, provided the next entry 514 Dn+2 falls within the same partition Pm as the prior Dn and Dn+1 entries 514.
  • If the pair of entries localize points 204 within different voxels, the first of the pair (Dn) may be provided to output array 518 and the later Dn+1 compared with the subsequent entry Dn+2, unless Dn+2 is outside of the partition of Dn and Dn+1, in which case both Dn and Dn+1 are provided to the output array 518.
  • Upon the controller 118 reaching an entry 514 at the end of a partition Pm, that entry 514 may be provided to the output array 518.
  • Entries of the output array 518 may be concatenated together, wherein the empty space between each entry is illustrated to show the reduction in size of output array 518 from the input array 512.
  • twelve entries in array 512 are reduced to seven entries in output array 518.
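  • A sketch of the partitioned variant (illustrative names and data; the voxel assignments below are hypothetical and do not reproduce the exact entries of FIG. 5B, so the resulting output size differs from the seven-entry example in the text):

```python
def adjacency_filter_partitioned(entries, partition_size):
    """Partitioned variant of operations 506: comparisons never cross a partition
    boundary, so the last entry of every partition is always kept and partitions
    may later be processed independently (e.g., in parallel)."""
    output = []
    for start in range(0, len(entries), partition_size):
        partition = entries[start:start + partition_size]
        for current, following in zip(partition, partition[1:]):
            if current["voxel"] != following["voxel"]:
                output.append(current)    # voxel boundary within the partition -> keep
        output.append(partition[-1])      # end of partition -> always kept
    return output

# Hypothetical 12-entry array with a partition size of 4:
entries = [{"voxel": v} for v in ["A", "A", "A", "B", "B", "B",
                                  "C", "C", "C", "C", "D", "D"]]
print(len(adjacency_filter_partitioned(entries, 4)))  # -> 6 for this data
```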
  • ray marching may be performed by extending a ray 402 from an origin 210 of the sensor 202 to each point 204 represented by each entry 520 of the output array 518.
  • the controller 118 may produce a ray 402 which extends from the location of the origin 210 of the sensor 202 used to produce the array 500, 512 (i.e., capture a depth image or LiDAR scan) to each point 204 of the output array 510, 518, wherein the location of the origin 210 corresponds to its location during acquisition of the array 500, 514 (i.e., acquisition of a depth image or LiDAR scan).
  • each ray 402 may extend from the origin 210 to each point 204 denoted by entries of the output array 510, 518.
  • the ray 402 may extend from the origin 210 to the center or side of the voxel in which points 204 of the output array 510, 518 are localized.
  • the controller 118 may produce a first ray 402 extending from origin 210 to a center of voxel A or point 204 corresponding to entry D3, a second ray 402 extending from origin 210 to a center of voxel B or to points 204 corresponding to entries D4 and D5, and so forth for each entry 520 of the output array 518.
  • Each voxel 404 in which the rays 402 pass through may be denoted as “empty space” voxels 406 (shaded).
  • voxels comprising points 204 may be denoted as “occupied” voxels 408 (hashed).
  • the denoting of occupied voxels 408 may be performed subsequent to the identification of the empty space voxels 406 if the ray marching process is performed in parallel, for reasons discussed below in regard to FIGs. 10-11.
  • Voxels in which the rays 402 never pass through and which comprise no points 204 therein may be denoted as “unknown” voxels (white) representing voxels of which sensor data does not provide localization information of any object, or lack thereof, within the voxels.
  • a ray 402 may pass through an occupied voxel 408.
  • the occupied voxel 408 may comprise a point 204 localized therein during a previous scan or acquisition of a depth image at a time t-1.
  • the same voxel 408 may not include a point 204 localized therein (e.g., an object may have moved).
  • the voxel 408 denoted as occupied at time t-1 may lie within the path of a ray 402 which extends from the sensor origin 210 to a point 204 localized during the current time step at time t.
  • the formerly occupied voxel 408 is replaced with an empty space voxel 406. This may be referred to as “clearing” of the occupied voxel 408. If no ray 402 passes through the occupied voxel 408, the voxel remains designated as occupied until cleared by a ray 402, based on a later scan at time t or later. Stated differently, a voxel comprising a point 204 therein may be denoted as “occupied” unless and until a future ray 402 extends through the voxel, thereby “clearing” the presence of the object. Essentially the robot 102 will presume an object to be present until it “sees through” the object (e.g., due to the object no longer being present) via ray marching through the occupied pixels.
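  • A minimal ray-marching sketch is shown below, assuming cubic voxels of a single scalar size and a uniform sampling step (illustrative simplifications; exact voxel-traversal algorithms such as Amanatides-Woo are commonly used in practice). Marking occupied voxels only after all rays are processed mirrors the ordering used for parallel execution discussed with FIGs. 10-11, and previously occupied voxels are cleared only when a ray passes through them:

```python
import math

def ray_march(origin, targets, voxel_size, occupancy, step=None):
    """Mark voxels traversed by rays 402 from the sensor origin 210 to each target
    point 204 as "empty", then mark the voxels containing the targets as "occupied".
    `occupancy` is a dict mapping integer voxel indices to "empty"/"occupied"."""
    step = step if step is not None else 0.5 * voxel_size
    to_voxel = lambda p: tuple(math.floor(c / voxel_size) for c in p)

    for target in targets:                                   # phase 1: clear empty space
        direction = [t - o for t, o in zip(target, origin)]
        length = math.sqrt(sum(c * c for c in direction))
        n_steps = max(1, int(length / step))
        for k in range(n_steps):
            sample = [o + c * k / n_steps for o, c in zip(origin, direction)]
            voxel = to_voxel(sample)
            if voxel != to_voxel(target):
                occupancy[voxel] = "empty"                   # also clears stale occupied voxels

    for target in targets:                                   # phase 2: mark occupied last
        occupancy[to_voxel(target)] = "occupied"
    return occupancy
```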
  • One advantage of the partitioning method shown in FIG. 5B is that more data (i.e., more points 204) are preserved, wherein the amount of data preserved may depend on a size of each partition P m .
  • Operators of robots 102 may tune the partition size based on (i) computing capabilities of the controller 118, (ii) time spent performing the ray marching, (iii) resolution of the sensor 202 used to produce the point cloud, and (iv) desired accuracy of a resulting occupancy map.
  • Partition sizes of 1 entry 514 may yield the most accurate occupancy map but at the cost of substantially increased computation time during the ray marching because more rays 402 must be simulated and more calculations performed to determine empty space voxels 406, wherein a plurality of empty space voxels 406 may be identified as empty space multiple times.
  • a second advantage is that partitioning of the array 512 may allow for rapid parallel computation of operations 506 using vectorized algorithms and/or hardware, as discussed in more detail below in regard to FIG. 11.
  • points 204 localized beyond a threshold distance from the sensor origin 210 may be ignored for ray marching, as objects/free space far from the robot 102 may not be impactful in its path planning decisions. Further, performing ray marching for points 204 far from the robot 102 yields (i) a decreased likelihood that two adjacent beams 208 localize points 204 within the same voxel, since the spatial separation between adjacent beams grows with distance, and (ii) increased calculation time due to longer rays 402, for little marginal gain in path planning/execution by the robot 102.
  • FIG. 5C is a process flow diagram illustrating a method 522 for a controller 118 of a robot 102 to perform adjacency filtering as used herein, according to an exemplary embodiment. Steps of method 522 may be effectuated by the controller 118 executing computer-readable instructions from memory 120.
  • Block 524 includes the controller 118 receiving a point cloud from a depth image or a LiDAR scan.
  • the point cloud may include a plurality of points 204 localized in 3D space using, for example, floating point numbers or other non-discretized values.
  • the point cloud may be produced by a singular depth image from a depth camera or a singular scan across a measurement plane of a planar LiDAR sensor (e.g., 202).
  • Block 526 includes the controller 118 discretizing the point cloud into voxels and assigning each point 204 of the point cloud to a voxel.
  • the voxel assigned to each point 204 corresponds to the location of the point 204 in 3D discretized space. Equations 1-3 above illustrate how the points 204 may be translated from (x, y, z) distance measurements, in units of distance, to a voxel (x v , y v , z v ) location, with no units.
  • the assigned voxel may be denoted by its position (x v , y v , z v ) in 3D space, wherein (x v , y v , z v ) are integer numbers. In some embodiments, the assigned voxel may correspond to a number (e.g., voxel #10250).
  • Block 528 includes the controller producing an array 512, each entry 514 of the array 512 may include the assigned voxel of each point 204 of the point cloud.
  • Adjacent entries 514 of the array 512 represent points 204 measured by distance measurements (i.e., beams 208) which are adjacent to each other in the image plane 302, as shown by beams 208-1 and 208-2 in FIG. 3 for example.
  • the array 512 may comprise a concatenation of rows or columns of the pixels 304, and their respective distance measurements and assigned voxel, as viewed on the image plane 302.
  • the first eight (8) entries of the array may include the distance measurements and the assigned voxels of the points 204 localized by beams 208 which pass through the eight (8) pixels 304 of the top row of the image plane 302; the next eight (8) entries of the array may include the same for the second row of pixels 304; and so forth, wherein the last eight (8) entries may include the same for the bottom row of pixels 304.
  • the array 512 may include 64 entries 514 for the 8x8 pixel image plane 302 depicted in FIG. 3.
  • the image plane 302 may include more or fewer pixels 304 in other embodiments.
  • block 528 further denotes that the array may include N entries 514 representing N pixels 304 of the image plane 302 and N distance measurements for the pixels 304, wherein N may be any integer number.
  • the array 512 may be partitioned into at least one partition.
  • the image plane may include a number of pixels 304 equal to an integer multiple of the partition size, wherein partition size corresponds to a number of entries 514 within each partition.
  • Block 530 includes the controller 118 setting a value for a parameter n equal to one.
  • Parameter n may be a value stored in memory 120 and manipulated by the controller 118 executing computer-readable instructions or may be purely illustrative for the purpose of explaining the remaining blocks 532-540.
  • Parameter n may correspond to the entry 514 number within the array 512 of which the controller 118 performs the operations as described in the following blocks, wherein the value of n may be incremented in block 540 following method 522.
  • Block 532 includes the controller 118 comparing n to N representative of the controller 118 determining if it has reached the end of the array 512.
  • For example, N may be equal to 12, wherein upon the controller 118 reaching entry 514 D12 (i.e., n = 12) the controller 118 does not have an adjacent entry 514 in the array 512 to compare D12 with. Stated differently, n is a counter, set initially to one and incremented by one integer as the process flow iterates through blocks 532-540, and N is 12, the number of entries 514 in the array. When n reaches 12, it equals N (block 532) and the process flow moves to block 542.
  • Block 534 includes the controller determining if the entry 514 Dn is within a same partition Pm as the adjacent entry 514 Dn+1. If the subsequent entry 514 Dn+1 is within a different partition than the entry 514 Dn, then the controller 118 may move to block 538. If the subsequent entry 514 Dn+1 is within the same partition as entry Dn, the controller 118 proceeds to block 536.
  • Block 536 includes the controller 118 determining if the entry 514 Dn and the subsequent entry 514 Dn+1 represent points 204 which lie within the same voxel of the discretized point cloud.
  • the entries 514 of the array 512 may include the voxel in which the respective point 204 corresponding to each entry 514 lies.
  • the controller 118 may determine which voxels the points 204 corresponding to the entries 514 Dn and Dn+1 lie within, following Equations 1-3 above, and subsequently compare the voxels to determine if the points 204 lie within the same voxel.
  • If the controller 118 determines the points 204 of the entries 514 Dn and Dn+1 lie within the same voxel, the controller 118 moves to block 540. If the points 204 of the entries 514 Dn and Dn+1 lie within different voxels, the controller 118 moves to block 538.
  • the comparisons performed in blocks 534-536 may be skipped if the distance measurement associated with the entry D n is greater than a threshold value.
  • the threshold value may correspond to a distance away from the robot 102 beyond which it is highly unlikely that two or more points 204 fall within a same voxel. Because beams 208 extend radially away from the image plane 302/origin 210, the spatial distance between two adjacent beams increases with range while their angular separation remains constant, corresponding to a decreasing likelihood that two adjacent points 204 fall within a same voxel as distance increases.
  • the threshold distance may be 1 meter, 5 meters, 10 meters, etc.
  • the controller 118 may first compare the distance value of the entry 514 D n to the threshold distance and, if the entry 514 comprises a distance greater than the threshold, the controller 118 may jump to block 538.
  • the loop created by blocks 532-540 may illustrate the process of adjacency filtering, as shown illustratively by operations 506 in FIG. 5A-B above, wherein the controller 118 may compare adjacent entries 514 within the array 512 and produce an output array 518 which includes only entries 520 representing points 204 of the original point cloud (received in block 524) which lie within unique voxels relative to adjacent points 204. Adjacency is with respect to adjacent pixels 304 of the image plane 302 of the sensor 202 used to produce the point cloud.
  • Block 542 includes the controller 118 performing the ray marching procedure.
  • the ray marching procedure includes the controller 118 extending one or more rays 402 between an origin 210 of the sensor used to produce the point cloud and the voxels or points of the output array 518.
  • the ray 402 may extend between the origin 210 and the points 204 corresponding to entries 520 of the output array 518.
  • the rays 402 may extend from the origin 210 to the center of the voxels corresponding to entries 520 of the output array 518.
  • the rays 402 may extend from the origin 210 to the nearest or furthest border of the voxels corresponding to entries 520 of the output array 518.
  • Each voxel that the rays 402 pass through without reaching a point 204 may be denoted as “empty space,” “unoccupied,” “free space,” or equivalent, used interchangeably herein.
  • the controller 118 may mark each voxel of the output array 518 which comprises a point 204 localized therein as an “occupied” voxel 408.
  • The marking (i.e., denoting of occupied voxels 408) may be performed subsequent to the identification of the empty space voxels if the ray marching is performed in parallel, for reasons discussed in regard to FIGs. 10-11.
  • an adjacency filter may refer to the operations 506 performed in FIG. 5A-B and/or blocks 528-540 of FIG. 5C. That is, an adjacency filter may include a controller 118 producing an array (e.g., 500, 512) representing depth measurements through an image plane 302, partitioning the array into a plurality of sections or partitions, and comparing adjacent or neighboring entries within each partition to determine if the neighboring entries localize points 204 within a same voxel to produce an output filtered array (e.g., 510, 518) comprising points 204 within unique voxels.
  • the output array may comprise a plurality of coordinate locations for points 204 within separate voxels from other points 204 localized by distance measurements of adjacent pixels, although some points 204/entries of the output array may lie within the same voxel if the entries are within separate partitions or are not adjacent/neighboring in the input array.
  • the output array may include the voxel location of the respective points 204 corresponding to the distance measurements, wherein the controller 118 may translate the distance measurements into coordinate locations based on the position of the sensor.
  • FIG. 6A illustrates a robot 102 performing a ray marching procedure to identify voxels 406 corresponding to empty space without the adjacency filtering shown in FIG. 5A-C, according to an exemplary embodiment.
  • two points 204 may be localized for a scan from a sensor 202, wherein an origin 210 of the sensor 202 is illustrated at its location when the scan was captured. As shown, two adjacent points 204 may fall within the same voxel 602. Voxel 602 may be considered “occupied” and is illustrated with a hatched fill to denote the occupied state. Without adjacency filtering, two rays 402 are required to detect empty space voxels 406 between the origin 210 and both points 204.
  • both rays 402 pass through substantially the same voxels 406, wherein performing the ray marching twice yields little to no new information as to unoccupied or free space voxels 406.
  • in some instances, one or more additional voxels may be identified as empty space 406 by the second ray 402 if the two points 204 are far from the sensor origin 210.
  • In FIG. 6B, a robot 102 is performing the ray marching procedure after adjacency filtering of points 204 that lie within a same voxel 602, according to an exemplary embodiment.
  • only one point 204 may remain within the voxel 602.
  • only one ray 402 is utilized to perform the ray marching.
  • a substantial amount of redundant calculations have been omitted due to the removal of adjacent points 204 which lie within a single voxel 602.
  • an adjacency filter may cause one or more voxels 406 identified as empty space in FIG. 6A to no longer encompass a ray 402 and thereby no longer be identified as empty space but as unknown space, as shown by a voxel 604 outlined by dashed lines. It is appreciated that some information may be lost, resulting in reduced accuracy of an occupancy map, for the advantage of a substantial reduction in time in performing the ray marching procedure by removing a plurality of redundant calculations.
  • Point clouds produced by depth cameras and/or LiDAR sensors 202 may be substantially dense when sensing objects nearby the robot 102 (e.g., 5 or more points 204 within a single voxel), wherein performing a plurality of redundant calculations may increase a time in which the controller 118 may map its environment and make decisions as to how the robot 102 should move in response to its environment.
  • If a point 204 were located within the voxel 406 directly to the left of voxel 602, for example, a ray 402 may pass through the voxel 604 such that the voxel 604 may be denoted as “empty space” rather than “unknown.”
  • reducing the mapping time may be critical for robot 102 to operate effectively and safely (i.e., without collisions).
  • partitions may configure a maximum amount of potential data loss (i.e., points 204 filtered) which may be tuned by operators of robots 102 based on (i) parameters of the sensor 202 (e.g., resolution), (ii) navigation capabilities/precision of the robot 102, (iii) computing resources available to the controller 118, and in some instances (iv) the environment in which the robot 102 operates, allowing for the adjacency filter to be applied to a plurality of different sensors and/or robotic devices.
  • FIG. 6C-D shows another illustration which depicts the advantages of adjacency filtering shown in FIG. 5A-C, according to an exemplary embodiment.
  • a sensor 202 (not shown) comprising an origin 210 at the illustrated location may localize a plurality of points 204 of a surface of an object 606.
  • the object 606 may include any object within an environment of a robot 102.
  • the depicted embodiment may comprise a 2-dimensional slice of a depth image viewed from above or a planar scan from a planar LiDAR sensor.
  • the slice may comprise a row of pixels 304 (not shown) on the image plane 302, wherein points 204 localized by beams 208 which pass through the row of pixels 304 are illustrated (the remaining points 204 omitted for clarity).
  • the row may form a portion of an input array 512 which may be partitioned into a plurality of sections as shown by lines 608.
  • Each line 608 may bound a specified number of pixels 304 of the image plane 302 corresponding to a size of each partition P m in pixels.
  • each partition P m may represent 2, 5, 10, 50, etc. pixels 304 of a row of the image plane 302.
  • Due to object 606 being substantially close to the sensor origin 210, the plurality of points 204 may be densely clustered along the surface of the object 606. Accordingly, each point 204 and its adjacent point 204 on the image plane 302 may be compared using the adjacency filtering described in FIG. 5A-C above to filter points 204 which may lie within a same voxel.
  • voxels 404 may include one point 204, wherein the remaining points 204 within the voxels 404 have been removed (i.e., filtered). Some voxels 404 may include two points 204 therein, the two points 204 being included in two different partitions.
  • the partitions separate groups of pixels 304 of the 2D image plane 302 and do not separate groups of voxels 404 within 3D space, wherein the number of voxels 404 encompassed by each partition comprises an angular dependence (i.e., angular resolution of the image plane 302 multiplied by the number of pixels 304 within each partition) and distance dependence (i.e., distance from the origin 210 to the object 606). That is, the dashed lines 608 representing a size of each partition may pass through an integer number of pixels 304 of the image plane 302, wherein the lines 608 may extend anywhere within the voxels 404 and are shown for visual clarity only.
  • Each of the remaining points 204 may be utilized for ray marching, wherein rays 402 are extended from the origin 210 of the sensor 202.
  • the voxels 404 within which rays 402 pass through are denoted as empty space as shown by a grey shaded region.
  • the voxels 404 which include the points 204 are denoted as “occupied” voxels 408.
  • FIG. 7A illustrates a projection of a point cloud 700 onto a two-dimensional (“2D”) grid 702 for use in producing an occupancy map, according to an exemplary embodiment.
  • Occupancy maps as discussed above, comprise a plurality of pixels, each of the pixels being denoted as “occupied” or “unoccupied” pixels and, in some embodiments, may further be denoted as “unknown.”
  • Pixels corresponding to the “occupied” denotation may include pixels which represent objects within an environment of the robot, pixels denoted as “unoccupied” may represent empty space, and pixels denoted as “unknown” may represent regions of the environment of which sensor units 114 have not sensed.
  • Each pixel 704 of the grid 702 may represent a specified length and width of an environment of a robot 102.
  • Grid 702 may represent the same plane of the occupancy map (e.g., a floor occupancy map for robots 102 navigating on a floor).
  • Each pixel 704 of the grid 702 may comprise a count equal to a number of points 204 projected therein.
  • Each pixel 704 comprising a count greater than a threshold number which may be any number greater than zero (i.e., 1 or more points 204 per pixel 704), may correspond to an occupied pixel (dark hatched).
  • one pixel 708 of the plurality of pixels 704 includes a ray 706 extending vertically therefrom, wherein four points 204 (open circles) may be directly above the pixel 708 and be projected therein upon removal of their z component, corresponding to the pixel 708 being occupied by an object and comprising a count of four (4).
  • Ray 706 is intended to illustrate the points 204, illustrated with open circles, which comprise (x, y) components that fall within the pixel 708 (light hatched) at the base of ray 706.
  • Each of the other points 204 (closed circles) could be mapped onto an occupied pixel 704 (dark hatched) in similar fashion, not shown for simplicity of illustration.
  • the projection may be bounded by a maximum height, wherein points 204 above the maximum height may not be considered during the projection.
  • the maximum height may correspond to a height of a robot 102, wherein points 204 above maximum height may be ignored because the robot 102 may move beneath the object represented by the points 204.
  • the projection may be bounded by a minimum height, wherein points 204 below a specified minimum height value may be ignored during the projection.
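  • A minimal sketch of this projection, assuming square pixels 704 of side `pixel_size` and optional height bounds (all names and the sparse dictionary of counts are illustrative assumptions):

```python
import math

def project_to_grid(points, pixel_size, count_threshold=1, min_z=None, max_z=None):
    """Project 3D points 204 onto a 2D grid 702 by discarding the z component.
    Points above `max_z` (e.g., the robot height) or below `min_z` are ignored.
    Returns per-pixel point counts and the set of pixels whose count meets the
    occupancy threshold."""
    counts = {}
    for x, y, z in points:
        if (max_z is not None and z > max_z) or (min_z is not None and z < min_z):
            continue                       # outside the configured height bounds
        pixel = (math.floor(x / pixel_size), math.floor(y / pixel_size))
        counts[pixel] = counts.get(pixel, 0) + 1
    occupied = {p for p, c in counts.items() if c >= count_threshold}
    return counts, occupied
```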
  • the voxels 404 illustrated may comprise a “slice” or plane of a 3D volume of voxels, wherein additional voxels may be behind and/or in front of (i.e., into or out from the plane of the page) the voxels 404 illustrated.
  • a sensor 202 comprising an origin 210 at the illustrated location, of a robot 102 may measure its surrounding environment to produce a point cloud comprising a plurality of points 204.
  • a controller 118 of the robot 102 may perform the ray marching procedure, shown by rays 402, for selected points 204.
  • the selected points 204 include points 204 which correspond to entries 520 of an output array 518 following the adjacency filtering procedure described in FIG. 5A-C above.
  • the controller 118 only projects one ray 402 for each occupied voxel 408.
  • Additional points 204 collected and localized using the original measurements have been illustrated to show that the rays 402 utilized during the ray marching only extend from the sensor origin 210 to one point 204 within each occupied voxel 408.
  • two points 204 may both be utilized for the ray marching if they fall within the same voxel but are denoted in different partitions of the input array 512 used for the adjacency filtering.
  • Voxels 404 of which the rays 402 pass through may be denoted as empty space voxels 406 (grey).
  • the controller 118 may denote voxels 404 comprising at least one point 204 therein as occupied voxels 408 (hashed lines).
  • the at least one point 204 used for the identification of occupied voxels 408 may be based on the original point cloud or points 204 remaining after the adjacency filtering procedure is executed.
  • Each occupied voxel 408 may include a count corresponding to a number of points 204 of the original point cloud localized therein (i.e., the count is based on points 204 of array 500, not the points 204 of the filtered arrays 510/518).
  • Voxels 404 which include no points 204 therein and wherein rays 402 do not pass through may be denoted as “unknown” voxels (white), corresponding to the controller 118 being unable to determine whether the voxel is free space or there is an object within the voxel, due to a lack of sensory data (i.e., points 204) at or nearby the unknown voxels.
  • the 3D volume of voxels 404, 406, 408 which comprise voxels labeled as “occupied,” “free space” and “unknown” are collectively denoted hereinafter as a 3D volume of labeled voxels.
  • the controller 118 may project the 3D volume of labeled voxels onto a plane.
  • Since the illustration in FIG. 7B comprises a slice or plane of voxels 404, the projection may include a projection onto a 1-dimensional line of pixels 704, representative of a line or column of pixels 704 of plane 702 shown in FIG. 7A.
  • the projection includes projecting the 3D volume of labeled voxels onto a 2D plane.
  • Each pixel 704 corresponds to the base of each column of voxels (i.e., voxels along the z axis), wherein the pixels 704 may include x, y dimensions equal to the x, y dimensions of the voxels 404.
  • If a column includes one or more occupied voxels 408, the corresponding pixel 704 comprises the denotation of “occupied,” wherein the occupied pixel may include a count equal to the total count of all occupied voxels 408 above it.
  • If a column includes empty space voxels 406 and no occupied voxels 408, the pixel 704 at the base of the column may comprise the empty space denotation.
  • If a column includes only unknown voxels, the resulting pixel 704 may include the denotation of unknown (white pixels).
  • the projection has been illustrated by arrows 710 for each column of voxels corresponding to a pixel 704 of the occupancy map.
  • the counts for the occupied pixels 704 have been illustrated for clarity.
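  • The column-wise projection described in the preceding bullets may be sketched as follows (the dictionary-based volume representation and the function name are assumptions for illustration, not the disclosed implementation):

```python
def project_labeled_voxels(labels, counts, count_threshold=1):
    """Collapse a 3D volume of labeled voxels onto a 2D occupancy map.
    `labels` maps (vx, vy, vz) to "occupied" or "empty"; voxels absent from the
    dict are treated as unknown.  `counts` holds per-voxel point counts.
    A column whose summed count meets the threshold yields an "occupied" pixel;
    otherwise the pixel is "empty" if any voxel in the column was cleared, and
    "unknown" if the column was never observed."""
    pixels, column_counts = {}, {}
    for (vx, vy, vz), label in labels.items():
        key = (vx, vy)
        column_counts[key] = column_counts.get(key, 0) + counts.get((vx, vy, vz), 0)
        if label == "empty":
            pixels.setdefault(key, "empty")
    for key, total in column_counts.items():
        if total >= count_threshold:
            pixels[key] = "occupied"
    return pixels, column_counts
```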
  • voxels 404 occupied by the robot 102 may also be denoted either as, e.g., “occupied” voxels 408 or as “robot” voxels.
  • the voxel which comprises the sensor origin 210 may be considered a “robot” voxel as it is occupied by a portion of the robot 102 (i.e., its depth camera 202).
  • Such pixels may also be referred to as a footprint of the robot 102.
  • the resulting occupancy map may include pixels 704 encoded as “robot” pixels which represent the robot 102 (i.e., area occupied by the robot 102 within the plane of the occupancy map), in addition to the “occupied,” “empty space,” and “unknown” pixels.
  • the leftmost pixel 704 may be encoded with the “robot” denotation.
  • the pixels occupied by the robot 102 on the resulting occupancy map may be based on (i) the location of the robot 102 origin 216 on the map (e.g., based on data from navigation units 106), and (ii) a footprint of the robot 102, the footprint corresponding to the area occupied by the robot 102 on the plane of the occupancy map.
  • the robot 102 may not be required to map itself in 3D space (i.e., denoting voxels occupied by the robot 102) and instead simply map its 2D projected area occupied to simplify motion/localization calculations by the controller 118.
  • FIG. 8 A illustrates an occupancy map 800, or portion thereof, according to an exemplary embodiment.
  • Occupancy map 800 may comprise a plurality of pixels 704, each pixel 704 being representative of a discretized area of an environment of a robot 102.
  • the pixels 704 may each represent, for example, 3cm x 3cm areas, or any other spatial resolution (e.g., 1x1 cm, 2x2 cm, 4x4 cm, etc.).
  • the pixels 704 may comprise the same width and height as the voxels 404.
  • the sizes of the pixels 704 have been enlarged (e.g., the illustrated pixels 704 may be 30x30 cm or other resolution).
  • the occupancy map 800 may be produced by a controller 118 of the robot 102 identifying voxels in discretized 3D space which are occupied, unoccupied (i.e., free space), or unknown.
  • each pixel 704 of the occupancy map may comprise either an “occupied” (hatched pixels 704), “unoccupied” (grey pixels 704), or “unknown” (white pixels 704) label or encoding.
  • “Occupied” pixels 704 include pixels comprising a count greater than a threshold value, wherein the count is equal to the number of points 204 localized within voxels above the pixel 704 prior to the 2D projection.
  • “Unoccupied” pixels 704 include pixels below voxels identified to comprise empty space based on the ray marching procedure. “Unknown” pixels correspond to the remaining pixels 704 below only “unknown” voxels, the “unknown” voxels correspond to voxels of which the rays 402 configured during the ray marching procedure did not pass through. That is, the “unknown” pixels are below only “unknown” voxels and are not under (or projected thereon by) any “empty space” or “occupied” voxels.
  • Occupancy map 800 may be utilized by a controller 118 of a robot 102 to perceive, map, and navigate its environment. Occupancy map 800 may further include pixels 704 which denote the area (footprint) occupied by the robot 102, as shown by pixels 802 (black). To navigate the environment safely (i.e., without collisions), the controller 118 may specify a predetermined distance between the robot 102 and any occupied pixels 704. For example, the controller 118 may configure the robot 102 to maintain at least one unoccupied pixel 704 between the robot 102 and any occupied pixels 704 to account for sensor noise and/or imperfections in localization of the robot origin 216.
  • Occupancy map 800 may further include a route 802 represented by one or more pixels 704 (dark shaded pixels).
  • the route 802 may comprise a width of one or more pixels 704.
  • the route 802 may cause the robot 102 to navigate close to an occupied pixel 804, but the occupied pixel 804 is spatially isolated (no adjacent occupied pixels).
  • Because the spatial resolution of pixels 704 corresponds to a few centimeters (e.g., each pixel may represent 1x1 cm, 2x2 cm, 3x3 cm, etc.), it may be advantageous to remove any spatially isolated occupied pixels 704 from the occupancy map 800, as shown next in FIG. 8B. For example, if the controller 118 of the robot 102 is configured to maintain a distance of at least 1 pixel from any occupied pixel 704, then the route 802 as illustrated is unnavigable.
  • FIG. 8B illustrates the removal of a spatially isolated occupied pixel 804, according to an exemplary embodiment.
  • the spatially isolated occupied pixel 804 may be required to be surrounded only by “unoccupied” pixels 704 (light grey) in order to be reset to “free space” (i.e., removed).
  • the spatially isolated occupied pixel 804 may be required to be surrounded by only “unoccupied” pixels 704 and “unknown” pixels 704 (white) to be reset to “free space.”
  • the pixel 804 may be reset to “free space,” wherein the count is based on the 3D point cloud prior to adjacency filtering. Accordingly, the robot 102 may now navigate the route 802 and maintain an at least one pixel 704 distance between the robot footprint 802 and any occupied pixels 704 on the occupancy map 800.
  • an occupied pixel may comprise one or more neighboring pixels which is/are also denoted as occupied.
  • the pixel may be considered a spatially isolated pixel 804, and therefore may be reset to “free space,” if its neighboring four pixels (excluding diagonally neighboring pixels) or neighboring eight pixels (including diagonally neighboring pixels) comprise a count less than a threshold value.
  • the threshold number may depend on the resolution of the sensor and noise level of the sensor. For example, if an occupied pixel comprises a count of three (3) and only one neighboring pixel is occupied and comprises a count of one (1), both pixels may be reset to “free space” as spatially isolated pixels if the threshold count is five (5).
  • the removal of spatially isolated pixels may instead be performed in 3D voxel space prior to the 2D projection of the point cloud or discretized 3D volume of labeled voxels. That is, the controller 118 may identify one or more voxels comprising a nonzero count with neighboring voxels comprising a zero count and/or “free space” denotation and subsequently reset the count of the one or more voxels to zero.
  • the neighboring voxels may be the six directly adjacent voxels (i.e., excluding diagonally adjacent voxels) or may include the twenty-six (26) adjacent and diagonally adjacent voxels.
  • To reset a voxel comprising a nonzero count (i.e., reset its count to zero such that it becomes a “free space” voxel), its neighboring voxels must in total include a count less than a predetermined threshold value.
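  • A sketch of the two-dimensional variant of this isolated-pixel removal (the sparse dictionary of per-pixel counts, the function name, and the default thresholds are assumptions):

```python
def remove_isolated_pixels(counts, first_threshold=1, second_threshold=1, diagonal=True):
    """Reset spatially isolated occupied pixels to free space: an occupied pixel
    (count >= first_threshold) is cleared when the cumulative count of its four
    directly adjacent (or eight adjacent and diagonally adjacent) neighbors falls
    below second_threshold.  Neighborhood sums use the counts prior to clearing,
    so mutually isolated neighbors (as in the text's example) are both removed."""
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    to_clear = []
    for (px, py), count in counts.items():
        if count < first_threshold:
            continue
        neighborhood = sum(counts.get((px + dx, py + dy), 0) for dx, dy in offsets)
        if neighborhood < second_threshold:
            to_clear.append((px, py))
    for pixel in to_clear:
        counts[pixel] = 0                  # reset to "free space"
    return counts
```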
  • FIG. 9 is a process flow diagram illustrating a method 900 for a controller 118 to produce an occupancy map based on a point cloud produced by one or more sensors 202, according to an exemplary embodiment. Steps of method 900 may be effectuated by the controller 118 executing computer-readable instructions from memory 120.
  • Block 902 includes the controller 118 receiving a point cloud from a sensor 202.
  • the point cloud may be received based on data from two or more sensors 202 if the sensors 202 operate synchronously.
  • the point cloud may be received based on data from two or more sensors 202 if the sensors 202 include non-overlapping fields of view.
  • Block 904 includes the controller 118 discretizing the point cloud into a plurality of voxels. Each voxel may comprise a count if one or more points 204 are localized therein following Equations 1-3 above. The count may be an integer number corresponding to the number of points 204 localized within each voxel of the plurality.
  • Block 906 includes the controller 118 removing adjacent duplicate points 204 within a same voxel, for example as illustrated above in FIG. 5A-C. The controller 118 may produce an array of values (500, 512), the array comprising a length equal to a number of pixels 304 within the image plane 302 of the sensor 202.
  • the array comprises a concatenation of the rows of pixels 304.
  • the array may comprise the distance measurements across the field of view concatenated in sequence.
  • the controller 118 may, in some embodiments, partition the array into a plurality of partitions, wherein each distance measurement within the partitions may be compared with its adjacent neighboring distance measurement to determine if the two distance measurements localize points 204 within a same voxel.
  • the controller 118 may only compare adjacent distance measurements within a same partition.
  • the partition size may be configured such that the total number of pixels of the image plane 302 comprises an integer multiple of the partition size.
  • the controller 118 may produce an output array (510, 518), the output array may include a plurality of entries, each entry of the output array being either within a separate voxel from other entries of the output array or corresponding to entries of the input array within separate partitions.
  • Block 908 includes the controller 118 ray marching to each occupied voxel.
  • the ray marching may be performed for occupied voxels which remain after the adjacency filtering performed in block 906.
  • Ray marching includes the controller 118 projecting a ray 402 from an origin 210 of the sensor(s) 202 used to produce the point cloud to each occupied voxel.
  • the ray 402 may extend from the origin 210 to either the center of the occupied voxels or the locations of the points 204 therein.
  • the voxels the rays pass through on their paths to the occupied voxels may be denoted as “empty space.” It is appreciated that this step may occupy a substantial majority of the runtime of method 900, wherein use of the adjacency filtering in block 904 substantially reduces the time occupied by this step 908.
  • Block 910 includes the controller 118 marking occupied voxels with a count.
  • the count may correspond to a number of points 204 localized within each voxel based on the original point cloud received in block 902 and discretized in block 904 (i.e., not based on the remaining points 204 after adjacency filtering).
  • Any voxels identified as “occupied” voxels 408 during, e.g., a previous execution of method 900 (i.e., during acquisition of a previous point cloud) may be marked as “empty space” if a ray 402 passes through the “occupied” voxels 408.
  • the controller 118 clears space (i.e., voxels) previously occupied by objects (e.g., moving objects) which are no longer occupied by the objects based on data from the point cloud received in block 902. Similarly, the controller 118 will not clear any “occupied” voxels 408 unless and until a ray 402 passes therein. This is analogous to the controller 118 assuming an object previously seen by the previous point cloud is still present unless and until there is data, e.g., from the current point cloud which indicates the object is no longer present.
  • the marking i.e., assigning a count to each voxel
  • the marking may be performed subsequent to the identification of “empty space” voxels in block 908 such that, when implementing the adjacency filtering using vectorized computational methods or ray marching in parallel (discussed below in regard to FIGs. 10-11), no voxels which comprise a point 204 therein are misidentified as “empty space.”
  • FIG. 10A-B illustrates the ray marching procedure performed on two points 204-1 and 204-2, according to an exemplary embodiment.
  • the two points 204 may be localized by two beams 208 which pass through adjacent/neighboring pixels 304 of the image plane 302 of a depth camera 202.
  • the denoting of voxels 404 as “occupied” is performed subsequent to the denoting of voxels 404 as “empty space” voxels 406 when performing the ray marching process in parallel for reasons which will be discussed now, although one skilled in the art may appreciate that performing the ray marching process in parallel is not intended to be limiting.
  • a controller 118 may extend a ray 402-1 to the point 204-1, wherein voxels 404 through which ray 402-1 passes may be denoted as “empty space” voxels 406.
  • the controller 118 may extend a second ray 402-2 to the second point 204-2 and denote voxels 404 through which the ray 402-2 passes as “empty space,” excluding the voxel 404 in which the second point 204-2 lies.
  • the voxel may be denoted as “empty space” despite a point 204 being localized therein.
  • voxel 1002 is assigned as an “empty” voxel by ray 402-2 because it does not encounter point 204-1.
  • the controller 118 may mark both the voxels 404 in which the two points 204-1 and 204-2 lie as “occupied” voxels 408 as shown in FIG. 10B. As shown, voxel 1002 containing point 204-1 is reassigned as “occupied” voxel 408-1 and the voxel containing point 204-2 is assigned as “occupied” voxel 408-2.
  • the controller 118 may further assign a count to each occupied voxel 408 corresponding to a number of points 204 of the original point cloud which fall within each occupied voxel 408, wherein the count is based on the original point cloud and the figure 10A-B depicts points 204 which remain after the adjacency filtering (e.g., as described in FIG. 5A-C).
  • identifying all empty space voxels 406 and subsequently (i.e., after all rays 402 to all points 204 have been configured) to identifying occupied voxels 408 enables the controller 118 to extend the rays 402-1 and 402-2 contemporaneously (e.g., using parallel processors 138 or a SIMD processor 1100 shown below in FIG. 11) without the possibility of an occupied voxel 408 being cleared (i.e., denoted as an empty space voxel 406) by an adjacent or neighboring ray 402-2.
  • empty space voxels 406 may be replaced after the ray marching procedure with “occupied” voxels 408 if there is one or more points 204 localized therein (based on the original point cloud), however “occupied” voxels 408 are never replaced with “empty space” voxels 406 until a subsequent point cloud scan is received.
  • block 912 includes the controller 118 projecting the 3D volume of labeled voxels onto a 2D grid to produce a 2D occupancy map. That is, for each column of voxels, the count of all the voxels is summed, wherein the summation of the count corresponds to the count of the pixel of the 2D occupancy map below the column. If the summation yields a value greater than a threshold (e.g., 1 or more), the pixel at the base of the column may include a denotation of “occupied.” If the summation is below a threshold value, then the pixel at the base of the column may include a denotation of “unoccupied,” “free space,” or equivalent.
  • Voxels denoted as “unknown” may comprise a count of zero.
  • the projection may be performed by the controller 118 removing the vertical component of the points 204 of the point cloud received in block 902. Subsequently, the controller 118 may count a number of points 204 within each pixel and assign the corresponding labels of “occupied,” “free space,” or “unknown” based on the count of each pixel.
  • Block 914 includes the controller 118 removing spatially isolated pixels of the occupancy map.
  • Spatially isolated pixels may include any occupied pixel 704 comprising a count greater than a first threshold required to be denoted as “occupied.”
  • a spatially isolated pixel corresponds to an occupied pixel 704 which is surrounded by pixels which cumulatively comprise a count below a second threshold.
  • the surrounding pixels 704 may include the four neighboring pixels 704 or the eight neighboring and diagonally neighboring pixels 704.
  • the second threshold may correspond to eight (8) times the first threshold.
  • the first and second thresholds may be one (1) or more.
  • FIG. 11 illustrates a single instruction multiple data (“SIMD”) architecture of a processor 1100, according to an exemplary embodiment.
  • SIMD devices are one of the generalized forms of computer architecture denoted by Flynn’s taxonomy.
  • SIMD devices differ from sequential CPU processors in that data stored in the data pool 1102 (e.g., a computer-readable storage medium) may be processed in small chunks by a plurality of processing units 1104 in parallel, wherein each processing unit 1104 may receive instructions from an instruction pool 1106 (e.g., a computer-readable storage medium). The instructions received may be the same for each processing unit 1104.
  • Flynn’s taxonomy may further include SISD (single instruction single data stream), MISD (multiple instruction single data stream), and MIMD (multiple instructions multiple data streams).
  • Contemporary graphics processing units (“GPUs”) and sequential CPU processors such as produced by Intel, Nvidia, and Advanced Micro Devices (“AMD”), may include some combination of the four architectures.
  • SIMD and MISD architectures are more closely aligned with graphics processing units (“GPU”) than sequential CPUs because architectures of GPUs are typically configured to maximize throughput (i.e., amount of data processed per second), whereas sequential CPUs are configured to minimize latency (i.e., time between execution of two sequential instructions).
  • SIMD processor 1100 is not intended to be limiting for any of the processors discussed above (e.g., processor 138 or controller 118), wherein FIG. 11 is intended to illustrate how the adjacency filtering may be configured for use of specific hardware to accelerate the adjacency filtering, such as GPUs, which are common within the art to perform the ray marching procedure (i.e., ray tracing).
  • SIMD processor is similar to SIMT (single instruction multiple thread) processors.
  • Contemporary GPUs use a combination of SIMD and SIMT architectures, wherein processor 1100 may illustrate one of many stream multiprocessors (“SM”) of a GPU.
  • The NVIDIA GeForce RTX 2080, for example, comprises 46 SMs, where each SM may comprise a SIMD architecture similar to processor 1100.
  • Each SM may be considered as a block of separate threads, each block of threads being configured to execute different instructions from the instruction pool 1106 compared to the other SMs. Threads of the block of threads correspond to the processors 1104 which execute the same computer-readable instructions in parallel.
  • Processing units 1104 may be configured to execute the computer-readable instructions received from the instruction pool 1106. Typically, the instructions executed are of low complexity and size as compared to sequential CPUs (e.g., CISC architecture CPUs) which may execute a substantial number of sequential instructions. Processing units 1104 may comprise a similar architecture as illustrated by processor 138, wherein instructions are received from an instruction pool 1106 which are shared with other processing units 1104 rather than a memory 130. FIG. 11 illustrates three processing units 1104, however one skilled in the art may appreciate that a SIMD processor 1100 may include a plurality of additional processing units 1104 (e.g., hundreds or more). [00160] Use of a SIMD architecture may accelerate both the ray marching procedure and adjacency filtering described above.
  • the instruction pool 1106 may provide instructions to the plurality of processing units 1104 which configure the processing units 1104 to perform operations 506 on data received from the data pool 1102.
  • the data received by each processing unit 1104 may include entries 514 within one partition of an array 512.
  • the array 512 has been partitioned into three partitions of eight (8) entries 514, as illustrated within the data pool 1102, wherein the partitions may be stored in separate addresses.
  • each SM of a GPU may process a partition of the array and each processing unit 1104 may perform a single operation/comparison 506.
  • each entry 514 is stored within a separate address in the data pool 1102, wherein a partition may be retrieved by the data pool 1102 outputting data to a respective processing unit 1104 corresponding to a range of addresses (i.e., a vector).
  • a partition and its corresponding entries may be retrieved from the data pool 1102 simultaneously using a vector address (i.e., a range of addresses). That is, a processing unit 1104 receives entries 514 for a partition and performs operations 506 on the entries 514 within the partition to produce entries 520 of the output array 518 at the same time as the other processing units 1104 receive their respective entries 514 for their respective partitions to perform the same operations.
  • the SIMD processor 1100 may perform the adjacency filtering by executing operations 506 in parallel for each partition, thereby reducing the execution time in performing the adjacency filtering for the entire array 512 down to the time required for a single processing unit 1104 to process a single partition (a partitioning sketch follows this list).
  • each processing unit 1104 may receive instructions from the instruction pool 1106 which configure the processing units 1104 to perform the ray marching procedure for a singular ray 402.
  • Each processing unit 1104 may receive an entry 520 of the output array 518 and a location of an origin 210 of a sensor used to produce a point cloud based on which the ray marching is performed. Accordingly, each processing unit 1104 may extend a ray 402 to a point 204 corresponding to an entry 520 of the output array 518 to identify “empty space” voxels 406. In doing so, the SIMD processor 1100 extends a number of rays 402 equal to the number of processing units 1104 in parallel (i.e., contemporaneously), which drastically reduces the execution time of the ray marching procedure.
  • a controller 118 of a robot 102 may include a GPU and/or SIMD processor 1100 to perform the ray marching procedure.
  • Ray marching, or ray tracing, is typically optimized for GPUs and/or SIMD devices executing a small set of instructions on multiple data sets (e.g., producing rays 402 from origin point 210 and a point 204).
  • robots 102 that perform the ray marching procedure may typically include one or more GPUs as components of controller 118 to accelerate this process.
  • the adjacency filter may be executed quickly when executed by a SIMD device, such as a GPU of controller 118, which further reduces the execution time of method 900 and is applicable to many robots 102 which may already comprise a GPU or SIMD processor.
  • it is not required, however, that a GPU or SIMD device be utilized to perform adjacency filtering and ray marching.
  • partitioning during adjacency filtering enables the adjacency filtering to be optimized for use in GPUs and SIMD processors 1100.
  • the term “including” should be read to mean “including, without limitation,” “including but not limited to,” or the like; the term “comprising” as used herein is synonymous with “including,” “containing,” or “characterized by,” and is inclusive or open-ended and does not exclude additional, unrecited elements or method steps; the term “having” should be interpreted as “having at least”; the term “such as” should be interpreted as “such as, without limitation”; the term “includes” should be interpreted as “includes but is not limited to”; the term “example” or the abbreviation “e.g.” is used to provide exemplary instances of the item in discussion, not an exhaustive or limiting list thereof, and should be interpreted as “example, but without limitation”; the term “illustration” is used to provide illustrative instances of the item in discussion, not an exhaustive or limiting list thereof, and should be interpreted as “illustration, but without limitation.” Adjectives such as “known,” “normal,” “standard,” and terms of similar meaning should not be construed as limiting the item described to a given time period or to an item available as of a given time, but instead should be read to encompass known, normal, or standard technologies that may be available or known now or at any time in the future.
  • a group of items linked with the conjunction “and” should not be read as requiring that each and every one of those items be present in the grouping, but rather should be read as “and/or” unless expressly stated otherwise.
  • a group of items linked with the conjunction “or” should not be read as requiring mutual exclusivity among that group, but rather should be read as “and/or” unless expressly stated otherwise.
  • the terms “about” or “approximate” and the like are synonymous and are used to indicate that the value modified by the term has an understood range associated with it, where the range may be ±20%, ±15%, ±10%, ±5%, or ±1%.
  • a result (e.g., a measurement value) described as close to a given value may mean, for example, that the result is within 80% of the value, within 90% of the value, within 95% of the value, or within 99% of the value.
  • the terms “defined” or “determined” may include “predefined” or “predetermined” and/or otherwise determined values, conditions, thresholds, measurements, and the like.
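
The following Python sketch illustrates the column-wise projection of block 912 and the spatially isolated pixel removal of block 914 referenced in the list above. It is a minimal, non-limiting sketch: the array shape, the threshold values, the function names, and the use of NumPy are illustrative assumptions rather than part of the disclosed embodiments.

```python
import numpy as np

def project_to_occupancy(voxel_counts, occupied_threshold=1):
    """Block 912 (sketch): sum each vertical column of voxel counts; the sum
    becomes the count of the pixel at the base of the column, and the pixel is
    denoted "occupied" when the sum meets the threshold."""
    pixel_counts = voxel_counts.sum(axis=2)           # column-wise summation
    occupied = pixel_counts >= occupied_threshold     # "occupied" vs. "free space"
    return pixel_counts, occupied

def remove_isolated_pixels(pixel_counts, occupied, second_threshold=8):
    """Block 914 (sketch): clear occupied pixels whose neighboring pixels
    cumulatively hold a count below the second threshold."""
    filtered = occupied.copy()
    rows, cols = pixel_counts.shape
    for x in range(rows):
        for y in range(cols):
            if not occupied[x, y]:
                continue
            # Sum the counts of the (up to) eight neighboring pixels.
            window = pixel_counts[max(0, x - 1):x + 2, max(0, y - 1):y + 2]
            neighbor_sum = window.sum() - pixel_counts[x, y]
            if neighbor_sum < second_threshold:
                filtered[x, y] = False   # spatially isolated: treated as noise
    return filtered

# Usage: a lone noisy detection versus a small cluster of detections.
grid = np.zeros((10, 10, 5), dtype=int)      # (x, y, z) voxel counts
grid[2, 2, 1] = 1                            # isolated point (likely noise)
grid[6:9, 6:9, 0:3] = 1                      # 3x3 cluster, 3 counts per column
counts, occupied = project_to_occupancy(grid)
print(remove_isolated_pixels(counts, occupied).sum())   # 9 pixels survive
```

Consistent with the list above, a second threshold of eight corresponds to eight times a first threshold of one when all eight neighboring pixels are considered.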
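Below is a minimal sketch of the partition-based parallelism described for FIG. 11. The comparison applied to adjacent entries is only a placeholder (operations 506 are defined elsewhere in the disclosure), and Python's process pool merely stands in for the SIMD/SIMT processing units 1104 of a GPU; the function names and partition count are illustrative assumptions.

```python
from concurrent.futures import ProcessPoolExecutor

def adjacency_compare(partition):
    """Placeholder for operations/comparisons 506: keep an entry only when it
    differs from its predecessor, so adjacent duplicate entries are dropped.
    The actual comparison performed on entries 514 may differ."""
    kept = [partition[0]]
    for value in partition[1:]:
        if value != kept[-1]:
            kept.append(value)
    return kept

def filter_in_partitions(entries, num_partitions=3):
    """Split the array of entries 514 into equal partitions and apply the same
    operation to every partition in parallel, analogous to each processing
    unit 1104 (or each SM of a GPU) handling one partition of array 512."""
    size = len(entries) // num_partitions
    partitions = [entries[i * size:(i + 1) * size] for i in range(num_partitions)]
    with ProcessPoolExecutor(max_workers=num_partitions) as pool:
        results = list(pool.map(adjacency_compare, partitions))
    # Concatenate the per-partition results into the output array 518.
    return [entry for part in results for entry in part]

if __name__ == "__main__":
    # 24 entries split into three partitions of eight, mirroring the example
    # partitioning of array 512 described above.
    entries = [1, 1, 2, 2, 3, 3, 3, 4, 5, 5, 6, 6, 7, 7, 8, 8,
               9, 9, 10, 10, 11, 11, 12, 12]
    print(filter_in_partitions(entries))
```

Duplicates that straddle a partition boundary are not merged in this sketch; handling such boundary cases is an implementation detail not prescribed here.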

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Electromagnetism (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

Systems and methods for producing occupancy maps for robotic devices are disclosed herein. According to at least one non-limiting exemplary embodiment, producing an occupancy map may include a processor or controller of a robot filtering a point cloud, ray marching to detect empty space, projecting a labeled volume of voxels onto a 2-dimensional plane, and removing spatially isolated occupied pixels of a resulting occupancy map.

Description

SYSTEMS AND METHODS FOR PRODUCING OCCUPANCY MAPS FOR ROBOTIC DEVICES
Copyright
[0001] A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
Technological Field
[0002] The present application relates generally to robotics, and more specifically to systems and methods for producing occupancy maps for robotic devices.
Summary
[0003] The present disclosure provides, inter alia, systems and methods for producing occupancy maps for robotic devices. The present disclosure is directed at a practical application of data collection and filtering to produce computer-readable maps for navigating robots at an improved rate to enable the robots to more readily perceive and adapt their motions to their environments.
[0004] Exemplary embodiments described herein have innovative features, no single one of which is indispensable or solely responsible for their desirable attributes. Without limiting the scope of the claims, some of the advantageous features will now be summarized. One skilled in the art would appreciate that as used herein, the term robot may generally refer to an autonomous vehicle or object that travels a route, executes a task, or otherwise moves automatically upon executing or processing computer-readable instructions.
[0005] According to at least one non-limiting exemplary embodiment, a method is disclosed. The method comprises a controller of a robot receiving a point cloud, the point cloud produced based on measurements from at least one sensor of the robot; discretizing the point cloud into a plurality of voxels, each voxel comprising a count; associating points of the point cloud which fall within a same voxel with a single point within the voxel; and ray marching from an origin of the sensor to the single point within each voxel to detect empty space surrounding the robot.
[0006] According to at least one non-limiting exemplary embodiment, the method further comprises the controller producing a computer-readable map of an environment of the robot, the computer-readable map comprising a plurality of pixels, each pixel being identified as occupied or empty space based on the counts of the voxels above each of the plurality of pixels. [0007] According to at least one non-limiting exemplary embodiment, the method further comprises the controller setting a count equal to zero for occupied pixels comprising neighboring pixels which include a cumulative total count less than a threshold value.
[0008] According to at least one non-limiting exemplary embodiment, points of the same voxel are determined based on adjacency within an image plane of at least one sensor.
[0009] According to at least one non-limiting exemplary embodiment, the magnitude of the distance measurements is less than a distance threshold.
[0010] According to at least one non-limiting exemplary embodiment, at least one sensor includes one of a scanning planar LiDAR, two-dimensional LiDAR, or depth camera.
[0011] These and other objects, features, and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification, wherein like reference numerals designate corresponding parts in the various figures. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the disclosure. As used in the specification and in the claims, the singular form of “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.
Brief Description of the Drawings
[0012] The disclosed aspects will hereinafter be described in conjunction with the appended drawings, provided to illustrate and not to limit the disclosed aspects, wherein like designations denote like elements.
[0013 ] FIG. 1 A is a functional block diagram of a robot in accordance with some embodiments of this disclosure.
[0014] FIG. IB is a functional block diagram of a controller or processor in accordance with some embodiments of this disclosure.
[0015] FIG. 2A(i-ii) illustrates a time of flight or LiDAR sensor used to produce a point cloud, according to an exemplary embodiment.
[0016] FIG. 2B illustrates methods for localizing an origin of a sensor within an environment of a robot, according to an exemplary embodiment.
[0017] FIG. 3 illustrates an image plane of a sensor configured to produce a point cloud in accordance with some embodiments of the present disclosure. [0018] FIG. 4 illustrates a ray marching procedure used to detect empty space surrounding a robot, according to an exemplary embodiment.
[0019] FIG. 5A-B illustrates an adjacency filtering procedure used to accelerate the ray marching procedure, according to an exemplary embodiment.
[0020] FIG. 5C is a process flow diagram illustrating a method for performing the adjacency filtering procedure, according to an exemplary embodiment.
[0021] FIG. 6A-B illustrates a robot utilizing adjacency filtering to accelerate the ray marching procedure, according to an exemplary embodiment.
[0022] FIG. 6C illustrates a sensor collecting a point cloud representing an object, according to an exemplary embodiment.
[0023] FIG. 6D illustrates the point cloud collected in FIG. 6C being utilized for the ray marching procedure subsequent to performance of the adjacency filtering procedure, according to an exemplary embodiment.
[0024] FIG. 7A illustrates a projection of a 3D point cloud onto a 2D plane to produce an occupancy map, according to an exemplary embodiment.
[0025] FIG. 7B illustrates a projection of a 3D volume of labeled voxels onto a 2D plane to produce an occupancy map, according to an exemplary embodiment.
[0026] FIG. 8A-B illustrates the removal of a spatially isolated pixel or voxel of an occupancy map, according to an exemplary embodiment.
[0027] FIG. 9 is a process flow diagram illustrating a method for producing an occupancy map from a point cloud, according to an exemplary embodiment.
[0028] FIGS. 10A-B illustrates the ray marching procedure for two points and a marking of a count for voxels in discrete 3D space, according to an exemplary embodiment.
[0029] FIG. 11 illustrates a single instruction multiple data processor to illustrate how the adjacency filtering procedure is configured to be executed in parallel using common hardware elements of a robot, according to an exemplary embodiment.
[0030] All Figures disclosed herein are © Copyright 2021 Brain Corporation. All rights reserved.
Detailed Description
[0031] Currently, robots may utilize computer-readable maps to perceive and map their surrounding environments. The computer-readable maps may additionally be utilized to determine routes or movements of the robots such that the robots avoid objects within their environments. Rapid production of these maps using incoming data from sensors of the robots may be critical for the robots to operate efficiently and avoid collisions.
[0032] Many robots utilize light detection and ranging (“LiDAR”) sensors, which are configured to produce point clouds. The point clouds are typically of high resolution (e.g., floating point accuracy), wherein use of point clouds to navigate an environment may be computationally taxing and/or slow. Further, point clouds do not denote any shapes of objects as points of the point clouds do not comprise any dimensionality or volume. Accordingly, use of an occupancy map may enable robots to more readily perceive their surrounding environments and plan their motions accordingly.
[0033] Occupancy maps may include denotations for occupied regions (i.e., objects) and free space regions, wherein robots may plan their motions within the free space regions to avoid collisions. Ray marching, or ray tracing, may enable robots to detect free space surrounding them as described below; however, ray marching may be computationally taxing, slow, and may include a plurality of redundant calculations. The number of redundant calculations increases as (i) point cloud density increases (i.e., based on sensor resolution), and/or (ii) objects are localized closer to the robot. Accordingly, there is a need in the art for systems and methods for accelerating the ray marching procedure to enable robots to produce occupancy maps using incoming point cloud data at faster speeds.
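As a point of reference for the discussion that follows, a minimal, naive ray-marching sketch in Python is shown below. It assumes a uniform voxel grid stored as a dictionary and a fixed step size; these choices, the function name, and the step size are illustrative assumptions and do not represent the accelerated, adjacency-filtered procedure disclosed herein.

```python
import numpy as np

def ray_march(origin, point, voxel_size, grid, step_scale=0.5):
    """Naive ray-marching sketch: step from the sensor origin toward a measured
    point, marking every traversed voxel as empty space; the voxel containing
    the point itself is marked occupied.

    grid: dict mapping integer voxel indices (i, j, k) to a state string.
    """
    origin = np.asarray(origin, dtype=float)
    point = np.asarray(point, dtype=float)
    direction = point - origin
    length = float(np.linalg.norm(direction))
    if length == 0.0:
        return grid
    direction /= length
    step = voxel_size * step_scale            # step smaller than one voxel edge
    traveled = 0.0
    while traveled < length - voxel_size:     # stop just short of the endpoint
        sample = origin + direction * traveled
        index = tuple(np.floor(sample / voxel_size).astype(int))
        if grid.get(index) != "occupied":     # never overwrite occupied voxels
            grid[index] = "empty"
        traveled += step
    grid[tuple(np.floor(point / voxel_size).astype(int))] = "occupied"
    return grid

# Usage: one beam from the sensor origin to a point roughly 1.5 m away.
grid = ray_march(origin=(0.0, 0.0, 0.0), point=(1.5, 0.3, 0.0),
                 voxel_size=0.1, grid={})
print(sum(state == "empty" for state in grid.values()), "empty voxels marked")
```

Because every beam repeats this per-voxel traversal, dense point clouds and nearby objects multiply the work, which motivates the adjacency filtering and parallel-hardware acceleration described elsewhere in this disclosure.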
[0034] Various denotations of voxel and pixel states will be discussed below. As used herein, a voxel/pixel may be denoted as “unknown” if no measurements from one or more sensors detect the region, wherein it is unknown whether or not an object is present. As used herein, a voxel/pixel may be denoted as “occupied” if one or more sensors positively detect an object therein. As used herein, a voxel/pixel may be denoted as “empty” or “unoccupied” if one or more sensors positively do not detect an object at the location of the voxel/pixel, e.g., using a ray marching process discussed below. That is, “empty/unoccupied” voxels/pixels correspond to regions of space known to be empty.
[0035] Various aspects of the novel systems, apparatuses, and methods disclosed herein are described more fully hereinafter with reference to the accompanying drawings. This disclosure can, however, be embodied in many different forms and should not be construed as limited to any specific structure or function presented throughout this disclosure. Rather, these aspects are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art. Based on the teachings herein, one skilled in the art would appreciate that the scope of the disclosure is intended to cover any aspect of the novel systems, apparatuses, and methods disclosed herein, whether implemented independently of, or combined with, any other aspect of the disclosure. For example, an apparatus may be implemented or a method may be practiced using any number of the aspects set forth herein. In addition, the scope of the disclosure is intended to cover such an apparatus or method that is practiced using other structure, functionality, or structure and functionality in addition to or other than the various aspects of the disclosure set forth herein. It should be understood that any aspect disclosed herein may be implemented by one or more elements of a claim. [0036] Although particular aspects are described herein, many variations and permutations of these aspects fall within the scope of the disclosure. Although some benefits and advantages of the preferred aspects are mentioned, the scope of the disclosure is not intended to be limited to particular benefits, uses, and/or objectives. The detailed description and drawings are merely illustrative of the disclosure rather than limiting, the scope of the disclosure being defined by the appended claims and equivalents thereof.
[0037] The present disclosure provides for systems and methods for producing occupancy maps for robotic devices. As used herein, a robot may include mechanical and/or virtual entities configured to carry out a complex series of tasks or actions autonomously. In some exemplary embodiments, robots may be machines that are guided and/or instructed by computer programs and/or electronic circuitry. In some exemplary embodiments, robots may include electro-mechanical components that are configured for navigation, where the robot may move from one location to another. Such robots may include autonomous and/or semi-autonomous cars, floor cleaners, rovers, drones, planes, boats, carts, trams, wheelchairs, industrial equipment, stocking machines, mobile platforms, personal transportation devices (e.g., hover boards, scooters, self-balancing vehicles such as manufactured by Segway, etc.), trailer movers, vehicles, and the like. Robots may also include any autonomous and/or semi-autonomous machine for transporting items, people, animals, cargo, freight, objects, luggage, and/or anything desirable from one location to another.
[0038] As used herein, network interfaces may include any signal, data, or software interface with a component, network, or process including, without limitation, those of the FireWire (e.g., FW400, FW800, FWS800T, FWS1600, FWS3200, etc.), universal serial bus (“USB”) (e.g., USB 1.X, USB 2.0, USB 3.0, USB Type-C, etc.), Ethernet (e.g., 10/100, 10/100/1000 (Gigabit Ethernet), 10-Gig-E, etc.), multimedia over coax alliance technology (“MoCA”), Coaxsys (e.g., TVNET™), radio frequency tuner (e.g., in-band or OOB, cable modem, etc.), Wi-Fi (802.11), WiMAX (e.g., WiMAX (802.16)), PAN (e.g., PAN/802.15), cellular (e.g., 3G, LTE/LTE-A/TD-LTE/TD-LTE, GSM, etc.), IrDA families, etc. As used herein, Wi-Fi may include one or more of IEEE-Std. 802.11, variants of IEEE-Std. 802.11, standards related to IEEE-Std. 802.11 (e.g., 802.11 a/b/g/n/ac/ad/af/ah/ai/aj/aq/ax/ay), and/or other wireless standards.
[0039] As used herein, processor, microprocessor, and/or digital processor may include any type of digital processing device such as, without limitation, digital signal processors (“DSPs”), reduced instruction set computers (“RISC”), complex instruction set computers (“CISC”) processors, microprocessors, gate arrays (e.g., field programmable gate arrays (“FPGAs”)), programmable logic device (“PLDs”), reconfigurable computer fabrics (“RCFs”), array processors, secure microprocessors, and application-specific integrated circuits (“ASICs”). Such digital processors may be contained on a single unitary integrated circuit die or distributed across multiple components.
[0040] As used herein, computer program and/or software may include any sequence or human- or machine -cognizable steps that perform a function. Such computer program and/or software may be rendered in any programming language or environment including, for example, C/C++, C#, Fortran, COBOL, MATLAB™, PASCAL, GO, RUST, SCALA, Python, assembly language, markup languages (e.g., HTML, SGML, XML, VoXML), and the like, as well as object-oriented environments such as the Common Object Request Broker Architecture (“CORBA”), JAVA™ (including J2ME, Java Beans, etc.), Binary Runtime Environment (e.g., “BREW”), and the like.
[0041] As used herein, connection, link, and/or wireless link may include a causal link between any two or more entities (whether physical or logical/virtual), which enables information exchange between the entities.
[0042] As used herein, computer and/or computing device may include, but are not limited to, personal computers (“PCs”) and minicomputers, whether desktop, laptop, or otherwise, mainframe computers, workstations, servers, personal digital assistants (“PDAs”), handheld computers, embedded computers, programmable logic devices, personal communicators, tablet computers, mobile devices, portable navigation aids, J2ME equipped devices, cellular telephones, smart phones, personal integrated communication or entertainment devices, and/or any other device capable of executing a set of instructions and processing an incoming data signal.
[0043] Detailed descriptions of the various embodiments of the system and methods of the disclosure are now provided. While many examples discussed herein may refer to specific exemplary embodiments, it will be appreciated that the described systems and methods contained herein are applicable to any kind of robot. Myriad other embodiments or uses for the technology described herein would be readily envisaged by those having ordinary skill in the art, given the contents of the present disclosure.
[0044] Advantageously, the systems and methods of this disclosure at least: (i) accelerate depth and environmental perception of robotic devices that utilize sensors configured to produce point clouds; (ii) enable robots to accelerate the production of occupancy maps when objects are close to the robots, which may be when robots need up-to-date map information the most; (iii) enhance accuracy of occupancy maps of robots by reducing noise; and (iv) take advantage of common hardware elements of robots to accelerate their production of occupancy maps. Other advantages are readily discernible by one having ordinary skill in the art given the contents of the present disclosure.
[0045] FIG. 1A is a functional block diagram of a robot 102 in accordance with some principles of this disclosure. As illustrated in FIG. 1A, robot 102 may include controller 118, memory 120, user interface unit 112, sensor units 114, navigation units 106, actuator unit 108, and communications unit 116, as well as other components and subcomponents (e.g., some of which may not be illustrated). Although a specific embodiment is illustrated in FIG. 1A, it is appreciated that the architecture may be varied in certain embodiments as would be readily apparent to one of ordinary skill given the contents of the present disclosure. As used herein, robot 102 may be representative at least in part of any robot described in this disclosure.
[0046] Controller 118 may control the various operations performed by robot 102. Controller 118 may include and/or comprise one or more processors (e.g., microprocessors) and other peripherals. As previously mentioned and used herein, processor, microprocessor, and/or digital processor may include any type of digital processor such as, without limitation, digital signal processors (“DSPs”), reduced instruction set computers (“RISC”), complex instruction set computers (“CISC”), microprocessors, gate arrays (e.g., field programmable gate arrays (“FPGAs”)), programmable logic device (“PLDs”), reconfigurable computer fabrics (“RCFs”), array processors, secure microprocessors and application-specific integrated circuits (“ASICs”). Peripherals may include hardware accelerators configured to perform a specific function using hardware elements such as, without limitation, encryption/decryption hardware, algebraic processors (e.g., tensor processing units, quadratic problem solvers, multipliers, etc.), data compressors, encoders, arithmetic logic units (“ALU”), and the like. Such digital processors may be contained on a single unitary integrated circuit die, or distributed across multiple components.
[0047] Controller 118 may be operatively and/or communicatively coupled to memory 120. Memory 120 may include any type of integrated circuit or other storage device configured to store digital data including, without limitation, read-only memory (“ROM”), random access memory (“RAM”), non-volatile random access memory (“NVRAM”), programmable read-only memory (“PROM”), electrically erasable programmable read-only memory (“EEPROM”), dynamic randomaccess memory (“DRAM”), Mobile DRAM, synchronous DRAM (“SDRAM”), double data rate SDRAM (“DDR/2 SDRAM”), extended data output (“EDO”) RAM, fast page mode RAM (“FPM”), reduced latency DRAM (“RLDRAM”), static RAM (“SRAM”), flash memory (e.g., NAND/NOR), memristor memory, pseudostatic RAM (“PSRAM”), etc. Memory 120 may provide instructions and data to controller 118. For example, memory 120 may be a non-transitory, computer-readable storage apparatus and/or medium having a plurality of instructions stored thereon, the instructions being executable by a processing apparatus (e.g., controller 118) to operate robot 102. In some cases, the instructions may be configured to, when executed by the processing apparatus, cause the processing apparatus to perform the various methods, features, and/or functionality described in this disclosure. Accordingly, controller 118 may perform logical and/or arithmetic operations based on program instructions stored within memory 120. In some cases, the instructions and/or data of memory 120 may be stored in a combination of hardware, some located locally within robot 102, and some located remote from robot 102 (e.g., in a cloud, server, network, etc.).
[0048] It should be readily apparent to one of ordinary skill in the art that a processor may be internal to or on board robot 102 and/or may be external to robot 102 and be communicatively coupled to controller 118 of robot 102 utilizing communication units 116 wherein the external processor may receive data from robot 102, process the data, and transmit computer-readable instructions back to controller 118. In at least one non-limiting exemplary embodiment, the processor may be on a remote server (not shown).
[0049] In some exemplary embodiments, memory 120, shown in FIG. 1A, may store a library of sensor data. In some cases, the sensor data may be associated at least in part with objects and/or people. In exemplary embodiments, this library may include sensor data related to objects and/or people in different conditions, such as sensor data related to objects and/or people with different compositions (e.g., materials, reflective properties, molecular makeup, etc.), different lighting conditions, angles, sizes, distances, clarity (e.g., blurred, obstructed/occluded, partially off frame, etc.), colors, surroundings, and/or other conditions. The sensor data in the library may be taken by a sensor (e.g., a sensor of sensor units 114 or any other sensor) and/or generated automatically, such as with a computer program that is configured to generate/simulate (e.g., in a virtual world) library sensor data (e.g., which may generate/simulate these library data entirely digitally and/or beginning from actual sensor data) from different lighting conditions, angles, sizes, distances, clarity (e.g., blurred, obstructed/occluded, partially off frame, etc.), colors, surroundings, and/or other conditions. The number of images in the library may depend at least in part on one or more of the amount of available data, the variability of the surrounding environment in which robot 102 operates, the complexity of objects and/or people, the variability in appearance of objects, physical properties of robots, the characteristics of the sensors, and/or the amount of available storage space (e.g., in the library, memory 120, and/or local or remote storage). In exemplary embodiments, at least a portion of the library may be stored on a network (e.g., cloud, server, distributed network, etc.) and/or may not be stored completely within memory 120. As yet another exemplary embodiment, various robots (e.g., that are commonly associated, such as robots by a common manufacturer, user, network, etc.) may be networked so that data captured by individual robots are collectively shared with other robots. In such a fashion, these robots may be configured to learn and/or share sensor data in order to facilitate the ability to readily detect and/or identify errors and/or assist events. [0050] Still referring to FIG. 1A, operative units 104 may be coupled to controller 118, or any other controller, to perform the various operations described in this disclosure. One, more, or none of the modules in operative units 104 may be included in some embodiments. Throughout this disclosure, reference may be to various controllers and/or processors. In some embodiments, a single controller (e.g., controller 118) may serve as the various controllers and/or processors described. In other embodiments different controllers and/or processors may be used, such as controllers and/or processors used particularly for one or more operative units 104. Controller 118 may send and/or receive signals, such as power signals, status signals, data signals, electrical signals, and/or any other desirable signals, including discrete and analog signals to operative units 104. Controller 118 may coordinate and/or manage operative units 104, and/or set timings (e.g., synchronously or asynchronously), turn off/on control power budgets, receive/send network instructions and/or updates, update firmware, send interrogatory signals, receive and/or send statuses, and/or perform any operations for running features of robot 102.
[0051] Returning to FIG. 1A, operative units 104 may include various units that perform functions for robot 102. For example, operative units 104 includes at least navigation units 106, actuator units 108, user interface units 112, sensor units 114, and communication units 116. Operative units 104 may also comprise other units such as specifically configured task units (not shown) that provide the various functionality of robot 102. In exemplary embodiments, operative units 104 may be instantiated in software, hardware, or both software and hardware. For example, in some cases, units of operative units 104 may comprise computer-implemented instructions executed by a controller. In exemplary embodiments, units of operative units 104 may comprise hardcoded logic (e.g., ASICS). In exemplary embodiments, units of operative units 104 may comprise both computer-implemented instructions executed by a controller and hardcoded logic. Where operative units 104 are implemented in part in software, operative units 104 may include units/modules of code configured to provide one or more functionalities.
[0052] In exemplary embodiments, navigation units 106 may include systems and methods that may computationally construct and update a map of an environment, localize robot 102 (e.g., find its position) in a map, and navigate robot 102 to/from destinations. The mapping may be performed by imposing data obtained in part by sensor units 114 into a computer-readable map representative at least in part of the environment. In exemplary embodiments, a map of an environment may be uploaded to robot 102 through user interface units 112, uploaded wirelessly or through wired connection, or taught to robot 102 by a user.
[0053] In exemplary embodiments, navigation units 106 may include components and/or software configured to provide directional instructions for robot 102 to navigate. Navigation units 106 may process maps, routes, and localization information generated by mapping and localization units, data from sensor units 114, and/or other operative units 104.
[0054] Still referring to FIG. 1A, actuator units 108 may include actuators such as electric motors, gas motors, driven magnet systems, solenoid/ratchet systems, piezoelectric systems (e.g., inchworm motors), magnetostrictive elements, gesticulation, and/or any way of driving an actuator known in the art. By way of illustration, such actuators may actuate the wheels for robot 102 to navigate a route; navigate around obstacles; or repose cameras and sensors. According to exemplary embodiments, actuator unit 108 may include systems that allow movement of robot 102, such as motorized propulsion. For example, motorized propulsion may move robot 102 in a forward or backward direction, and/or be used at least in part in turning robot 102 (e.g., left, right, and/or any other direction). By way of illustration, actuator unit 108 may control if robot 102 is moving or is stopped and/or allow robot 102 to navigate from one location to another location.
[0055] Actuator unit 108 may also include any system used for actuating, in some cases actuating task units to perform tasks. For example, actuator unit 108 may include driven magnet systems, motors/engines (e.g., electric motors, combustion engines, steam engines, and/or any type of motor/engine known in the art), solenoid/ratchet system, piezoelectric system (e.g., an inchworm motor), magnetostrictive elements, gesticulation, and/or any actuator known in the art.
[0056] According to exemplary embodiments, sensor units 114 may comprise systems and/or methods that may detect characteristics within and/or around robot 102. Sensor units 114 may comprise a plurality and/or a combination of sensors. Sensor units 114 may include sensors that are internal to or on board robot 102 or external, and/or have components that are partially internal and/or partially external. In some cases, sensor units 114 may include one or more exteroceptive sensors, such as sonars, light detection and ranging (“LiDAR”) sensors, radars, lasers, cameras (including video cameras (e.g., red-blue-green (“RBG”) cameras, infrared cameras, three-dimensional (“3D”) cameras, thermal cameras, etc.), time of flight (“ToF”) cameras, structured light cameras, antennas, motion detectors, microphones, and/or any other sensor known in the art. According to some exemplary embodiments, sensor units 114 may collect raw measurements (e.g., currents, voltages, resistances, gate logic, etc.) and/or transformed measurements (e.g., distances, angles, detected points in obstacles, etc.). In some cases, measurements may be aggregated and/or summarized. Sensor units 114 may generate data based at least in part on distance or height measurements. Such data may be stored in data structures, such as matrices, arrays, queues, lists, stacks, bags, etc.
[0057] According to exemplary embodiments, sensor units 114 may include sensors that may measure internal characteristics of robot 102. For example, sensor units 114 may measure temperature, power levels, statuses, and/or any characteristic of robot 102. In some cases, sensor units 114 may be configured to determine the odometry of robot 102. For example, sensor units 114 may include proprioceptive sensors, which may comprise sensors such as accelerometers, inertial measurement units (“IMU”), odometers, gyroscopes, speedometers, cameras (e.g. using visual odometry), clock/timer, and the like. Odometry may facilitate autonomous navigation and/or autonomous actions of robot 102. This odometry may include robot 102’s position (e.g., where position may include robot’s location, displacement and/or orientation, and may sometimes be interchangeable with the term pose as used herein) relative to the initial location. Such data may be stored in data structures, such as matrices, arrays, queues, lists, stacks, bags, etc. According to exemplary embodiments, the data structure of the sensor data may be called an image.
[0058] According to exemplary embodiments, sensor units 114 may be in part external to robot 102 and coupled to communications units 116. For example, a security camera within an environment of a robot 102 may provide a controller 118 of the robot 102 with a video feed via wired or wireless communication channel(s). In some instances, sensor units 114 may include sensors configured to detect the presence of an object at a location such as, for example without limitation, a pressure or motion sensor may be disposed at a shopping cart storage location of a grocery store, wherein the controller 118 of the robot 102 may utilize data from the pressure or motion sensor to determine if the robot 102 should retrieve more shopping carts for customers.
[0059] According to exemplary embodiments, user interface units 112 may be configured to enable a user to interact with robot 102. For example, user interface units 112 may include touch panels, buttons, keypads/keyboards, ports (e.g., universal serial bus (“USB”), digital visual interface (“DVI”), Display Port, E-Sata, Firewire, PS/2, Serial, VGA, SCSI, audioport, high-definition multimedia interface (“HDMI”), personal computer memory card international association (“PCMCIA”) ports, memory card ports (e.g., secure digital (“SD”) and miniSD), and/or ports for computer-readable medium), mice, rollerballs, consoles, vibrators, audio transducers, and/or any interface for a user to input and/or receive data and/or commands, whether coupled wirelessly or through wires. Users may interact through voice commands or gestures. User interface units 112 may include a display, such as, without limitation, liquid crystal displays (“LCDs”), light-emitting diode (“LED”) displays, LED LCD displays, in-plane-switching (“IPS”) displays, cathode ray tubes, plasma displays, high definition (“HD”) panels, 4K displays, retina displays, organic LED displays, touchscreens, surfaces, canvases, and/or any displays, televisions, monitors, panels, and/or devices known in the art for visual presentation. According to exemplary embodiments user interface units 112 may be positioned on the body of robot 102. According to exemplary embodiments, user interface units 112 may be positioned away from the body of robot 102 but may be communicatively coupled to robot 102 (e.g., via communication units including transmitters, receivers, and/or transceivers) directly or indirectly (e.g., through a network, server, and/or a cloud). According to exemplary embodiments, user interface units 112 may include one or more projections of images on a surface (e.g., the floor) proximally located to the robot, e.g., to provide information to the occupant or to people around the robot. The information could be the direction of future movement of the robot, such as an indication of moving forward, left, right, back, at an angle, and/or any other direction. In some cases, such information may utilize arrows, colors, symbols, etc.
[0060] According to exemplary embodiments, communications unit 116 may include one or more receivers, transmitters, and/or transceivers. Communications unit 116 may be configured to send/receive a transmission protocol, such as BLUETOOTH®, ZIGBEE®, Wi-Fi, induction wireless data transmission, radio frequencies, radio transmission, radio-frequency identification (“RFID”), near- field communication (“NFC”), infrared, network interfaces, cellular technologies such as 3G (3GPP/3GPP2), high-speed downlink packet access (“HSDPA”), high-speed uplink packet access (“HSUPA”), time division multiple access (“TDMA”), code division multiple access (“CDMA”) (e.g., IS-95A, wideband code division multiple access (“WCDMA”), etc.), frequency hopping spread spectrum (“FHSS”), direct sequence spread spectrum (“DSSS”), global system for mobile communication (“GSM”), Personal Area Network (“PAN”) (e.g., PAN/802.15), worldwide interoperability for microwave access (“WiMAX”), 802.20, long term evolution (“LTE”) (e.g., LTE/LTE-A), time division LTE (“TD-LTE”), global system for mobile communication (“GSM”), narrowband/frequency-division multiple access (“FDMA”), orthogonal frequency-division multiplexing (“OFDM”), analog cellular, cellular digital packet data (“CDPD”), satellite systems, millimeter wave or microwave systems, acoustic, infrared (e.g., infrared data association (“IrDA”)), and/or any other form of wireless data transmission.
[0061] Communications unit 116 may also be configured to send/receive signals utilizing a transmission protocol over wired connections, such as any cable that has a signal line and ground. For example, such cables may include Ethernet cables, coaxial cables, Universal Serial Bus (“USB”), FireWire, and/or any connection known in the art. Such protocols may be used by communications unit 116 to communicate to external systems, such as computers, smart phones, tablets, data capture systems, mobile telecommunications networks, clouds, servers, or the like. Communications unit 116 may be configured to send and receive signals comprising numbers, letters, alphanumeric characters, and/or symbols. In some cases, signals may be encrypted, using algorithms such as 128-bit or 256-bit keys and/or other encryption algorithms complying with standards such as the Advanced Encryption Standard (“AES”), RSA, Data Encryption Standard (“DES”), Triple DES, and the like. Communications unit 116 may be configured to send and receive statuses, commands, and other data/information. For example, communications unit 116 may communicate with a user operator to allow the user to control robot 102. Communications unit 116 may communicate with a server/network (e.g., a network) in order to allow robot 102 to send data, statuses, commands, and other communications to the server. The server may also be communicatively coupled to computer(s) and/or device(s) that may be used to monitor and/or control robot 102 remotely. Communications unit 116 may also receive updates (e.g., firmware or data updates), data, statuses, commands, and other communications from a server for robot 102.
[0062] In exemplary embodiments, operating system 110 may be configured to manage memory 120, controller 118, power supply 122, modules in operative units 104, and/or any software, hardware, and/or features of robot 102. For example, and without limitation, operating system 110 may include device drivers to manage hardware resources for robot 102.
[0063] In exemplary embodiments, power supply 122 may include one or more batteries, including, without limitation, lithium, lithium ion, nickel-cadmium, nickel-metal hydride, nickel-hydrogen, carbon-zinc, silver-oxide, zinc-carbon, zinc-air, mercury oxide, alkaline, or any other type of battery known in the art. Certain batteries may be rechargeable, such as wirelessly (e.g., by resonant circuit and/or a resonant tank circuit) and/or plugging into an external power source. Power supply 122 may also be any supplier of energy, including wall sockets and electronic devices that convert solar, wind, water, nuclear, hydrogen, gasoline, natural gas, fossil fuels, mechanical energy, steam, and/or any power source into electricity.
[0064] One or more of the units described with respect to FIG. 1A (including memory 120, controller 118, sensor units 114, user interface unit 112, actuator unit 108, communications unit 116, mapping and localization unit 126, and/or other units) may be integrated onto robot 102, such as in an integrated system. However, according to some exemplary embodiments, one or more of these units may be part of an attachable module. This module may be attached to an existing apparatus to automate so that it behaves as a robot. Accordingly, the features described in this disclosure with reference to robot 102 may be instantiated in a module that may be attached to an existing apparatus and/or integrated onto robot 102 in an integrated system. Moreover, in some cases, a person having ordinary skill in the art would appreciate from the contents of this disclosure that at least a portion of the features described in this disclosure may also be run remotely, such as in a cloud, network, and/or server.
[0065] As used herein, a robot 102, a controller 118, or any other controller, processor, or robot performing a task, operation or transformation illustrated in the figures below comprises a controller executing computer-readable instructions stored on a non-transitory computer-readable storage apparatus, such as memory 120, as would be appreciated by one skilled in the art.
[0066] Next referring to FIG. IB, the architecture of a processor or processing device 138 is illustrated according to an exemplary embodiment. As illustrated in FIG. IB, the processor 138 includes a data bus 128, a receiver 126, a transmitter 134, at least one processor 130, and a memory 132. The receiver 126, the processor 130 and the transmitter 134 all communicate with each other via the data bus 128. The processor 130 is configured to access the memory 132 which stores computer code or computer-readable instructions in order for the processor 130 to execute the specialized algorithms. As illustrated in FIG. IB, memory 132 may comprise some, none, different, or all of the features of memory 120 previously illustrated in FIG. 1A. The algorithms executed by the processor 130 are discussed in further detail below. The receiver 126 as shown in FIG. IB is configured to receive input signals 124. The input signals 124 may comprise signals from a plurality of operative units 104 illustrated in FIG. 1 A including, but not limited to, sensor data from sensor units 114, user inputs, motor feedback, external communication signals (e.g., from a remote server), and/or any other signal from an operative unit 104 requiring further processing. The receiver 126 communicates these received signals to the processor 130 via the data bus 128. As one skilled in the art would appreciate, the data bus 128 is the means of communication between the different components — receiver, processor, and transmitter — in the processor. The processor 130 executes the algorithms, as discussed below, by accessing specialized computer-readable instructions from the memory 132. Further detailed description as to the processor 130 executing the specialized algorithms in receiving, processing and transmitting of these signals is discussed above with respect to FIG. 1 A. The memory 132 is a storage medium for storing computer code or instructions. The storage medium may include optical memory (e g., CD, DVD, HD-DVD, Blu-Ray Disc, etc ), semiconductor memory (e g., RAM, EPROM, EEPROM, etc.), and/or magnetic memory (e.g., hard-disk drive, floppy-disk drive, tape drive, MRAM, etc.), among others. Storage medium may include volatile, nonvolatile, dynamic, static, read/write, read-only, random-access, sequential-access, location-addressable, file-addressable, and/or content- addressable devices. The processor 130 may communicate output signals to transmitter 134 via data bus 128 as illustrated. The transmitter 134 may be configured to further communicate the output signals to a plurality of operative units 104 illustrated by signal output 136.
[0067] One of ordinary skill in the art would appreciate that the architecture illustrated in FIG. IB may also illustrate an external server architecture configured to effectuate the control of a robotic apparatus from a remote location. That is, the server may also include a data bus, a receiver, a transmitter, a processor, and a memory that stores specialized computer-readable instructions thereon. [0068] One of ordinary skill in the art would appreciate that a controller 118 of a robot 102 may include one or more processors 138 and may further include other peripheral devices used for processing information, such as ASICS, DPS, proportional-integral-derivative (“PID”) controllers, hardware accelerators (e.g., encryption/decryption hardware), and/or other peripherals (e.g., analog to digital converters) described above in FIG. 1A. The other peripheral devices when instantiated in hardware are commonly used within the art to accelerate specific tasks (e.g., multiplication, encryption, etc.) which may alternatively be performed using the system architecture of FIG. IB. In some instances, peripheral devices are used as a means for intercommunication between the controller 118 and operative units 104 (e.g., digital to analog converters and/or amplifiers for producing actuator signals). Accordingly, as used herein, the controller 118 executing computer-readable instructions to perform a function may include one or more processors 138 thereof executing computer-readable instructions and, in some instances, the use of any hardware peripherals known within the art. Controller 118 may be illustrative of various processors 138 and peripherals integrated into a single circuit die or distributed to various locations of the robot 102 which receive, process, and output information to/from operative units 104 of the robot 102 to effectuate control of the robot 102 in accordance with instructions stored in a memory 120, 132. For example, controller 118 may include a plurality of processors 138 for performing high-level tasks (e.g., planning a route to avoid obstacles) and processors 138 for performing low-level tasks (e.g., producing actuator signals in accordance with the route).
[0069] FIG. 2A(i-ii) illustrates a light detection and ranging (“LiDAR”) sensor 202 coupled to a robot 102, which collects distance measurements to an object, such as wall 206, along a measurement plane in accordance with some exemplary embodiments of the present disclosure. LiDAR sensor 202, illustrated in FIG. 2A(i), may be configured to collect distance measurements to the wall 206 by projecting a plurality of beams 208, representing the path traveled by an electromagnetic pulse of energy at discrete angles along a measurement plane, to determine the distance to the wall 206 based on a time of flight (“ToF”) of the beams 208 leaving the LiDAR sensor 202, reflecting off the wall 206, and returning back to the LiDAR sensor 202. The measurement plane of the LiDAR 202 comprises a plane along which the beams 208 are emitted which, for this exemplary embodiment illustrated, is the plane of the page. In some embodiments, LiDAR sensor 202 may emit beams 208 across a two- dimensional field of view instead of a one -dimensional planar field of view, wherein the additional dimension may be orthogonal to the plane of the page. Such sensors may be referred to typically as 3- dimensional (scanning) LiDAR or ToF depth cameras.
[0070] Individual beams 208 of photons may localize a respective point 204 of the wall 206 in a point cloud, the point cloud comprising a plurality of points 204 localized in 2D or 3D space as illustrated in FIG. 2(ii). The location of the points 204 may be defined about a local origin 210 of the sensor 202 and is based on the ToF of the respective beam 208 and the angle at which the beam 208 was emitted from the sensor 202. Distance 212 to a point 204 may comprise half the time of flight of a photon of a respective beam 208 used to measure the point 204 multiplied by the speed of light, wherein coordinate values (x, y) of each point 204 depend both on distance 212 and an angle at which the respective beam 208 was emitted from the sensor 202. The local origin 210 may comprise a predefined point of the sensor 202 to which all distance measurements are referenced (e.g., location of a detector within the sensor 202, focal point of a lens of sensor 202, etc.). For example, a 5-meter distance measurement to an object corresponds to 5 meters from the local origin 210 to the object.
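To make the relationship between time of flight, distance 212, and the (x, y) coordinates of a point 204 concrete, a brief Python sketch follows; the function name and the specific numbers in the usage example are illustrative assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def beam_to_point(time_of_flight_s, angle_rad):
    """Convert one beam 208 measurement into a point 204 in the sensor frame.

    The beam travels to the object and back, so the one-way distance 212 is
    half the time of flight multiplied by the speed of light; the (x, y)
    coordinates then follow from that distance and the angle at which the
    beam was emitted about the sensor's local origin 210.
    """
    distance = 0.5 * time_of_flight_s * SPEED_OF_LIGHT
    return distance * math.cos(angle_rad), distance * math.sin(angle_rad)

# Usage: a beam emitted at 30 degrees returning after roughly 33.4 nanoseconds
# corresponds to a point about 5 m from the origin 210.
x, y = beam_to_point(time_of_flight_s=2 * 5.0 / SPEED_OF_LIGHT,
                     angle_rad=math.radians(30.0))
print(round(x, 2), round(y, 2))   # approximately 4.33 2.5
```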
[0071 ] According to at least one non-limiting exemplary embodiment, a laser emitting element of the LiDAR sensor 202 which emits the beams 208 may include a spinning laser, wherein the individual beams 208 illustrated in FIG. 2A(i-ii) may correspond to discrete measurements of the ToF of the laser.
[0072] According to at least one non-limiting exemplary embodiment, sensor 202 may be illustrative of a depth camera or other ToF sensor configured to measure distance, wherein the sensor 202 being a planar LiDAR sensor is not intended to be limiting. Depth cameras may operate similarly to planar LiDAR sensors (i.e., measure distance based on a ToF of beams 208); however, depth cameras may emit beams 208 using a single pulse or flash of electromagnetic energy, rather than sweeping a laser beam across a field of view. Depth cameras may additionally comprise a two-dimensional field of view.
[0073] According to at least one non-limiting exemplary embodiment, sensor 202 may be illustrative of a structured light LiDAR sensor configured to sense distance and shape of an object by projecting a structured pattern onto the object and observing deformations of the pattern. For example, the size of the projected pattern may represent distance to the object and distortions in the pattern may provide information of the shape of the surface of the object. Structured light sensors may emit beams 208 along a plane as illustrated or in a predetermined pattern (e.g., a circle or series of separated parallel lines).
[0074] FIG. 2B illustrates a robot 102 comprising an origin 216 defined based on a transformation 214 from a world origin 220, according to an exemplary embodiment. World origin 220 may comprise a fixed or stationary point in an environment of the robot 102 which defines a (0,0,0) point within the environment. Origin 216 of the robot 102 may define a location of the robot 102 within its environment. For example, if the robot 102 is at a location (x = 5 m, y = 5 m, z = 0 m), then origin 216 is at a location (5, 5, 0) with respect to the world origin 220. The origin 216 may be positioned anywhere inside or outside the robot 102 body such as, for example, between two wheels of the robot at z = 0 (i.e., on the floor). The transform 214 may represent a matrix of values, which configures a change in coordinates from being centered about the world origin 220 to the origin 216 of the robot 102. The value(s) of transform 214 may be based on a current position of the robot 102 and may change over time as the robot 102 moves, wherein the current position may be determined via navigation units 106 and/or using data from sensor units 114 of the robot 102. [0075] The robot 102 may include one or more exteroceptive sensors 202 of sensor units 114, wherein each sensor 202 includes an origin 210. The positions of the sensor 202 may be fixed onto the robot 102 such that its origin 210 does not move with respect to the robot origin 216 as the robot 102 moves. Measurements from the sensor 202 may include, for example, distance measurements, wherein the distances measured correspond to a distance from the origin 210 of the sensor 202 to one or more objects. Transform 218 may define a coordinate shift from being centered about an origin 210 of the sensor 202 to the origin 216 of the robot 102, or vice versa. Transform 218 may be a fixed value, provided the sensor 202 does not change its position. In some embodiments, sensor 202 may be coupled to one or more actuator units 108 configured to change the position of the sensor 202 on the robot 102 body, wherein the transform 218 may further depend on the current pose of the sensor 202.
[0076] Controller 118 of the robot 102 may always localize the robot origin 216 with respect to the world origin 220 during navigation, using transform 214 based on the robot 102 motions and position in the environment, and thereby localize the sensor origin 210 with respect to the robot origin 216, using a fixed transform 218. The transform 218 is treated as a fixed value only because the sensor 202 is presumed not to move on its own, wherein various contemporary calibration methods may be used to update the transform 218 if the sensor 202 changes position. In doing so, the controller 118 may convert locations of points 204 defined with respect to sensor origin 210 to locations defined about either the robot origin 216 or world origin 220. For example, transforms 214, 218 may enable the controller 118 of the robot 102 to translate a 5-m distance measured by the sensor 202 (defined as a 5-m distance between the point 204 and origin 210) into a location of the point 204 with respect to the robot origin 216 (e.g., distance of the point 204 to the robot 102) or world origin 220 (e.g., location of the point 204 in the environment).
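By way of a non-limiting illustrative sketch, the chaining of transforms 218 and 214 may be expressed in Python using 4x4 homogeneous matrices, wherein the variable names (e.g., T_robot_sensor, T_world_robot), the sensor mounting offset, and the example values are hypothetical and provided for clarity only:

    import numpy as np

    # Hypothetical stand-ins: T_robot_sensor plays the role of transform 218 (sensor
    # origin 210 -> robot origin 216) and T_world_robot plays the role of transform 214
    # (robot origin 216 -> world origin 220).
    T_robot_sensor = np.eye(4)
    T_robot_sensor[:3, 3] = [0.30, 0.00, 0.50]   # sensor mounted 0.3 m forward, 0.5 m up

    T_world_robot = np.eye(4)
    T_world_robot[:3, 3] = [5.0, 5.0, 0.0]       # robot located at (5, 5, 0) in the world

    # A 5-m range return expressed in the sensor frame (homogeneous coordinates).
    point_in_sensor_frame = np.array([5.0, 0.0, 0.0, 1.0])
    point_in_robot_frame = T_robot_sensor @ point_in_sensor_frame
    point_in_world_frame = T_world_robot @ point_in_robot_frame
    print(point_in_world_frame[:3])   # location of the point 204 relative to world origin 220

Rotations of the sensor or of the robot, if any, would populate the upper-left 3x3 blocks of the matrices above.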
[0077] It may be appreciated that the position of the sensor 202 on the robot 102 is not intended to be limiting. Rather, sensor 202 may be positioned anywhere on the robot 102 and transform 218 may denote a coordinate transformation from being centered about the robot origin 216 to the sensor origin 210 wherever the sensor origin 210 may be. Further, robot 102 may include two or more sensors 202 in some embodiments, wherein there may be two or more respective transforms 218 which denote the locations of the origins 210 of the two or more sensors 202. Similarly, the relative position of the robot 102 and world origin 220 as illustrated is not intended to be limiting.
[0078] FIG. 3 illustrates an image plane 302 of a sensor 202 in accordance with some exemplary embodiments of this disclosure. Image plane 302 may comprise a size (i.e., width and height) corresponding to a field of view of a sensor 202. Image plane 302 may comprise a plane upon which a visual scene is projected to produce, for example, images (e.g., RGB images, depth images, etc.). The image plane 302 is analogous to the plane formed by a printed photograph on which a visual scene is depicted. The image plane 302 subtends a solid angle about the origin 210 corresponding to a field of view of the sensor 202, the field of view being illustrated by dashed lines, which denote the edges of the field of view.
[0079] Image plane 302 may include a plurality of pixels 304. Each pixel 304 may include or be encoded with distance information and, in some instances, color information. If the depth camera 202 is configured to produce colorized depth imagery, each pixel 304 of the plane 302 may include a color value equal to the color of the visual scene as perceived by a point observer at a location of a sensor origin 210 (e.g., using data from color-sensitive sensors such as CCDs and optical filters). The distance information may correspond to a time of flight of a beam 208 emitted from the origin 210, traveling through a pixel 304 (the intersection of the beams 208 with pixels 304 of the image plane 302 being shown with dots in the center of the pixels 304), and reaching an object (not shown), wherein the points 204 may be localized on the surface of the object. The distance measurement may be based on a ToF of a beam 208 emitted at the origin 210, passing through a pixel 304, and reflecting off an object in the visual scene back to the depth camera 202. The distance and color information for each pixel 304 may be stored as a matrix in memory 120 or as an array (e.g., by concatenating rows/columns of distance and color information for each pixel 304) for further processing, as shown in FIG. 5A-B below.
[0080] For planar LiDAR sensors configured to measure distance along a measurement plane, the image plane 302 may instead comprise a one-dimensional (i.e., linear) row of pixels 304. The number of pixels 304 along the row may correspond to the angular resolution of the planar LiDAR sensor.
[0081] By way of an analogous visual illustration, if the image plane 302 is an opaque surface and one pixel 304 is removed or made transparent to allow for viewing of a visual scene behind the opaque surface through the "removed/transparent" pixel 304, the color value of the pixel 304 may be the color seen by an observer at the origin 210 looking through the "removed/transparent" pixel 304. Similarly, the depth value may correspond to the distance from the origin 210 to an object as traveled by a beam 208 through the "removed" pixel 304. It is appreciated, following the analogy, that depth cameras 202 may "look" through each pixel contemporaneously by emitting flashes or pulses of beams 208 through each pixel 304.
[0082] The number of pixels 304 may correspond to the resolution of the depth camera 202. In the illustrated embodiment, simplified for clarity, the resolution is only 8x8 pixels; however, one skilled in the art may appreciate that depth cameras may include higher resolutions such as, for example, 480x480 pixels, 1080x1080 pixels, or larger/smaller. Further, the resolution of the depth camera 202 is not required to include the same number of pixels along the horizontal (i.e., y) axis as the vertical (i.e., z) axis.
[0083] Depth imagery may be produced by the sensor emitting a beam 208 through each pixel 304 of the image plane 302 to record a distance measurement associated with each pixel 304, the depth image being represented based on a projection of the visual scene onto the image plane 302 as perceived by an observer at origin 210. Depth imagery may further include color values for each pixel 304 if the sensor 202 is configured to detect color or greyscale representations of color, the color value of a pixel 304 being the color as perceived by a point observer at the origin 210 viewing a visual scene through each pixel 304. The size (in steradians) of the pixels 304 may correspond to a resolution of the resulting depth image and/or sensor. The angular separation θ between two horizontally adjacent beams 208 may be the horizontal angular resolution of the depth image, wherein the vertical angular resolution may be of the same or different value.
[0084] Depth imagery may be utilized to produce a point cloud, or a plurality of localized points in 3-dimensional (“3D”) space, each point comprising no volume and a defined (x, y, z) position. Each point typically comprises non-integer (i.e., non-discrete) values for (x, y, z), such as floating-point values. It may be desirable for a robot 102 to identify objects and localize them accurately within its environment to avoid collisions and/or perform tasks. Robotic devices may utilize one or more computer-readable maps to navigate and perceive their environments, wherein the use of raw point cloud data may be computationally taxing and may be inaccurate because the points do not define volumes or surfaces of objects. Accordingly, point clouds are discretized into voxel space to enable the controller 118 to more readily utilize the point cloud data to perceive its surrounding environment.
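As a non-limiting illustrative sketch of how a depth image may be converted into such a point cloud, the Python snippet below assumes a simple angular beam model in which each pixel 304 corresponds to a fixed azimuth/elevation through which a beam 208 travels; the function name depth_image_to_points and the field-of-view values are hypothetical, and real depth cameras may instead use a pinhole intrinsic model:

    import numpy as np

    def depth_image_to_points(depth, h_fov_rad, v_fov_rad):
        # depth: an HxW array of ranges (meters) measured along the beam 208 passing
        # through each pixel 304. Returns an (H*W, 3) array of points 204 in the
        # sensor frame centered on origin 210.
        rows, cols = depth.shape
        az = np.linspace(-h_fov_rad / 2.0, h_fov_rad / 2.0, cols)   # horizontal angles
        el = np.linspace(v_fov_rad / 2.0, -v_fov_rad / 2.0, rows)   # vertical angles
        el_grid, az_grid = np.meshgrid(el, az, indexing="ij")
        x = depth * np.cos(el_grid) * np.cos(az_grid)
        y = depth * np.cos(el_grid) * np.sin(az_grid)
        z = depth * np.sin(el_grid)
        return np.stack([x, y, z], axis=-1).reshape(-1, 3)

    # Example: an 8x8 depth image in which every beam returns a 5-m range.
    points = depth_image_to_points(np.full((8, 8), 5.0), np.radians(60), np.radians(60))

Each resulting (x, y, z) point is a non-discretized, floating-point location, which may then be assigned to a voxel as described below.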
[0085] Voxels, as used herein, may comprise 3D pixels, or non-overlapping rectangular/cubic prisms of space with a defined width, height, and length, wherein the width, height, and length may be of the same or different values. In some embodiments, voxels may be other non-overlapping 3-dimensional shapes such as, for example, voxels defined using spherical or cylindrical coordinates. To simplify the below description and illustrations, each voxel as shown and described hereinafter may be defined by Cartesian coordinates and comprise cubes whose width, height, and length dimensions are of equal value unless specifically stated otherwise. As will be further discussed below, it may be advantageous for a robot 102 to localize empty space surrounding the robot 102 to plan its motions accordingly. Empty space may include any region or voxel in which no points 204 are localized. A method of ray marching, discussed below in FIG. 4, includes identifying a path followed by each beam 208 to a point 204, wherein any voxel that the beam 208 passes through may be presumed to comprise empty space (e.g., as shown by ray 402 in FIG. 4). Accordingly, it is advantageous to identify beams 208 and/or localized points 204 that fall within or occupy a same voxel to reduce a number of ray marching operations performed.
[0086] To determine if two adjacent beams 208-1, 208-2 localize points 204-1, 204-2 within a same voxel, the point cloud may be discretized into voxel space by assigning each point 204 to a voxel. Adjacent beams 208, as used herein, refer to two beams (e.g., 208-1, 208-2) which pass through pixels 304 of the image plane 302 which are directly neighboring each other, or are contiguous, either vertically or horizontally (i.e., share a common edge or side, also referred to as "directly adjacent"). "Diagonally adjacent" pixels or voxels as used herein share a common vertex. Each point 204 may comprise an (x, y, z) location defined using real valued numbers (e.g., floating point values), wherein the point 204 may be associated with a voxel in discretized 3D space. The voxel assigned to each point may follow Equations 1-3 below:
Vx = floor(x / lx)    (Eqn. 1)
Vy = floor(y / ly)    (Eqn. 2)
Vz = floor(z / lz)    (Eqn. 3)
[0087] In Equations 1-3, parameters (x, y, z) represent real valued (x, y, z) coordinate values of a point 204, which comprise a unit of distance (e.g., centimeters, meters, inches, etc.), wherein the parameters (x, y, z) are defined about the origin 216 of the robot 102, origin 210 of the sensor 202, or origin 220 of the environment of the robot 102, wherein transforms 214, 218 enable the controller 118 to move between respective coordinate systems. Parameters lx, ly, and lz represent the spatial resolution of a voxel along the x, y, and z axis respectively and comprise a unit of distance. Parameters Vx, Vy, and Vz are integer values corresponding to the x, y, z indices of the voxel assigned to the point 204 and comprise no units. The "floor" function outputs the greatest integer that does not exceed its argument, i.e., it rounds down to an integer. It is appreciated that Equations 1-3 above are purely illustrative of the process of assigning points of a point cloud to discrete voxels in a discretized 3D space, wherein other methods for assigning points 204 to a voxel in discretized 3D space are also considered without limitation.
[0088] For example, if a voxel space comprises voxels of 1x1x1 cm spatial resolution and a point 204 comprises a location of (3.532, 1.232, 2.960) cm, then the point 204 may be assigned to voxel (3, 1, 2). If the resolution were 0.5x0.5x0.5 cm, the assigned voxel may be (7, 2, 5), and so forth. Depth information, and in some instances color information, corresponding to pixels 304 of an image plane 302 may be stored as an array of values in memory 120. For example, the bottom right pixel 304 may comprise pixel (0, 0) and may be the first entry in the array, whereas the top left pixel 304 may comprise pixel (M-1, N-1) to be stored as the last entry in the array, M and N representing the length and height, respectively, of the image plane 302 in units of pixels. In some instances, the array may comprise a two-dimensional array (i.e., a matrix), wherein each row/column of pixels 304 is stored as a row/column entry of the two-dimensional array. Pixels horizontally or vertically adjacent to each other may be stored as adjacent entries in the array. One skilled in the art may envision a plurality of methods for configuring the array in memory 120 based on the width or size of the data able to be stored within an address of the memory 120, wherein the specific configuration is not intended to be limiting. The array is illustrated as array 500, 512 depicted and discussed further in FIG. 5A-B below.
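A non-limiting sketch of Equations 1-3 in Python is provided below, reproducing the numeric example above; the function name point_to_voxel is hypothetical and provided for clarity only:

    import math

    def point_to_voxel(x, y, z, lx, ly, lz):
        # Equations 1-3: divide each coordinate by the voxel resolution along that axis
        # and round down (floor) to obtain the integer voxel indices (Vx, Vy, Vz).
        return (math.floor(x / lx), math.floor(y / ly), math.floor(z / lz))

    print(point_to_voxel(3.532, 1.232, 2.960, 1.0, 1.0, 1.0))   # -> (3, 1, 2)
    print(point_to_voxel(3.532, 1.232, 2.960, 0.5, 0.5, 0.5))   # -> (7, 2, 5)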
[0089] Each point 204 corresponding to a beam 208 which passes through a pixel 304 in the image plane may be localized within a voxel in 3D space as shown by Equations 1-3 above. Notably, beams 208 that pass through adjacent pixels may localize points 204 in the same voxel, in adjacent voxels, or in non-adjacent voxels. Accordingly, the array that stores the distance measurements for each pixel 304 of the image plane 302 may further include a voxel number (e.g., voxel #1500) or location of the voxel in which the point 204 falls within (e.g., voxel at (Vx, Vy, Vz), Vx, Vy, and Vz being integer numbers).
[0090] To determine voxels, or 3D representations of discretized space, which comprise no objects nearby a robot 102, a method of ray marching may be utilized as illustrated next in FIG. 4. FIG. 4 illustrates voxel space in two dimensions, the third dimension omitted for clarity, according to an exemplary embodiment. The voxel space corresponds to a 3D discretization of space, which surrounds a robot 102. The robot 102 may include a sensor 202, comprising a sensor origin 210 located at the illustrated position during acquisition of a scan or distance measurement (e.g., capture of a depth image), wherein the sensor 202 may localize a point 204 within the 3D space based on transforms 214, 218 shown in FIG. 2B above. A controller 118 of the robot 102 may localize itself within the environment via transform 214 shown in FIG. 2B and thereby localize the origin 210 of the sensor 202 using a predetermined transform 218. While the sensor origin 210 is at the illustrated position, the sensor 202 may localize a point 204, of many (omitted for clarity), within the illustrated voxel 408 (hatched).
[0091 ] The ray marching process includes the controller 118 of the robot 102 calculating a ray 402, which traces a path from the sensor origin 210 to the localized point 204. In doing so, each voxel that the ray 402 passes through may be identified as an “empty space” voxel 406 (shaded). This is analogous to a beam 208 emitted from the sensor 202 not detecting any objects along the path of ray 402, which corresponds to no objects being present along the path. The ray marching may be repeated for each point 204 localized by the sensor 202 for each scan or measurement acquired by the sensor 202. One skilled in the art may appreciate that conventional ToF and LiDAR sensors 202 may localize hundreds or thousands of points 204 per scan, wherein performing the ray marching for each point 204 of each scan may utilize a substantial amount of computing resources and time. It is appreciated that a ray 402 extending to another point 204 within the voxel 408 will pass through a substantial majority of the same voxels 406 along its path, thereby redundantly identifying the same voxels 406 as empty space.
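A non-limiting sketch of the ray marching step in Python is shown below, using a simple fixed-step sampling of the ray 402 between the sensor origin 210 and a point 204; the function name ray_march and the quarter-voxel step size are hypothetical, and an exact grid-traversal algorithm could be substituted without changing the idea:

    import math

    def ray_march(origin, endpoint, resolution):
        # Returns the set of voxels a ray 402 passes through on its way from the sensor
        # origin 210 to the point 204 (the "empty space" voxels 406), together with the
        # voxel containing the point itself (the "occupied" voxel 408).
        to_voxel = lambda p: tuple(math.floor(c / resolution) for c in p)
        end_voxel = to_voxel(endpoint)
        steps = max(1, int(math.dist(origin, endpoint) / (0.25 * resolution)))
        empty = set()
        for i in range(steps + 1):
            t = i / steps
            sample = [o + t * (e - o) for o, e in zip(origin, endpoint)]
            voxel = to_voxel(sample)
            if voxel != end_voxel:
                empty.add(voxel)
        return empty, end_voxel

    free_voxels, occupied_voxel = ray_march((0.0, 0.0, 0.5), (3.2, 1.1, 0.5), 0.05)

Repeating this traversal for every point 204 of a dense scan is what makes the step costly, motivating the filtering described below.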
[0092] Typically, the ray marching step occupies a substantial portion of runtime when a controller 118 is mapping its environment from 3D point cloud representation to discrete occupancy maps. Further, as the distance between the objects and the sensor decreases, the likelihood that two or more points 204 lie within a same voxel 404 increases, wherein ray marching to two points 204 within a same voxel may result in a plurality of "empty space" voxels 406 being identified as "empty space" multiple times. Accordingly, the systems and methods of the present disclosure are, inter alia, directed towards accelerating the ray marching procedure for producing an occupancy map for robots 102 to enable the robots 102 to map their environments from 3D point clouds to occupancy maps in real time by reducing the number of redundant calculations performed during the ray marching. Because 3D point clouds are typically discretized to floating point accuracy, or another accuracy which may be substantially finer than the resolution of the 3D voxel space or 2D pixel space in which robots 102 operate, converting the 3D point clouds to occupancy maps may enable the controller 118 to more readily perceive its environment and plan its trajectory accordingly.
[0093] FIG. 5A illustrates an array 500 corresponding to an array of depth values of pixels 304 of an image plane of a sensor 202, according to an exemplary embodiment. Depth values from each pixel 304 from a sensor 202 may be represented in an array by concatenating distance measurements (and color value(s), if applicable) of each row of pixels 304 in a series. That is, each entry 502 of the array 500 may represent a point 204 localized by a beam 208 which passes through a pixel 304 of the image plane 302. Entries 502 may include the voxel location (i.e., the (Vx, Vy, Vz) location) within which the point 204 localized by a beam 208 lies, following Equations 1-3 above, and may comprise matrices of values. The entries 502 are denoted Dn, wherein n is an integer number and Dn corresponds to a point 204 measured by a beam 208 adjacent to the beams of Dn-1 and Dn+1. The voxel number or location of each of the entries 502 of the array 500 (representative of individual pixels 304 of the image plane 302) may be utilized to filter adjacent duplicates, or two points 204 localized within a same voxel, the two points 204 being localized by beams 208 which pass through adjacent pixels 304 on the image plane 302.
[0094] For illustrative purposes, an array 504 has been shown comprising a lettered notation for each entry 502 of the array 500, wherein the same letters denote adjacent entries 502 that lie within a same voxel. Array 504 is not a separate array stored in memory 120 and is provided herein for ease of describing the method. For example, distance measurements D1 through D3 may represent three beams 208 which pass through three adjacent pixels 304 of the image plane 302 and localize points 204 within a voxel "A" located at (xA, yA, zA). As a separate example, in some instances measurements D2 and Dx may each localize points 204 within a same voxel, but the two entries are not adjacent/neighboring each other in the arrays 500/504 (i.e., do not correspond to adjacent beams 208 which pass through neighboring pixels 304 of the image plane 302) and are denoted using different letters "A" and "D". That is, "adjacent beams" 208 for a given entry Dn are Dn-1 and/or Dn+1, wherein adjacent beams Dn-1 and/or Dn+1 are further grouped together (e.g., as shown by the lettering in array 504) if the points 204 of the corresponding entries Dn-1 and/or Dn+1 lie within the same voxel as the given entry Dn.
[0095] The controller 118 may parse the array 500 by comparing adjacent entries 502 to each other as shown by operations 506. Operations 506 include the controller 118 comparing a first entry 502 D1 to its adjacent entry 502 D2 and, if the two entries comprise points 204 within a same voxel (i.e., comprise the same letter in array 504), the controller 118 compares the adjacent entry 502 D2 to its subsequent adjacent entry 502 D3, and so forth. The controller 118 may continue comparing pairs of adjacent entries 502 (i.e., Dn and Dn+1) until the adjacent entries 502 Dn and Dn+1 comprise two points 204 localized within two separate voxels (i.e., comprise a different letter in array 504), as shown by cross 508. Upon the controller 118 comparing a first entry 502 D3 to a second entry 502 D4 adjacent to the first entry 502 and determining the two entries denote points 204 localized within separate voxels, the first entry 502 may be provided to an output array 510 and the second entry 502 may be compared with its adjacent neighboring entry 502. Controller 118 may continue such comparison for each pair of entries 502 of array 500, wherein the final entry (e.g., Dx) does not include a comparison and therefore comprises the final entry in output array 510.
[0096] For example, entries 502 D3 and D4 of array 500 may localize points 204 within separate voxels, as shown by different lettering in array 504 of A and B. Accordingly, the controller 118 may provide the entry 502 of D3 to the output array 510 and compare D4 with its neighboring entry 502 D5. The prior entries D1 and D2 that lie within the same voxel A may be omitted from the output array 510 such that the output array 510 is of equal or smaller size than the input array 500.
[0097] As shown, the output array 510 is smaller than the input array 500, corresponding to fewer points 204 being utilized by the ray marching process. If distance measurements in entries 502 localize an object substantially close to the sensor 202, the output array 510 may be substantially smaller than the input array 500, since substantially more points 204 may lie within a same voxel. Accordingly, this reduction in the points 204 utilized by the ray marching may enhance a speed at which a robot 102 may update a computer-readable map based on new sensor 202 data in the instance where objects are very close to the robot 102, which may be when the robot 102 must accurately perceive its environment and make decisions quickly to avoid collision.
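A non-limiting sketch of this comparison of adjacent entries 502 in Python is given below, wherein adjacency_filter is a hypothetical function name and the lettered voxel assignments mirror the notation of array 504:

    def adjacency_filter(voxels):
        # voxels: the voxel assigned to each entry 502 of array 500, in pixel order.
        # Returns the indices of the entries kept for the output array 510: for each run
        # of adjacent entries falling in the same voxel, only the last member is kept,
        # and the final entry of the array is always kept.
        kept = []
        for n in range(len(voxels) - 1):
            if voxels[n] != voxels[n + 1]:   # operation 506: compare D_n with D_(n+1)
                kept.append(n)
        kept.append(len(voxels) - 1)
        return kept

    # Hypothetical lettered assignments in the style of array 504.
    print(adjacency_filter(list("AAABBCDDD")))   # -> [2, 4, 5, 8]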
[0098] FIG. 5B illustrates an input array 512 comprising a plurality of distance measurements 514, each distance measurement 514 corresponding to a measured distance of a pixel of an image plane 302 of a sensor 202, according to an exemplary embodiment. A LiDAR scan (e.g., 2D or 3D) and/or a depth image may be represented by a plurality of pixels 304 of an image plane 302, wherein each pixel comprises a depth value. The depth value may localize a point 204 within 3D space. The depth values of the pixels 304 are utilized in Equations 1-3 above to determine a voxel in which the localized point 204 lies. For example, entry 514 of D1 may correspond to a pixel 304 in an upper left corner and entry 514 of D12 may correspond to a pixel 304 in a bottom right corner of a 4x3 pixel depth encoded image (i.e., in an image plane 302 comprising a 4x3 arrangement of pixels 304). In some embodiments, the image may be, following the above example, 3x4 pixels, wherein the array 512 may be formed by concatenating columns of pixels 304 of the image plane 302 instead of rows. One skilled in the art may envision a plurality of ways of configuring the array 512 such that adjacent entries 514 correspond to adjacent distance measurements or points 204 measured through vertically or horizontally adjacent pixels 304 of the image plane 302. Entries 514 of the array 512 denoted by Dn represent a depth measurement which localizes a point 204 in 3D space for a beam 208 which passes through the nth pixel 304 of an image plane 302, the point 204 being within a voxel determined by Equations 1-3 above, wherein parameter n is an integer number.
[0099] As similarly illustrated previously with respect to arrays 500 and 504 in FIG. 5A, an array 516 is illustrated which comprises entries 514 which localize adjacent points 204 within a same voxel denoted with a same letter. For example, D1 through D3 may lie within voxel A (e.g., a voxel at location (xA, yA, zA) or an integer voxel number A). That is, array 516 is not illustrative of another array stored in memory 120 separate from array 512 and is intended to illustrate an alternative denotation of adjacent entries 514 in array 512 which lie within a same voxel for ease of description.
[00100] Controller 118 may partition the input array 512 into at least two partitions Pm, wherein each partition Pm may comprise a fixed number of array entries 514 (i.e., pixels 304 and their corresponding measurements Dn) and parameter m is an integer number. The controller 118 may compare each pair of adjacent entries 514 of the array 512, as shown by operations 506, to determine if the pair localize points 204 within a same voxel. The controller 118 may stop comparisons at the end of each partition such that adjacent entries 514 within two separate partitions Pm and Pm+1 are not compared.
[00101] For example, the controller 118 may not compare entries 514 of D4 with D5 nor D8 with D9 for the 4-pixel partition size that is illustrated. For each pair of entries 514 Dn and Dn+1, which correspond to two points 204 that lie within the same voxel, the prior of the pair Dn may be removed or ignored and the latter Dn+1 may be compared with the next entry 514 Dn+2 of the array 512, provided the next entry 514 Dn+2 falls within the same partition Pm as the prior Dn and Dn+1 entries 514. Upon the controller 118 determining that an adjacent pair Dn and Dn+1 do not localize points 204 within a same voxel, the first of the pair (Dn) may be provided to output array 518 and the latter Dn+1 compared with the subsequent entry Dn+2, unless Dn+2 is outside of the partition of Dn and Dn+1, in which case both Dn and Dn+1 are provided to the output array 518. Upon the controller 118 reaching an entry 514 at the end of a partition Pm, that entry 514 may be provided to the output array 518. Entries of the output array 518 may be concatenated together, wherein the empty space between each entry is illustrated to show the reduction in size of output array 518 from the input array 512. In the example shown in FIG. 5B, twelve entries in array 512 are reduced to seven entries in output array 518. Accordingly, ray marching may be performed by extending a ray 402 from an origin 210 of the sensor 202 to each point 204 represented by each entry 520 of the output array 518.
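A non-limiting sketch of the partitioned comparison in Python follows; the function name and the lettered voxel assignments are hypothetical, chosen so that twelve entries reduce to seven as in the example of FIG. 5B:

    def partitioned_adjacency_filter(voxels, partition_size):
        # Adjacency filtering of input array 512 with comparisons confined to fixed-size
        # partitions P_m; the entry at the end of every partition is always kept.
        kept = []
        for start in range(0, len(voxels), partition_size):
            part = voxels[start:start + partition_size]
            for n in range(len(part) - 1):
                if part[n] != part[n + 1]:
                    kept.append(start + n)
            kept.append(start + len(part) - 1)
        return kept

    letters = list("AAABBCCDDDEE")                     # hypothetical voxel letters for twelve entries 514
    print(partitioned_adjacency_filter(letters, 4))    # keeps 7 of the 12 entries

The distance-threshold embodiment described further below may be added by skipping the comparison for any entry whose measured distance exceeds the threshold.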
[00102] Returning to FIG. 4, the controller 118 may produce a ray 402 which extends from the location of the origin 210 of the sensor 202 used to produce the array 500, 512 (i.e., capture a depth image or LiDAR scan) to each point 204 of the output array 510, 518, wherein the location of the origin 210 corresponds to its location during acquisition of the array 500, 512 (i.e., acquisition of a depth image or LiDAR scan). In some embodiments, each ray 402 may extend from the origin 210 to each point 204 denoted by entries of the output array 510, 518. In some embodiments, the ray 402 may extend from the origin 210 to the center or side of the voxel in which points 204 of the output array 510, 518 are localized. For example, with reference to FIG. 5B, the controller 118 may produce a first ray 402 extending from origin 210 to a center of voxel A or point 204 corresponding to entry D3, a second ray 402 extending from origin 210 to a center of voxel B or to points 204 corresponding to entries D4 and D5, and so forth for each entry 520 of the output array 518. Each voxel 404 through which the rays 402 pass may be denoted as an "empty space" voxel 406 (shaded). Upon all "empty space" voxels 406 being identified, voxels comprising points 204 may be denoted as "occupied" voxels 408 (hashed). The denoting of occupied voxels 408 may be performed subsequent to the identification of the empty space voxels 406 if the ray marching process is performed in parallel, for reasons discussed below in regard to FIGs. 10-11. Voxels through which the rays 402 never pass and which comprise no points 204 may be denoted as "unknown" voxels (white) representing voxels of which sensor data does not provide localization information of any object, or lack thereof, within the voxels.
[00103] In some instances, a ray 402 may pass through an occupied voxel 408. For example, the occupied voxel 408 may comprise a point 204 localized therein during a previous scan or acquisition of a depth image at a time t-1. At a subsequent timestep at time t, the same voxel 408 may not include a point 204 localized therein (e.g., an object may have moved). The voxel 408 denoted as occupied at time t-1 may lie within the path of a ray 402 which extends from the sensor origin 210 to a point 204 localized during the current time step at time t. Accordingly, the formerly occupied voxel 408 is replaced with an empty space voxel 406. This may be referred to as "clearing" of the occupied voxel 408. If no ray 402 passes through the occupied voxel 408, the voxel remains designated as occupied until cleared by a ray 402 based on a scan at time t or later. Stated differently, a voxel comprising a point 204 therein may be denoted as "occupied" unless and until a future ray 402 extends through the voxel, thereby "clearing" the presence of the object. Essentially, the robot 102 will presume an object to be present until it "sees through" the object (e.g., due to the object no longer being present) via ray marching through the occupied voxels.
[00104] One advantage of the partitioning method shown in FIG. 5B is that more data (i.e., more points 204) are preserved, wherein the amount of data preserved may depend on a size of each partition Pm. Operators of robots 102 may tune the partition size based on (i) computing capabilities of the controller 118, (ii) time spent performing the ray marching, (iii) resolution of the sensor 202 used to produce the point cloud, and (iv) desired accuracy of a resulting occupancy map. Partition sizes of 1 entry 514 (i.e., no partitioning nor filtering) may yield the most accurate occupancy map but at the cost of substantially increased computation time during the ray marching because more rays 402 must be simulated and more calculations performed to determine empty voxels 406, wherein a plurality of empty space voxels 406 may be identified as empty space multiple times. A second advantage is that partitioning of the array 512 may allow for rapid parallel computation of operations 506 using vectorized algorithms and/or hardware, as discussed in more detail below in regard to FIG. 11.
[00105] According to at least one non-limiting exemplary embodiment, points 204 localized beyond a threshold distance from the sensor origin 210 may be ignored for ray marching as objects/free space far from the robot 102 may not be impactful in its path planning decisions. Further, performing ray marching for points 204 far from the robot 102 yields: (i) a decreased likelihood that two adjacent beams 208 localize points 204 within the same voxel, because the spatial separation between adjacent beams 208 increases with distance, and (ii) increased calculation time due to longer rays 402, for little marginal gain in path planning/execution by the robot 102.
[00106] FIG. 5C is a process flow diagram illustrating a method 522 for a controller 118 of a robot 102 to perform adjacency filtering as used herein, according to an exemplary embodiment. Steps of method 522 may be effectuated by the controller 118 executing computer-readable instructions from memory 120.
[00107] Block 524 includes the controller 118 receiving a point cloud from a depth image or a LiDAR scan. The point cloud may include a plurality of points 204 localized in 3D space using, for example, floating point numbers or other non-discretized values. The point cloud may be produced by a singular depth image from a depth camera or a singular scan across a measurement plane of a planar LiDAR sensor (e.g., 202).
[00108] Block 526 includes the controller 118 discretizing the point cloud into voxels and assigning each point 204 of the point cloud to a voxel. The voxel assigned to each point 204 corresponds to the location of the point 204 in 3D discretized space. Equations 1-3 above illustrate how the points 204 may be translated from (x, y, z) distance measurements, in units of distance, to a voxel (Vx, Vy, Vz) location, with no units. In some embodiments, the assigned voxel may be denoted by its position (Vx, Vy, Vz) in 3D space, wherein (Vx, Vy, Vz) are integer numbers. In some embodiments, the assigned voxel may correspond to a number (e.g., voxel #10250).
[00109] Block 528 includes the controller producing an array 512, each entry 514 of the array 512 may include the assigned voxel of each point 204 of the point cloud. Adjacent entries 514 of the array 512 represent points 204 measured by distance measurements (i.e., beams 208) which are adjacent to each other in the image plane 302, as shown by beams 208-1 and 208-2 in FIG. 3 for example. The array 512 may comprise a concatenation of rows or columns of the pixels 304, and their respective distance measurements and assigned voxel, as viewed on the image plane 302.
[00110] By way of illustration, with reference to FIG. 3, the first eight (8) entries of the array may include the distance measurements and the assigned voxels of the points 204 localized by beams 208 which pass through the eight (8) pixels 304 of the top row of the image plane 302; the next eight (8) entries of the array may include the same for the second row of pixels 304; and so forth, wherein the last eight (8) entries may include the same for the bottom row of pixels 304. Accordingly, the array 512 may include 64 entries 514 for the 8x8 pixel image plane 302 depicted in FIG. 3. However, it is appreciated that the image plane 302 may include more or fewer pixels 304 in other embodiments.
[00111] Returning to FIG. 5C, block 528 further denotes that the array may include N entries 514 representing N pixels 304 of the image plane 302 and N distance measurements for the pixels 304, wherein N may be any integer number. The array 512 may be partitioned into at least one partition. The image plane may include a number of pixels 304 equal to an integer multiple of the partition size, wherein partition size corresponds to a number of entries 514 within each partition.
[00112] Block 530 includes the controller 118 setting a value for a parameter n equal to one. Parameter n may be a value stored in memory 120 and manipulated by the controller 118 executing computer-readable instructions or may be purely illustrative for the purpose of explaining the remaining blocks 532-540. Parameter n may correspond to the entry 514 number within the array 512 of which the controller 118 performs the operations as described in the following blocks, wherein the value of n may be incremented in block 540 following method 522.
[00113] Block 532 includes the controller 118 comparing n to N, representative of the controller 118 determining if it has reached the end of the array 512. Parameter N may comprise a fixed integer value equal to the total number of entries in the array 512, which also corresponds to the number of pixels 304 of the image plane 302. If the controller 118 has reached the end of the array (i.e., n = N), the controller 118 may proceed to block 542 to perform the ray marching procedure illustrated in FIG. 4 above. If the controller 118 has not reached the end of the array 512 (i.e., n < N), the controller 118 proceeds to block 534.
[00114] For example, with reference to FIG. 5B, N may be equal to 12, wherein upon the controller 118 reaching entry 514 D12 (i.e., n = 12) the controller 118 does not have an adjacent entry 514 in the array 512 to compare D12 with (i.e., controller 118 is unable to perform operation 506). In the example illustrated in FIG. 5B and 5C, n is a counter, set initially to one and incremented by one at block 540 as the process flow iterates, and N is 12, the number of entries 514 in the array. When n reaches 12, it equals N (block 532) and the process flow moves to block 542.
[00115] Block 534 includes the controller 118 determining if the entry 514 Dn is within the same partition Pm as the adjacent entry 514 Dn+1. If the subsequent entry 514 Dn+1 is within a different partition than the entry 514 Dn, then the controller 118 may move to block 538. If the subsequent entry 514 Dn+1 is within the same partition as entry Dn, the controller 118 proceeds to block 536.
[00116] Block 536 includes the controller 118 determining if the entry 514 Dn and the subsequent entry 514 Dn+1 represent points 204 which lie within the same voxel of the discretized point cloud. In some instances, the entries 514 of the array 512 may include the voxel in which the respective point 204 corresponding to each entry 514 lies. In some embodiments, the controller 118 may determine which voxels the points 204 corresponding to the entries 514 Dn and Dn+1 lie within, following Equations 1-3 above, and subsequently compare the voxels to determine if the points 204 lie within the same voxel. If the controller 118 determines the points 204 of the entries 514 Dn and Dn+1 lie within the same voxel, the controller 118 moves to block 540. If the points 204 of the entries 514 Dn and Dn+1 lie within different voxels, the controller 118 moves to block 538.
[00117] According to at least one non-limiting exemplary embodiment, the comparisons performed in blocks 534-536 may be skipped if the distance measurement associated with the entry Dn is greater than a threshold value. For example, the threshold value may correspond to a distance away from the robot 102 at which it is highly unlikely that two or more points 204 fall within a same voxel. Because beams 208 extend radially away from the image plane 302/origin 210, the spatial distance between two adjacent beams 208 may increase while the angular separation remains constant, corresponding to a decreased likelihood that two adjacent points 204 fall within a same voxel as distance increases. The threshold distance may be 1 meter, 5 meters, 10 meters, etc. and may depend on the angular resolution of the sensor and the volume of the voxels. That is, prior to the controller 118 reaching block 534, the controller 118 may first compare the distance value of the entry 514 Dn to the threshold distance and, if the entry 514 comprises a distance greater than the threshold, the controller 118 may jump to block 538.
[00118] Block 538 includes the controller 118 providing the entry Dn to the output array 518. [00119] Block 540 includes the controller 118 incrementing the value of n by one (1).
[00120] The loop created by blocks 532-540 may illustrate the process of adjacency filtering, as shown illustratively by operations 506 in FIG. 5A-B above, wherein the controller 118 may compare adjacent entries 514 within the array 512 and produce an output array 518 which includes only entries 520 representing points 204 of the original point cloud (received in block 524) which lie within voxels separate from those of adjacent points 204. Adjacency is with respect to adjacent pixels 304 of the image plane 302 of the sensor 202 used to produce the point cloud.
[00121] Block 542 includes the controller 118 performing the ray marching procedure. The ray marching procedure includes the controller 118 extending one or more rays 402 between an origin 210 of the sensor used to produce the point cloud and the voxels or points of the output array 518. In some embodiments, the ray 402 may extend between the origin 210 and the points 204 corresponding to entries 520 of the output array 518. In some embodiments, the rays 402 may extend from the origin 210 to the center of the voxels corresponding to entries 520 of the output array 518. In some embodiments, the rays 402 may extend from the origin 210 to the nearest or furthest border of the voxels corresponding to entries 520 of the output array 518. Each voxel that the rays 402 pass through without reaching a point 204 may be denoted as “empty space,” “unoccupied,” “free space,” or equivalent, used interchangeably herein. Upon extending all rays 402 to each entry 520 of the output array 518, the controller 118 may mark each voxel of the output array 518 which comprises a point 204 localized therein as an “occupied” voxel 408. The marking (i.e., denoting of occupied voxels 408) may be performed subsequent to the clearing (i.e., denoting of empty space voxels 406) such that adjacent rays 402 do not clear marked voxels, as further illustrated in regard to FIG. 10 below.
[00122] As used hereinafter, adjacency filtering may refer to the operations 506 performed in FIG. 5A-B and/or blocks 528-540 of FIG. 5C. That is, an adjacency filter may include a controller 118 producing an array (e.g., 500, 512) representing depth measurements through an image plane 302, partitioning the array into a plurality of sections or partitions, and comparing adjacent or neighboring entries within each partition to determine if the neighboring entries localize points 204 within a same voxel to produce an output filtered array (e.g., 510, 518) comprising points 204 within unique voxels. The output array may comprise a plurality of coordinate locations for points 204 within separate voxels from other points 204 localized by distance measurements of adjacent pixels, although some points 204/entries of the output array may lie within the same voxel if the entries are within separate partitions or are not adjacent/neighboring in the input array. In some instances, the output array may include the voxel location of the respective points 204 corresponding to the distance measurements, wherein the controller 118 may translate the distance measurements into coordinate locations based on the position of the sensor.
[00123] FIG. 6A illustrates a robot 102 performing a ray marching procedure to identify voxels 406 corresponding to empty space without the adjacency filtering shown in FIG. 5A-B above, according to an exemplary embodiment. As illustrated, two points 204 may be localized for a scan from a sensor 202, wherein an origin 210 of the sensor 202 is illustrated at its location when the scan was captured. As shown, two adjacent points 204 may fall within the same voxel 602. Voxel 602 may be considered "occupied" and is illustrated with a hatched fill to denote the occupied state. Without adjacency filtering, two rays 402 are required to detect empty space voxels 406 between the origin 210 and both points 204. It is appreciated that both rays 402 pass through substantially the same voxels 406, wherein performing the ray marching twice yields little to no new information as to unoccupied or free space voxels 406. In some instances, one or more voxels may be identified as empty space 406 if the two points 204 are far from the sensor origin 210.
[00124] Next, in FIG. 6B, a robot 102 is performing the ray marching procedure after adjacency filtering of points 204 that lie within a same voxel 602, according to an exemplary embodiment. As shown, after adjacency filtering, only one point 204 may remain within the voxel 602. Accordingly, only one ray 402 is utilized to perform the ray marching. Advantageously, a substantial number of redundant calculations have been omitted due to the removal of adjacent points 204 which lie within a single voxel 602.
[00125] One skilled in the art may appreciate that use of an adjacency filter may cause one or more voxels 406 identified as empty space in FIG. 6A to no longer encompass a ray 402 and thereby no longer be identified as empty space but as unknown space, as shown by a voxel 604 outlined by dashed lines. It is appreciated that some information may be lost, resulting in reduced accuracy of an occupancy map, for the advantage of a substantial reduction in time in performing the ray marching procedure by removing a plurality of redundant calculations. Point clouds produced by depth cameras and/or LiDAR sensors 202 may be substantially dense when sensing objects nearby the robot 102 (e.g., 5 or more points 204 within a single voxel), wherein performing a plurality of redundant calculations may increase a time in which the controller 118 may map its environment and make decisions as to how the robot 102 should move in response to its environment. It should also be noted that if a point 204 were located within the voxel 406 directly to the left of voxel 602, for example, a ray 402 may pass through the voxel 604 such that the voxel 604 may be denoted as "empty space" rather than "unknown." When objects are substantially close to the robot 102, reducing the mapping time may be critical for the robot 102 to operate effectively and safely (i.e., without collisions). Further, use of partitions may configure a maximum amount of potential data loss (i.e., points 204 filtered) which may be tuned by operators of robots 102 based on (i) parameters of the sensor 202 (e.g., resolution), (ii) navigation capabilities/precision of the robot 102, (iii) computing resources available to the controller 118, and in some instances (iv) the environment in which the robot 102 operates, allowing for the adjacency filter to be applied to a plurality of different sensors and/or robotic devices.
[00126] FIG. 6C-D shows another illustration which depicts the advantages of adjacency filtering shown in FIG. 5A-C, according to an exemplary embodiment. First, in FIG. 6C, a sensor 202 (not shown) comprising an origin 210 at the illustrated location may localize a plurality of points 204 of a surface of an object 606. The object 606 may include any object within an environment of a robot 102. The depicted embodiment may comprise a 2-dimensional slice of a depth image viewed from above or a planar scan from a planar LiDAR sensor. The slice may comprise a row of pixels 304 (not shown) on the image plane 302, wherein points 204 localized by beams 208 which pass through the row of pixels 304 are illustrated (the remaining points 204 omitted for clarity). The row may form a portion of an input array 512 which may be partitioned into a plurality of sections as shown by lines 604. Each line 604 may bound a specified number of pixels 304 of the image plane 302 corresponding to a size of each partition Pm in pixels. For example, each partition Pm may represent 2, 5, 10, 50, etc. pixels 304 of a row of the image plane 302. Due to object 606 being substantially close to the sensor origin 210, the plurality of points 204 may be densely clustered along the surface of the object 606. Accordingly, each point 204 and its adjacent point 204 on the image plane 302 may be compared using the adjacency filtering described in FIG. 5A-C above to filter points 204 which may lie within a same voxel.
[00127] As shown next in FIG. 6D, a plurality of the points 204 shown in FIG. 6C have been removed using the adjacency filtering described in FIG. 5B-C. Most voxels 404 may include one point 204, wherein the remaining points 204 within the voxels 404 have been removed (i.e., filtered). Some voxels 404 may include two points 204 therein, the two points 204 being included in two different partitions. It is appreciated that the partitions separate groups of pixels 304 of the 2D image plane 302 and do not separate groups of voxels 404 within 3D space, wherein the number of voxels 404 encompassed by each partition comprises an angular dependence (i.e., angular resolution of the image plane 302 multiplied by the number of pixels 304 within each partition) and distance dependence (i.e., distance from the origin 210 to the object 606). That is, the dashed lines 604 representing a size of each partition may pass through an integer number of pixels 304 of the image plane 302, wherein the lines 604 may extend anywhere within the voxels 404 and are shown for visual clarity only.
[00128] Each of the remaining points 204 may be utilized for ray marching, wherein rays 402 are extended from the origin 210 of the sensor 202. The voxels 404 within which rays 402 pass through are denoted as empty space as shown by a grey shaded region. The voxels 404 which include the points 204 are denoted as “occupied” voxels 408.
[00129] FIG. 7A illustrates a projection of a point cloud 700 onto a two-dimensional ("2D") grid 702 for use in producing an occupancy map, according to an exemplary embodiment. Occupancy maps, as discussed above, comprise a plurality of pixels, each of the pixels being denoted as "occupied" or "unoccupied" pixels and, in some embodiments, may further be denoted as "unknown." Pixels corresponding to the "occupied" denotation may include pixels which represent objects within an environment of the robot, pixels denoted as "unoccupied" may represent empty space, and pixels denoted as "unknown" may represent regions of the environment which sensor units 114 have not sensed. Each pixel 704 of the grid 702 may represent a specified length and width of an environment of a robot 102. Grid 702 may represent the same plane as the occupancy map (e.g., a floor occupancy map for robots 102 navigating on a floor).
[00130] To produce an occupancy map from the point cloud 700, each point 204 may be projected onto the plane of grid 702. For example, the z component of each point 204 may be removed or set to zero if the plane of the occupancy map is z = 0 (e.g., a floor). Each pixel 704 of the grid 702 may comprise a count equal to a number of points 204 projected therein. Each pixel 704 comprising a count equal to or greater than a threshold number, which may be any number greater than zero (i.e., one or more points 204 per pixel 704), may correspond to an occupied pixel (dark hatched). For example, one pixel 708 of the plurality of pixels 704 includes a ray 706 extending vertically therefrom, wherein four points 204 (open circles) may be directly above the pixel 708 and be projected therein upon removal of their z component, corresponding to the pixel 708 being occupied by an object and comprising a count of four (4). Ray 706 is intended to illustrate the points 204, illustrated with open circles, which comprise (x, y) components that fall within the pixel 708 (light hatched) at the base of ray 706. It is appreciated that the plane z = 0 may correspond to any height in the physical world, such as, for example, z = 0 being the plane of a floor or a height above the floor. Each of the other points 204 (closed circles) could be mapped onto an occupied pixel 704 (dark hatched) in similar fashion, not shown for simplicity of illustration.
[00131] According to at least one non-limiting exemplary embodiment, the projection may be bounded by a maximum height, wherein points 204 above the maximum height may not be considered during the projection. For example, the maximum height may correspond to a height of a robot 102, wherein points 204 above maximum height may be ignored because the robot 102 may move beneath the object represented by the points 204. According to at least one non-limiting exemplary embodiment, the projection may be bounded by a minimum height, wherein points 204 below a specified minimum height value may be ignored during the projection.
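A non-limiting sketch of this projection in Python is shown below; the function name project_to_grid, the default threshold, and the optional height bounds are hypothetical parameters:

    import math

    def project_to_grid(points, pixel_size, count_threshold=1,
                        min_height=None, max_height=None):
        # Drop the z component of each point 204, accumulate per-pixel counts on the
        # plane of grid 702, and mark pixels 704 whose count meets the threshold.
        counts = {}
        for (x, y, z) in points:
            if max_height is not None and z > max_height:
                continue   # e.g., objects the robot 102 may drive beneath
            if min_height is not None and z < min_height:
                continue
            pixel = (math.floor(x / pixel_size), math.floor(y / pixel_size))
            counts[pixel] = counts.get(pixel, 0) + 1
        occupied = {p for p, c in counts.items() if c >= count_threshold}
        return occupied, counts

    occupied, counts = project_to_grid([(1.02, 0.31, 0.2), (1.03, 0.33, 0.8)], 0.03)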
[00132] FIG. 7B illustrates a method for projecting a 3D volume of voxels onto a 2D plane for use in producing an occupancy map from a side perspective, according to an exemplary embodiment. FIG. 7B illustrates a plurality of voxels 404, the plurality of voxels 404 along a z axis (i.e., vertical axis), wherein the horizontal axis illustrated may comprise either the x or y axis. That is, the voxels 404 illustrated may comprise a “slice” or plane of a 3D volume of voxels, wherein additional voxels may be behind and/or in front of (i.e., into or out from the plane of the page) the voxels 404 illustrated.
[00133] A sensor 202 of a robot 102, comprising an origin 210 at the illustrated location, may measure its surrounding environment to produce a point cloud comprising a plurality of points 204. A controller 118 of the robot 102 may perform the ray marching procedure, shown by rays 402, for selected points 204. The selected points 204 include points 204 which correspond to entries 520 of an output array 518 following the adjacency filtering procedure described in FIG. 5A-C above. Thus, the controller 118 only projects one ray 402 for each occupied voxel 408. Additional points 204 collected and localized using the original measurements (i.e., prior to adjacency filtering) have been illustrated to show that the rays 402 utilized during the ray marching only extend from the sensor origin 210 to one point 204 within each occupied voxel 408. In some instances, two points 204 within a same voxel may be utilized for the ray marching if they fall within the same voxel but are denoted in different partitions of the input array 512 used for the adjacency filtering.
[00134] Voxels 404 through which the rays 402 pass may be denoted as empty space voxels 406 (grey). After the identification of the empty space voxels 406, the controller 118 may denote voxels 404 comprising at least one point 204 therein as occupied voxels 408 (hashed lines). The at least one point 204 used for the identification of occupied voxels 408 may be based on the original point cloud or points 204 remaining after the adjacency filtering procedure is executed. Each occupied voxel 408 may include a count corresponding to a number of points 204 of the original point cloud localized therein (i.e., the count is based on points 204 of array 500, not the points 204 of the filtered arrays 510/518). Voxels 404 which include no points 204 and through which rays 402 do not pass may be denoted as "unknown" voxels (white), corresponding to the controller 118 being unable to determine whether the voxel is free space or there is an object within the voxel, due to a lack of sensory data (i.e., points 204) at or nearby the unknown voxels. The 3D volume of voxels 404, 406, 408 which comprise voxels labeled as "occupied," "free space" and "unknown" are collectively denoted hereinafter as a 3D volume of labeled voxels.
[00135] To produce an occupancy map, the controller 118 may project the 3D volume of labeled voxels onto a plane. Because the illustration in FIG. 7B comprises a slice or plane of voxels 404, the projection is shown as a projection onto a 1-dimensional line of pixels 704, representative of a line or column of pixels 704 of plane 702 shown in FIG. 7A. However, one skilled in the art would appreciate that the projection includes projecting the 3D volume of labeled voxels onto a 2D plane. Each pixel 704 corresponds to the base of each column of voxels (i.e., voxels along the z axis), wherein the pixels 704 may include x, y dimensions equal to the x, y dimensions of the voxels 404. For each column of voxels, if any single voxel within the column is an occupied voxel 408, then the corresponding pixel 704 comprises the denotation of "occupied," wherein the occupied pixel may include a count equal to the total count of all occupied voxels 408 above it. Similarly, if a column of voxels only includes empty space voxels 406 and unknown voxels (white), the pixel 704 at the base of the column may comprise the empty space denotation. Lastly, if a column of voxels only includes unknown voxels, the resulting pixel 704 may include the denotation of unknown (white pixels). The projection has been illustrated by arrows 710 for each column of voxels corresponding to a pixel 704 of the occupancy map. The counts for the occupied pixels 704 (hashed pixels) have been illustrated for clarity.
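A non-limiting sketch of this column-wise projection in Python is provided below; the label constants and the nested-list layout labels[x][y][z] are hypothetical representations of the 3D volume of labeled voxels:

    UNKNOWN, EMPTY, OCCUPIED = 0, 1, 2

    def project_columns(labels, counts):
        # labels[x][y][z] holds the label of each voxel; counts[x][y][z] holds the number
        # of points 204 localized within it. A column maps to an "occupied" pixel 704 if
        # any voxel in it is occupied (summing their counts), to an "empty" pixel if it
        # contains at least one empty-space voxel 406 and no occupied voxel 408, and to
        # an "unknown" pixel otherwise.
        occupancy, pixel_counts = {}, {}
        for x, plane in enumerate(labels):
            for y, column in enumerate(plane):
                if any(v == OCCUPIED for v in column):
                    occupancy[(x, y)] = OCCUPIED
                    pixel_counts[(x, y)] = sum(
                        counts[x][y][z] for z, v in enumerate(column) if v == OCCUPIED)
                elif any(v == EMPTY for v in column):
                    occupancy[(x, y)] = EMPTY
                else:
                    occupancy[(x, y)] = UNKNOWN
        return occupancy, pixel_counts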
[00136] According to at least one non-limiting exemplary embodiment, voxels 404 occupied by the robot 102 may also be denoted either as, e.g., "occupied" voxels 408 or as "robot" voxels. For example, the voxel which comprises the sensor origin 210 may be considered a "robot" voxel as it is occupied by a portion of the robot 102 (i.e., its depth camera 202). Such voxels may also be referred to as a footprint of the robot 102. According to at least one non-limiting exemplary embodiment, the resulting occupancy map may include pixels 704 encoded as "robot" pixels which represent the robot 102 (i.e., area occupied by the robot 102 within the plane of the occupancy map), in addition to the "occupied," "empty space," and "unknown" pixels. For example, the leftmost pixel 704 may be encoded with the "robot" denotation. The pixels occupied by the robot 102 on the resulting occupancy map may be based on (i) the location of the robot 102 origin 216 on the map (e.g., based on data from navigation units 106), and (ii) a footprint of the robot 102, the footprint corresponding to the area occupied by the robot 102 on the plane of the occupancy map. It is appreciated, however, that the robot 102 may not be required to map itself in 3D space (i.e., denoting voxels occupied by the robot 102) and instead simply map the 2D area it occupies to simplify motion/localization calculations by the controller 118.
[00137] FIG. 8A illustrates an occupancy map 800, or portion thereof, according to an exemplary embodiment. Occupancy map 800 may comprise a plurality of pixels 704, each pixel 704 being representative of a discretized area of an environment of a robot 102. The pixels 704 may each represent, for example, 3 cm x 3 cm areas, or any other spatial resolution (e.g., 1x1 cm, 2x2 cm, 4x4 cm, etc.). Typically, the pixels 704 may comprise the same width and height as the voxels 404. For clarity and simplicity of illustration, the sizes of the pixels 704 have been enlarged (e.g., the illustrated pixels 704 may be 30x30 cm or other resolution). Following the methods illustrated in FIG. 4-7 above, the occupancy map 800 may be produced by a controller 118 of the robot 102 identifying voxels in discretized 3D space which are occupied, unoccupied (i.e., free space), or unknown. By projecting the 3D point cloud or 3D volume of labeled voxels onto a 2D grid, as shown in FIG. 7, each pixel 704 of the occupancy map may comprise either an "occupied" (hatched pixels 704), "unoccupied" (grey pixels 704), or "unknown" (white pixels 704) label or encoding.
[00138] “Occupied” pixels 704 include pixels comprising a count greater than a threshold value, wherein the count is equal to the number of points 204 localized within voxels above the pixel 704 prior to the 2D projection. “Unoccupied” pixels 704 include pixels below voxels identified to comprise empty space based on the ray marching procedure. “Unknown” pixels correspond to the remaining pixels 704 which lie below only “unknown” voxels, where “unknown” voxels are voxels through which no rays 402 passed during the ray marching procedure. That is, the “unknown” pixels are below only “unknown” voxels and are not below (i.e., projected onto by) any “empty space” or “occupied” voxels.
[00139] Occupancy map 800 may be utilized by a controller 118 of a robot 102 to perceive, map, and navigate its environment. Occupancy map 800 may further include pixels 704 which denote the area (footprint) occupied by the robot 102, as shown by pixels 802 (black). To navigate the environment safely (i.e., without collisions), the controller 118 may specify a predetermined distance between the robot 102 and any occupied pixels 704. For example, the controller 118 may configure the robot 102 to maintain at least one unoccupied pixel 704 between the robot 102 and any occupied pixels 704 to account for sensor noise and/or imperfections in localization of the robot origin 216.
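By way of a non-limiting illustration, the following sketch shows how such a clearance rule might be evaluated on a 2D occupancy map. It is not part of the original disclosure; the function name has_clearance, the numeric label encoding, and the example grid are assumptions introduced solely for illustration.

```python
import numpy as np

UNKNOWN, FREE, OCCUPIED = 0, 1, 2   # illustrative label values for map pixels

def has_clearance(occupancy: np.ndarray, robot_cells, margin: int = 1) -> bool:
    """Return True if no occupied pixel lies within `margin` pixels
    (Chebyshev distance) of any cell covered by the robot footprint."""
    rows, cols = occupancy.shape
    for r, c in robot_cells:
        r0, r1 = max(0, r - margin), min(rows, r + margin + 1)
        c0, c1 = max(0, c - margin), min(cols, c + margin + 1)
        if np.any(occupancy[r0:r1, c0:c1] == OCCUPIED):
            return False
    return True

# A 5x5 map with one occupied pixel; a one-pixel margin is respected for one
# footprint cell but violated for the adjacent one.
grid = np.full((5, 5), FREE, dtype=np.uint8)
grid[2, 3] = OCCUPIED
print(has_clearance(grid, robot_cells=[(2, 1)]))  # True
print(has_clearance(grid, robot_cells=[(2, 2)]))  # False
```

A square (Chebyshev) neighborhood is used here so that a margin of one pixel also excludes diagonally adjacent occupied pixels.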
[00140] Occupancy map 800 may further include a route 802 represented by one or more pixels 704 (dark shaded pixels). The route 802 may comprise a width of one or more pixels 704. As shown, the route 802 may cause the robot 102 to navigate close to an occupied pixel 804, but the occupied pixel 804 is spatially isolated (no adjacent occupied pixels). Typically, the spatial resolution of pixels 704 corresponds to a few centimeters (e.g., each pixel may represent 1x1 cm, 2x2 cm, 3x3 cm, etc. spatial regions, wherein the apparent size of the pixels 704 in the figure is enlarged for clarity and ease of illustration), wherein an object occupying only a single pixel 704 (i.e., a few centimeters of space) is either (i) not easily sensed by point cloud producing sensors, or (ii) unlikely to correspond to a real object within the environment of the robot 102 and thus more likely to be the result of sensor noise or produced in error. Accordingly, it may be advantageous to remove any spatially isolated occupied pixels 704 from the occupancy map 800, as shown next in FIG. 8B. For example, if the controller 118 of the robot 102 is configured to maintain a distance of at least 1 pixel from any occupied pixel 704, then the route 802 as illustrated is unnavigable.
[00141] FIG. 8B illustrates the removal of a spatially isolated occupied pixel 804, according to an exemplary embodiment. In some embodiments, the spatially isolated occupied pixel 804 may be required to be surrounded only by “unoccupied” pixels 704 (light grey) in order to be reset to “free space” (i.e., removed). In some embodiments, the spatially isolated occupied pixel 804 may be required to be surrounded by only “unoccupied” pixels 704 and “unknown” pixels 704 (white) to be reset to “free space.” In some embodiments, if the cumulative count of the neighboring pixels 704 is below a threshold number, the pixel 804 may be reset to “free space,” wherein the count is based on the 3D point cloud prior to adjacency filtering. Accordingly, the robot 102 may now navigate the route 802 and maintain a distance of at least one pixel 704 between the robot footprint 802 and any occupied pixels 704 on the occupancy map 800.
[00142] According to at least one non-limiting exemplary embodiment, an occupied pixel may comprise one or more neighboring pixels which are also denoted as occupied. Such a pixel may nonetheless be considered a spatially isolated pixel 804, and therefore may be reset to “free space,” if its neighboring four pixels (excluding diagonally neighboring pixels) or neighboring eight pixels (including diagonally neighboring pixels) comprise a cumulative count less than a threshold value. The threshold may depend on the resolution and noise level of the sensor. For example, if an occupied pixel comprises a count of three (3) and only one neighboring pixel is occupied and comprises a count of one (1), both pixels may be reset to “free space” as spatially isolated pixels if the threshold count is five (5).
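A minimal sketch of the spatially-isolated-pixel filter described above, assuming the per-pixel counts are held in a dense 2D array and using the eight-neighbor variant; the function name and the example threshold are illustrative assumptions, not the disclosed implementation.

```python
import numpy as np

def remove_isolated_pixels(counts: np.ndarray, neighbor_threshold: int) -> np.ndarray:
    """Reset to zero (free space) every occupied pixel whose eight neighbors
    hold a cumulative count below `neighbor_threshold`. `counts` is a 2D array
    of per-pixel point counts, where zero means not occupied."""
    padded = np.pad(counts, 1, mode="constant", constant_values=0)
    # Sum of each pixel's 3x3 neighborhood, excluding the pixel itself.
    neighbor_sum = sum(
        padded[1 + dr:padded.shape[0] - 1 + dr, 1 + dc:padded.shape[1] - 1 + dc]
        for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)
    )
    out = counts.copy()
    out[(counts > 0) & (neighbor_sum < neighbor_threshold)] = 0
    return out

# The example above: a pixel of count 3 with a single occupied neighbor of
# count 1 and a threshold of 5 -- both are reset to free space.
c = np.zeros((5, 5), dtype=int)
c[2, 2], c[2, 3] = 3, 1
print(remove_isolated_pixels(c, neighbor_threshold=5))
```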
[00143] According to at least one non-limiting exemplary embodiment, the removal of spatially isolated pixels may instead be performed in 3D voxel space prior to the 2D projection of the point cloud or discretized 3D volume of labeled voxels. That is, the controller 118 may identify one or more voxels comprising a nonzero count with neighboring voxels comprising a zero count and/or “free space” denotation and subsequently reset the count of the one or more voxels to zero. The neighboring voxels may be the six directly adjacent voxels (i.e., excluding diagonally adjacent voxels) or may include the twenty-six (26) adjacent and diagonally adjacent voxels. In some embodiments, to remove a voxel comprising a nonzero count (i.e., reset its count to zero such that it becomes a “free space” voxel), its neighboring voxels must in total include a count less than a predetermined threshold value.
[00144] FIG. 9 is a process flow diagram illustrating a method 900 for a controller 118 to produce an occupancy map based on a point cloud produced by one or more sensors 202, according to an exemplary embodiment. Steps of method 900 may be effectuated by the controller 118 executing computer-readable instructions from memory 120.
[00145] Block 902 includes the controller 118 receiving a point cloud from a sensor 202. In some instances, the point cloud may be received based on data from two or more sensors 202 if the sensors 202 operate synchronously. In some instances, the point cloud may be received based on data from two or more sensors 202 if the sensors 202 include non-overlapping fields of view.
[00146] Block 904 includes the controller 118 discretizing the point cloud into a plurality of voxels. Each voxel may comprise a count if one or more points 204 are localized therein following Equations 1-3 above. The count may be an integer number corresponding to the number of points 204 localized within each voxel of the plurality.

[00147] Block 906 includes the controller 118 removing adjacent duplicate points 204 within a same voxel, for example as illustrated above in FIG. 5A-C. The controller 118 may produce an array of values (500, 512), the array comprising a length equal to a number of pixels 304 within the image plane 302 of the sensor 202. For two-dimensional depth images, the array comprises a concatenation of the rows of pixels 304. For planar LiDAR sensors, the array may comprise the distance measurements across the field of view concatenated in sequence. The controller 118 may, in some embodiments, partition the array into a plurality of partitions, wherein each distance measurement within the partitions may be compared with its adjacent neighboring distance measurement to determine if the two distance measurements localize points 204 within a same voxel. The controller 118 may only compare adjacent distance measurements within a same partition. The partition size may be configured such that the total number of pixels of the image plane 302 comprises an integer multiple of the partition size. The controller 118 may produce an output array (510, 518) comprising a plurality of entries, each entry of the output array either lying within a separate voxel from the other entries of the output array or corresponding to entries of the input array within separate partitions.
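The following sketch (not from the disclosure) illustrates blocks 904 and 906 under simplifying assumptions: Equations 1-3 are replaced by plain floor division by an assumed voxel edge length, and the input points are assumed to already be ordered as the concatenated rows of the image plane 302 so that adjacent array entries correspond to adjacent pixels 304.

```python
import numpy as np

VOXEL_SIZE = 0.05  # assumed voxel edge length in metres

def voxelize(points: np.ndarray, voxel_size: float = VOXEL_SIZE):
    """Discretize an (N, 3) point cloud into integer voxel indices and
    per-voxel counts. Floor division by the voxel edge length stands in
    for Equations 1-3 of the disclosure."""
    idx = np.floor(points / voxel_size).astype(np.int64)          # (N, 3) voxel index per point
    voxels, counts = np.unique(idx, axis=0, return_counts=True)   # occupied voxels and their counts
    return idx, dict(zip(map(tuple, voxels), counts))

def adjacency_filter(idx: np.ndarray, partition: int = 8) -> np.ndarray:
    """Return a boolean keep-mask over the measurements: within each partition,
    a measurement is dropped if its immediate predecessor falls in the same
    voxel, so at most one representative point per run of same-voxel neighbors
    survives."""
    keep = np.ones(len(idx), dtype=bool)
    same_as_prev = np.all(idx[1:] == idx[:-1], axis=1)             # adjacent measurements, same voxel?
    partition_start = (np.arange(1, len(idx)) % partition) == 0    # first entry of each partition is kept
    keep[1:] = ~(same_as_prev & ~partition_start)
    return keep
```

The boolean mask returned by adjacency_filter corresponds conceptually to the output array (510, 518): at most one representative point per run of same-voxel measurements survives within each partition.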
[00148] Block 908 includes the controller 118 ray marching to each occupied voxel. The ray marching may be performed for occupied voxels which remain after the adjacency filtering performed in block 906. Ray marching includes the controller 118 projecting a ray 402 from an origin 210 of the sensor(s) 202 used to produce the point cloud to each occupied voxel. The ray 402 may extend from the origin 210 to either the center of the occupied voxels or the locations of the points 204 therein. The voxels the rays pass through on their paths to the occupied voxels may be denoted as “empty space.” It is appreciated that this step may occupy a substantial majority of the runtime of method 900, wherein use of the adjacency filtering in block 906 substantially reduces the time occupied by this step 908.
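A simplified sketch of the per-ray traversal (not the disclosed implementation): a fixed-step march from the sensor origin toward each surviving point, labelling traversed voxels as empty space while excluding the terminal voxel. A production system would more likely use an exact voxel traversal (e.g., Amanatides-Woo) on a GPU; the step size, sparse dictionary representation, and label constants are assumptions.

```python
import numpy as np

EMPTY, OCCUPIED = 1, 2   # illustrative label values (0 is left to mean "unknown")

def ray_march(origin, target, labels, voxel_size=0.05, step=None):
    """March from the sensor origin toward `target` in fixed increments,
    labelling every traversed voxel as EMPTY while excluding the voxel that
    contains `target` (which will later be marked occupied). `labels` is a
    sparse dict mapping (i, j, k) voxel indices to a label."""
    origin = np.asarray(origin, dtype=float)
    target = np.asarray(target, dtype=float)
    direction = target - origin
    length = np.linalg.norm(direction)
    if length == 0.0:
        return
    unit = direction / length
    step = step or 0.5 * voxel_size                     # sub-voxel step to limit skipped cells
    end_voxel = tuple(np.floor(target / voxel_size).astype(int))
    for t in np.arange(0.0, length, step):
        voxel = tuple(np.floor((origin + t * unit) / voxel_size).astype(int))
        if voxel != end_voxel:
            labels[voxel] = EMPTY                       # also clears previously occupied voxels
```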
[00149] Block 910 includes the controller 118 marking occupied voxels with a count. The count may correspond to a number of points 204 localized within each voxel based on the original point cloud received in block 902 and discretized in block 904 (i.e., not based on the remaining points 204 after adjacency filtering). Any voxels identified as “occupied” voxels 408 during, e.g., a previous execution of method 900 (i.e., during acquisition of a previous point cloud) may be marked as “empty space” if a ray 402 passes through the “occupied” voxels 408. In doing so, the controller 118 clears space (i.e., voxels) previously occupied by objects (e.g., moving objects) which are no longer occupied by the objects based on data from the point cloud received in block 902. Similarly, the controller 118 will not clear any “occupied” voxels 408 unless and until a ray 402 passes therethrough. This is analogous to the controller 118 assuming an object seen in a previous point cloud is still present unless and until there is data, e.g., from the current point cloud, indicating the object is no longer present.
[00150] The marking (i.e., assigning a count to each voxel) may be performed subsequent to the identification of “empty space” voxels in block 908 such that, when implementing the adjacency filtering using vectorized computational methods or ray marching in parallel (discussed below in regard to FIGs. 10-11), no voxels which comprise a point 204 therein are misidentified as “empty space.”

[00151] By way of an illustrative example, FIG. 10A-B illustrates the ray marching procedure performed on two points 204-1 and 204-2, according to an exemplary embodiment. The two points 204 may be localized by two beams 208 which pass through adjacent/neighboring pixels 304 of the image plane 302 of a depth camera 202. As mentioned previously above, the denoting of voxels 404 as “occupied” is performed subsequent to the denoting of voxels 404 as “empty space” voxels 406 when performing the ray marching process in parallel, for reasons which will be discussed now, although one skilled in the art may appreciate that performing the ray marching process in parallel is not intended to be limiting. First, in FIG. 10A, a controller 118 may extend a ray 402-1 to the point 204-1, wherein voxels 404 through which ray 402-1 passes may be denoted as “empty space” voxels 406. The voxel 1002 in which the point 204-1 lies would not be denoted as empty space based on ray 402-1. Contemporaneously, the controller 118 may extend a second ray 402-2 to the second point 204-2 and denote voxels 404 through which the ray 402-2 passes as “empty space,” excluding the voxel 404 in which the second point 204-2 lies. In some embodiments, if a ray 402 passes through a voxel 404, the voxel may be denoted as “empty space” despite a point 204 being localized therein. As shown in FIG. 10A, voxel 1002 is assigned as an “empty” voxel by ray 402-2 because ray 402-2 does not encounter point 204-1.
[00152] Subsequently, after the identification of all empty space voxels 406, the controller 118 may mark both the voxels 404 in which the two points 204-1 and 204-2 lie as “occupied” voxels 408, as shown in FIG. 10B. As shown, voxel 1002 containing point 204-1 is reassigned as “occupied” voxel 408-1 and the voxel containing point 204-2 is assigned as “occupied” voxel 408-2. The controller 118 may further assign a count to each occupied voxel 408 corresponding to a number of points 204 of the original point cloud which fall within each occupied voxel 408, wherein the count is based on the original point cloud, while FIG. 10A-B depicts only the points 204 which remain after the adjacency filtering (e.g., as described in FIG. 5A-C).
[00153] Had the controller 118 instead marked voxel 1002 in FIG. 10A as “occupied” voxel 408-1 and subsequently extended ray 402-2 to identify “empty space” voxels 406, the voxel 408-1 may have been changed from an “occupied” voxel 408 to an “empty space” voxel 406. It is appreciated that additional logic may be implemented in the computer-readable instructions which may ignore “occupied” voxels 408 during the identification of “free space” voxels 406; however, this may slow the ray marching procedure further, wherein the ray marching procedure already occupies a substantial amount of time during the execution of method 900. Further, identifying all empty space voxels 406 first and only subsequently (i.e., after all rays 402 to all points 204 have been configured) identifying occupied voxels 408 enables the controller 118 to extend the rays 402-1 and 402-2 contemporaneously (e.g., using parallel processors 138 or a SIMD processor 1100 shown below in FIG. 11) without the possibility of an occupied voxel 408 being cleared (i.e., denoted as an empty space voxel 406) by an adjacent or neighboring ray 402. That is, “empty space” voxels 406 may be replaced after the ray marching procedure with “occupied” voxels 408 if there are one or more points 204 localized therein (based on the original point cloud); however, “occupied” voxels 408 are never replaced with “empty space” voxels 406 until a subsequent point cloud scan is received. In short, it may be preferable to first designate all voxels traversed by the rays as “empty” and, once the controller 118 maps the point cloud into the voxels, reassign the voxels having a point 204 mapped therein as “occupied.”
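The two-phase ordering described above might be sketched as follows (illustrative only; the helper names and the dictionary-based voxel store are assumptions): all rays first mark empty space, and only afterwards are occupied voxels reassigned from the original per-voxel counts.

```python
OCCUPIED = 2   # same illustrative label value as the earlier sketches

def update_voxel_labels(labels, counts, rays, ray_march_fn):
    """Two-phase update of the voxel labels. Phase 1 marks empty space along
    every ray (and may run in parallel, one ray per lane/thread); phase 2 runs
    only afterwards and reassigns occupied voxels from the original per-voxel
    counts, so a neighboring ray can never clear a voxel that holds points."""
    for origin, target in rays:                 # phase 1: empty-space identification
        ray_march_fn(origin, target, labels)
    for voxel, count in counts.items():         # phase 2: occupied reassignment
        if count > 0:
            labels[voxel] = OCCUPIED

# Example wiring with the earlier sketches (illustrative):
#   idx, counts = voxelize(points)
#   labels = {}
#   rays = [(sensor_origin, tuple(p)) for p in points[adjacency_filter(idx)]]
#   update_voxel_labels(labels, counts, rays,
#                       lambda o, t, l: ray_march(o, t, l, voxel_size=VOXEL_SIZE))
```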
[00154] Returning to FIG. 9, block 912 includes the controller 118 projecting the 3D volume of labeled voxels onto a 2D grid to produce a 2D occupancy map. That is, for each column of voxels, the count of all the voxels is summed, wherein the summation of the count corresponds to the count of the pixel of the 2D occupancy map below the column. If the summation yields a value greater than a threshold (e.g., 1 or more), the pixel at the base of the column may include a denotation of “occupied.” If the summation is below a threshold value, then the pixel at the base of the column may include a denotation of “unoccupied,” “free space,” or equivalent. If no voxels above are denoted as “free space” or “occupied” for a column of voxels, the pixel at the base of the column may be denoted as “unknown,” or equivalent. Voxels denoted as “unknown” may comprise a count of zero.
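A sketch of the projection in block 912, assuming the labels and counts are stored in dense (X, Y, Z) arrays with the vertical axis last; the numeric label encoding and the default threshold are assumptions.

```python
import numpy as np

UNKNOWN, FREE, OCCUPIED = 0, 1, 2   # illustrative label values

def project_to_occupancy(voxel_counts: np.ndarray, voxel_labels: np.ndarray,
                         occupied_threshold: int = 1) -> np.ndarray:
    """Collapse an (X, Y, Z) labelled volume into a 2D occupancy map.
    voxel_counts holds the per-voxel point counts; voxel_labels holds
    UNKNOWN/FREE/OCCUPIED labels with the vertical (z) axis last."""
    column_counts = voxel_counts.sum(axis=2)               # sum the counts of each vertical column
    observed = np.any(voxel_labels != UNKNOWN, axis=2)     # any labelled voxel above this pixel?
    occupancy = np.full(column_counts.shape, UNKNOWN, dtype=np.uint8)
    occupancy[observed] = FREE
    occupancy[column_counts >= occupied_threshold] = OCCUPIED
    return occupancy
```

Because any column whose summed count meets the threshold necessarily contains at least one labelled voxel, assigning “occupied” after “free space” simply overwrites the weaker label.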
[00155] According to at least one non-limiting exemplary embodiment, the projection may be performed by the controller 118 removing the vertical component of the points 204 of the point cloud received in block 902. Subsequently, the controller 118 may count a number of points 204 within each pixel and assign the corresponding labels of “occupied,” “free space,” or “unknown” based on the count of each pixel.
[00156] Block 914 includes the controller 118 removing spatially isolated pixels of the occupancy map. A spatially isolated pixel corresponds to an occupied pixel 704 (i.e., a pixel comprising a count greater than a first threshold required to be denoted as “occupied”) which is surrounded by pixels which cumulatively comprise a count below a second threshold. The surrounding pixels 704 may include the four neighboring pixels 704 or the eight neighboring and diagonally neighboring pixels 704. In some instances, the second threshold may correspond to eight (8) times the first threshold. In some instances, the first and second thresholds may be one (1) or more.
[00157] FIG. 11 illustrates a single instruction multiple data (“SIMD”) architecture of a processor 1100, according to an exemplary embodiment. SIMD is one of the computer architecture classes denoted by Flynn’s Taxonomy. SIMD devices differ from sequential CPU processors in that data stored in the data pool 1102 (e.g., a computer-readable storage medium) may be processed in small chunks by a plurality of processing units 1104 in parallel, wherein each processing unit 1104 may receive instructions from an instruction pool 1106 (e.g., a computer-readable storage medium). The instructions received may be the same for each processing unit 1104. Other computer architectures denoted by Flynn’s Taxonomy include SISD (single instruction single data stream), MISD (multiple instruction single data stream), and MIMD (multiple instructions multiple data streams). Contemporary graphics processing units (“GPUs”) and sequential CPU processors, such as those produced by Intel, Nvidia, and Advanced Micro Devices (“AMD”), may include some combination of the four architectures. SIMD and MISD architectures, however, are more closely aligned with graphics processing units (“GPU”) than sequential CPUs because architectures of GPUs are typically configured to maximize throughput (i.e., amount of data processed per second), whereas sequential CPUs are configured to minimize latency (i.e., time between execution of two sequential instructions). That is, SIMD processor 1100 is not intended to be limiting for any of the processors discussed above (e.g., processor 138 or controller 118); rather, FIG. 11 is intended to illustrate how the adjacency filtering may be configured for specific hardware which accelerates the adjacency filtering, such as GPUs, which are commonly used within the art to perform the ray marching procedure (i.e., ray tracing).
[00158] A SIMD processor is similar to SIMT (single instruction multiple thread) processors. Contemporary GPUs use a combination of SIMD and SIMT architectures, wherein processor 1100 may illustrate one of many stream multiprocessors (“SM”) of a GPU. For example, the NVIDIA GeForce RTX 2080 comprises 46 SMs, where each SM may comprise a SIMD architecture similar to processor 1100. Each SM may be considered as a block of separate threads, each block of threads being configured to execute different instructions from the instruction pool 1106 compared to the other SMs. Threads of the block of threads correspond to the processors 1104 which execute the same computer-readable instructions in parallel.
[00159] Processing units 1104 may be configured to execute the computer-readable instructions received from the instruction pool 1106. Typically, the instructions executed are of low complexity and size as compared to those of sequential CPUs (e.g., CISC architecture CPUs), which may execute a substantial number of sequential instructions. Processing units 1104 may comprise a similar architecture as illustrated by processor 138, wherein instructions are received from an instruction pool 1106, which is shared with other processing units 1104, rather than from a memory 130. FIG. 11 illustrates three processing units 1104; however, one skilled in the art may appreciate that a SIMD processor 1100 may include a plurality of additional processing units 1104 (e.g., hundreds or more).

[00160] Use of a SIMD architecture may accelerate both the ray marching procedure and adjacency filtering described above. For example, with respect to adjacency filtering, the instruction pool 1106 may provide instructions to the plurality of processing units 1104 which configure the processing units 1104 to perform operations 506 on data received from the data pool 1102. The data received by each processing unit 1104 may include entries 514 within one partition of an array 512. In the illustrated embodiment, the array 512 has been partitioned into three partitions of eight (8) entries 514, as illustrated within the data pool 1102, wherein the partitions may be stored in separate addresses. Alternatively, each SM of a GPU may process a partition of the array and each processing unit 1104 may perform a single operation/comparison 506. In some embodiments, each entry 514 is stored within a separate address in the data pool 1102, wherein a partition may be retrieved by the data pool 1102 outputting data to a respective processing unit 1104 corresponding to a range of addresses (i.e., a vector). Each partition and its corresponding entries may be retrieved from the data pool 1102 simultaneously using a vector address (i.e., a range of addresses). That is, a processing unit 1104 receives entries 514 for a partition and performs operations 506 on the entries 514 within the partition to produce entries 520 of the output array 518 at the same time as the other processing units 1104 receive their respective entries 514 for their respective partitions to perform the same operations. In doing so, the SIMD processor 1100 may perform the adjacency filtering by executing operations 506 in parallel for each partition, thereby reducing the execution time of the adjacency filtering for the entire array 512 down to the time required for a single processing unit 1104 to process a single partition.
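As a rough illustration of the data-parallel formulation (not the disclosed GPU implementation), NumPy's element-wise operations can stand in for the SIMD lanes: every partition's comparisons 506 are expressed as a single vectorized operation. The scalar voxel identifiers and the partition size of eight follow the illustrated example but are otherwise assumptions.

```python
import numpy as np

def partitioned_adjacency_mask(voxel_ids: np.ndarray, partition: int = 8) -> np.ndarray:
    """Vectorized form of the per-partition comparisons 506: each element is
    compared with its left-hand neighbor for every partition at once, the way
    one SIMD instruction is applied across all processing units/lanes.
    `voxel_ids` are scalar identifiers (e.g., hashed voxel indices), one per
    measurement, ordered as the concatenated image-plane rows."""
    n = len(voxel_ids)
    assert n % partition == 0, "image size assumed to be an integer multiple of the partition size"
    ids = voxel_ids.reshape(-1, partition)          # one row per partition
    keep = np.ones_like(ids, dtype=bool)            # first entry of every partition is always kept
    keep[:, 1:] = ids[:, 1:] != ids[:, :-1]         # all comparisons execute as one vector operation
    return keep.reshape(n)

# Two partitions of eight entries, mirroring the partitioning of array 512:
ids = np.array([5, 5, 5, 7, 7, 2, 2, 2, 9, 9, 4, 4, 4, 4, 1, 1])
print(partitioned_adjacency_mask(ids).astype(int))  # -> [1 0 0 1 0 1 0 0 1 0 1 0 0 0 1 0]
```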
[00161] As another example, each processing unit 1104 may receive instructions from the instruction pool 1106 which configure the processing units 1104 to perform the ray marching procedure for a singular ray 402. Each processing unit 1104 may receive an entry 520 of the output array 518 and a location of an origin 210 of a sensor used to produce a point cloud based on which the ray marching is performed. Accordingly, each processing unit 1104 may extend a ray 402 to a point 204 corresponding to an entry 520 of the output array 518 to identify “empty space” voxels 406. In doing so, the SIMD processor 1100 extends a number of rays 402 equal to the number of processing units 1104 in parallel (i.e., contemporaneously), which drastically reduces the execution time of the ray marching procedure.
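The one-ray-per-processing-unit dispatch might be sketched as follows, with a Python thread pool standing in for the processing units 1104 purely to show the mapping of rays to workers; on a robot 102 this would be a GPU kernel, and the helper names, step count, and voxel size are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def traverse_ray(task):
    """Worker: return the set of voxels one ray passes through (empty space),
    excluding the voxel containing the endpoint."""
    origin, tgt, voxel_size, steps = task
    visited = set()
    for i in range(steps):
        t = i / steps
        point = tuple(o + t * (p - o) for o, p in zip(origin, tgt))
        visited.add(tuple(int(c // voxel_size) for c in point))
    visited.discard(tuple(int(c // voxel_size) for c in tgt))
    return visited

def parallel_ray_march(origin, targets, voxel_size=0.05, steps=64):
    """Dispatch one ray per worker, mirroring the one-ray-per-processing-unit
    mapping; the pool here only illustrates the dispatch pattern -- the real
    speed-up described above comes from running this as a GPU/SIMD kernel."""
    tasks = [(origin, tgt, voxel_size, steps) for tgt in targets]
    empty_voxels = set()
    with ThreadPoolExecutor() as pool:
        for voxels in pool.map(traverse_ray, tasks):
            empty_voxels |= voxels
    return empty_voxels
```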
[00162] According to at least one non-limiting exemplary embodiment, a controller 118 of a robot 102 may include a GPU and/or SIMD processor 1100 to perform the ray marching procedure. Ray marching, or ray tracing, is typically optimized for GPUs and/or SIMD devices executing a small set of instructions on multiple data sets (e.g., producing rays 402 from an origin point 210 and a point 204). Accordingly, robots 102 that perform the ray marching procedure may typically include one or more GPUs as components of controller 118 to accelerate this process. Advantageously, the adjacency filter may be executed quickly by a SIMD device, such as a GPU of controller 118, which further reduces the execution time of method 900 and is applicable to many robots 102 which may already comprise a GPU or SIMD processor. There is no limitation that a GPU or SIMD device must be utilized to perform adjacency filtering and ray marching. However, the use of partitioning during adjacency filtering enables the adjacency filtering to be optimized for use on GPUs and SIMD processors 1100.
[00163] It will be recognized that while certain aspects of the disclosure are described in terms of a specific sequence of steps of a method, these descriptions are only illustrative of the broader methods of the disclosure, and may be modified as required by the particular application. Certain steps may be rendered unnecessary or optional under certain circumstances. Additionally, certain steps or functionality may be added to the disclosed embodiments, or the order of performance of two or more steps permuted. All such variations are considered to be encompassed within the disclosure disclosed and claimed herein.
[00164] While the above detailed description has shown, described, and pointed out novel features of the disclosure as applied to various exemplary embodiments, it will be understood that various omissions, substitutions, and changes in the form and details of the device or process illustrated may be made by those skilled in the art without departing from the disclosure. The foregoing description is of the best mode presently contemplated of carrying out the disclosure. This description is in no way meant to be limiting, but rather should be taken as illustrative of the general principles of the disclosure. The scope of the disclosure should be determined with reference to the claims.
[00165] While the disclosure has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. The disclosure is not limited to the disclosed embodiments. Variations to the disclosed embodiments and/or implementations may be understood and effected by those skilled in the art in practicing the claimed disclosure, from a study of the drawings, the disclosure and the appended claims.
[00166] It should be noted that the use of particular terminology when describing certain features or aspects of the disclosure should not be taken to imply that the terminology is being re-defined herein to be restricted to include any specific characteristics of the features or aspects of the disclosure with which that terminology is associated. Terms and phrases used in this application, and variations thereof, especially in the appended claims, unless otherwise expressly stated, should be construed as open-ended as opposed to limiting. As examples of the foregoing, the term “including” should be read to mean “including, without limitation,” “including but not limited to,” or the like; the term “comprising” as used herein is synonymous with “including,” “containing,” or “characterized by,” and is inclusive or open-ended and does not exclude additional, unrecited elements or method steps; the term “having” should be interpreted as “having at least”; the term “such as” should be interpreted as “such as, without limitation”; the term “includes” should be interpreted as “includes but is not limited to”; the term “example” or the abbreviation “e.g.” is used to provide exemplary instances of the item in discussion, not an exhaustive or limiting list thereof, and should be interpreted as “example, but without limitation”; the term “illustration” is used to provide illustrative instances of the item in discussion, not an exhaustive or limiting list thereof, and should be interpreted as “illustration, but without limitation.” Adjectives such as “known,” “normal,” “standard,” and terms of similar meaning should not be construed as limiting the item described to a given time period or to an item available as of a given time, but instead should be read to encompass known, normal, or standard technologies that may be available or known now or at any time in the future; and use of terms like “preferably,” “preferred,” “desired,” or “desirable,” and words of similar meaning should not be understood as implying that certain features are critical, essential, or even important to the structure or function of the present disclosure, but instead as merely intended to highlight alternative or additional features that may or may not be utilized in a particular embodiment. Likewise, a group of items linked with the conjunction “and” should not be read as requiring that each and every one of those items be present in the grouping, but rather should be read as “and/or” unless expressly stated otherwise. Similarly, a group of items linked with the conjunction “or” should not be read as requiring mutual exclusivity among that group, but rather should be read as “and/or” unless expressly stated otherwise. The terms “about” or “approximate” and the like are synonymous and are used to indicate that the value modified by the term has an understood range associated with it, where the range may be ±20%, ±15%, ±10%, ±5%, or ±1%. The term “substantially” is used to indicate that a result (e.g., measurement value) is close to a targeted value, where close may mean, for example, the result is within 80% of the value, within 90% of the value, within 95% of the value, or within 99% of the value. Also, as used herein “defined” or “determined” may include “predefined” or “predetermined” and/or otherwise determined values, conditions, thresholds, measurements, and the like.

Claims

What is claimed is:
1. A method for producing occupancy maps for a robot, comprising: receiving, from at least one sensor of the robot, a point cloud, the point cloud produced based on measurements from the at least one sensor of the robot; discretizing, via a controller of the robot, the point cloud into a plurality of voxels, each voxel of the plurality of voxels comprising a count; associating, via the controller, one or more points of the point cloud, which fall within a same voxel, to a single point within the same voxel; and ray marching, via the controller, from an origin of the at least one sensor to the single point within the same voxel to detect empty space surrounding the robot.
2. The method of Claim 1, further comprising: producing, via the controller, a computer readable map of an environment of the robot, the computer readable map comprising a plurality of pixels, each pixel being identified either as occupied or empty space based on the counts of the voxels above each of the plurality of pixels.
3. The method of Claim 2, further comprising: setting, via the controller, a count equal to zero for occupied pixels comprising neighboring pixels which include a cumulative total count less than a threshold value.
4. The method of Claim 1, wherein the one or more points in the same voxel are determined based on adjacency within an image plane of the at least one sensor.
5. The method of Claim 4, wherein a magnitude of the distance measurements for the one or more points is less than a distance threshold.
6. The method of Claim 1, wherein the at least one sensor includes one of a scanning planar LiDAR, two-dimensional LiDAR, or a depth camera.
7. A robotic system for producing occupancy maps, comprising: a non-transitory computer readable storage medium comprising a plurality of computer readable instructions embodied thereon; and at least one processor configured to execute the computer readable instructions to, receive a point cloud, the point cloud produced based on measurements from at least one sensor of the robotic system; discretize the point cloud into a plurality of voxels, each voxel of the plurality of voxels comprising a count; associate one or more points of the point cloud, which fall within a same voxel, to a single point within the same voxel; and ray march by extending rays from an origin of the at least one sensor to the single point within the same voxel to detect empty space surrounding the robotic system.
8. The robotic system of Claim 7, wherein the at least one processor is further configured to execute the instructions to: produce a computer readable map of an environment of the robot, the computer readable map comprising a plurality of pixels, each pixel being identified either as occupied or empty space based on the counts of the voxels above each of the plurality of pixels.
9. The robotic system of Claim 8, wherein the at least one processor is further configured to execute the instructions to, set a count equal to zero for occupied pixels comprising neighboring pixels which include a cumulative total count less than a threshold value.
10. The robotic system of Claim 7, wherein the one or more points in the same voxel are determined based on adjacency within an image plane of the at least one sensor.
11. The robotic system of Claim 10, wherein a magnitude of the distance measurements for the one or more points is less than a distance threshold.
12. The robotic system of Claim 7, wherein the at least one sensor includes one of a scanning planar LiDAR, two-dimensional LiDAR, or depth camera.
13. A non-transitory computer readable storage medium comprising a plurality of computer readable instructions embodied thereon that, when executed by at least one processor, configure the at least one processor to, receive a point cloud, the point cloud produced based on measurements from at least one sensor of a robotic system; discretize the point cloud into a plurality of voxels, each voxel of the plurality of voxels comprising a count; associate one or more points of the point cloud, which fall within a same voxel, to a single point within the same voxel; and ray march by extending rays from an origin of the at least one sensor to the single point within the same voxel to detect empty space surrounding the robotic system.
14. The non-transitory computer readable storage medium of Claim 13, wherein the instructions further configure the at least one processor to: produce a computer readable map of an environment of the robot, the computer readable map comprising a plurality of pixels, each pixel being identified either as occupied or empty space based on the counts of the voxels above each of the plurality of pixels.
15. The non-transitory computer readable storage medium of Claim 14, wherein the instructions further configure the at least one processor to: set a count equal to zero for occupied pixels comprising neighboring pixels which include a cumulative total count less than a threshold value.
16. The non-transitory computer readable storage medium of Claim 13, wherein the one or more points of the same voxel are determined based on adjacency within an image plane of the at least one sensor.
17. The non-transitory computer readable storage medium of Claim 16, wherein a magnitude of the distance measurements for the one or more points is less than a distance threshold.
18. The non-transitory computer readable storage medium of Claim 13, wherein the at least one sensor includes one of a scanning planar LiDAR, two-dimensional LiDAR, or depth camera.
PCT/US2021/055677 2020-10-20 2021-10-19 Systems and methods for producing occupancy maps for robotic devices WO2022087014A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063093997P 2020-10-20 2020-10-20
US63/093,997 2020-10-20

Publications (1)

Publication Number Publication Date
WO2022087014A1 true WO2022087014A1 (en) 2022-04-28

Family

ID=81290042

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/055677 WO2022087014A1 (en) 2020-10-20 2021-10-19 Systems and methods for producing occupancy maps for robotic devices

Country Status (1)

Country Link
WO (1) WO2022087014A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120089295A1 (en) * 2010-10-07 2012-04-12 Samsung Electronics Co., Ltd. Moving robot and method to build map for the same
US20190197774A1 (en) * 2017-12-22 2019-06-27 Magic Leap, Inc. Multi-stage block mesh simplification

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21883735

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21883735

Country of ref document: EP

Kind code of ref document: A1